Editor's note: This is the second in a two-part series on the advantages that cloud computing brings to the machine-to-machine space. It was first published as a white paper by Ken Fromm. Fromm is VP of Business Development at Appoxy, a Web app development company building high scale applications on Amazon Web Services. He can be found on Twitter at @frommww.

Once data is in the cloud, it can be syndicated - made accessible to other processes - in very simple and transparent ways. The use of REST APIs and JSON or XML data structures, combined with dynamic language data support, allows data to be accessed, processed and recombined in flexible and decentralized ways. A management console, for example, can set up specific processes to watch ranges of sensors and perform operations on specific data sets. These processes can be launched and run on any server at any time, controlled via a set schedule or initiated in response to other signals.
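As a rough illustration of a process that watches a range of sensors, the Python sketch below parses a JSON payload and picks out readings that exceed a limit. The payload shape, the field names (sensor_id, metric, value) and the watch_range helper are assumptions made for this example, not part of any particular M2M API.

```python
import json

# Hypothetical JSON payload, shaped the way a REST endpoint might return it.
PAYLOAD = """
[
  {"sensor_id": 101, "metric": "temperature_c", "value": 21.5},
  {"sensor_id": 102, "metric": "temperature_c", "value": 48.0},
  {"sensor_id": 205, "metric": "temperature_c", "value": 19.0},
  {"sensor_id": 103, "metric": "temperature_c", "value": 22.1}
]
"""


def watch_range(readings, first_id, last_id, limit):
    """Return readings from sensors in [first_id, last_id] whose value exceeds limit."""
    return [
        r for r in readings
        if first_id <= r["sensor_id"] <= last_id and r["value"] > limit
    ]


readings = json.loads(PAYLOAD)
# Watch sensors 100-199 and report anything hotter than 40 degrees C.
alerts = watch_range(readings, 100, 199, limit=40.0)
for alert in alerts:
    print(f"sensor {alert['sensor_id']}: {alert['value']} exceeds limit")
```

Because the data arrives as plain JSON, the same function could run on any server, on a schedule or in response to another signal.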

Data can also propagate easily throughout a system so that it can be used by multiple processes and parties. Much like Twitter streams, which make use of a simple subscription/asymmetric follow approach, M2M data streams can be exposed to credentialed entities through a similar loosely coupled and asymmetric subscription process.

The model is essentially a simple data bus containing flexible data formats, which other processes can access without the need for a formal agreement. Data sets, alarms and triggers can move throughout the system much as status updates and other federated signals flow within the consumer Web 2.0.
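The subscription model can be sketched in a few lines of Python. This is an in-memory toy under stated assumptions: the DataBus class, the token check and the stream name "pump-7/pressure" are all hypothetical, and a real system would deliver messages over the network instead of calling functions directly.

```python
from collections import defaultdict


class DataBus:
    """Toy data bus: publishers do not know or care who is following a stream."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # stream name -> callbacks
        self._credentials = set()              # tokens allowed to subscribe

    def grant(self, token):
        """Register a credential that may subscribe to streams."""
        self._credentials.add(token)

    def subscribe(self, stream, token, callback):
        """Follow a stream; only credentialed entities are accepted."""
        if token not in self._credentials:
            raise PermissionError("unknown credential")
        self._subscribers[stream].append(callback)

    def publish(self, stream, message):
        """Deliver a message to every follower and return how many received it."""
        followers = self._subscribers.get(stream, [])
        for callback in followers:
            callback(message)
        return len(followers)


bus = DataBus()
bus.grant("dashboard-key")
received = []
bus.subscribe("pump-7/pressure", "dashboard-key", received.append)
delivered = bus.publish("pump-7/pressure", {"psi": 61.2})
```

The follow is asymmetric: the sensor publishing to a stream needs no agreement with, or knowledge of, the processes that consume it.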

Low-Cost Sensors and Reliable Data Transmission

The cloud benefits outlined here are predicated on the availability of low-cost sensors (low-power ones, for a number of uses) as well as inexpensive, high-availability data transmission.

The first assumption is a pretty solid bet now and will only become more true in the next year or two. And with mobile companies looking at M2M as an important sector and wireless IP access extending in both coverage and simplicity (along with Bluetooth-to-mobile links in some instances), being able to send data to the Web reliably and pervasively is also becoming more of a certainty.

Flexible and Agile App Development

Another core benefit of the cloud is development speed and agility. Because screen interfaces can be separated from data collection, inexpensive and remote devices can be used as the interfaces. Separating views from data and from application logic means that widely available languages and tools can be used, making it easier and faster to add new features and rapidly improve interfaces. Device monitoring and control can take on all the features, functions and capabilities that Web apps and browsers provide without requiring developers to learn special languages or use obscure device SDKs.

These applications can be developed quickly by leveraging popular languages and frameworks such as Ruby on Rails, Python and Java. Ruby on Rails (Ruby is the language; Rails is an application framework) offers many advantages when it comes to developing Web applications:

  • simple dynamic object-oriented language
  • built-in Web application framework
  • transparent model-view-controller architecture
  • simple connections between applications and databases
  • large third-party code libraries
  • vibrant developer community

Here's Marc Benioff, CEO of Salesforce, explaining his purchase of Heroku, a Ruby on Rails cloud development platform, as quoted in ComputerWeekly in December 2010:

"Ruby is the language of Cloud 2 [applications for real-time mobile and social platforms]. Developers love Ruby. It's a huge advancement. It offers rapid development, productive programming, mobile and social apps and massive scale. We could move the whole industry to Ruby on Rails."

Developing applications in the cloud provides added speed and agility because of reduced development cycles. Projects can be developed quickly with small teams. Cloud-based services and code libraries can be used so that teams develop only what is core to their application.

Adding charts and graphs to an application is a matter of including a code library or signing up for a third-party service and connecting to it via REST APIs. Adding geolocation capabilities involves a similar process. In this way, new capabilities can be added quickly and current capabilities extended without having to develop entire stacks of functionality not core to a company's competencies.

Here's VMware's CEO Paul Maritz on the advantages of programming frameworks:

"Developers are moving to Django and Rails. Developers like to focus on what's important to them. Open frameworks are the foundation for new enterprise application development going forward. By and large developers no longer write Windows or Linux apps. Rails developers don't care about the OS - they're more interested in data models and how to construct the UI."

Monitoring and control dashboards and data visualization are critical to creating effective M2M applications. Having dynamic languages and frameworks that facilitate rapid development and rapid iteration means companies can move quickly to roll out new capabilities and respond rapidly to customer needs.

Circuit photo by pawel_231; cloud photo by Rybson.

Big Data Processing

Having data stored and available in the cloud makes it far easier to analyze. Distributing data on a sensor-by-sensor basis means simpler access for individual nodes or collections of nodes. Data can be distributed quickly to different entities much like the way Facebook photos or Twitter posts are distributed to each friend or follower. Flatter data formats mean applications can be simpler and take less time to develop. But big challenges arise when trying to analyze massively growing data streams.

Given the large amount of incoming data and the wide range of queries on the data, significant effort needs to go into parsing and processing it so that queries are as responsive as possible. Managing the data therefore involves optimizing near-term storage for immediate queries, setting up frameworks for analytics across massive data sets and deciding what data to archive and in what raw, processed and derivative forms.

With Plaster Networks, for example, Appoxy keeps near-term data available for queries on the current performance of adapters. Appoxy also structured the data model and data flow to respond quickly to queries based on common time frames (performance for the last day, week or month, for example) and node groupings (adapter to adapter and adapter to device). Detecting and flagging unusual performance within a network is also an inherent M2M architecture pattern.
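To make the two patterns concrete, here is a minimal Python sketch of a time-frame query and a simple outlier check. It does not reflect how Appoxy or Plaster Networks implemented these features; the sample throughput figures, the function names and the two-standard-deviation threshold are all assumptions chosen for illustration.

```python
from datetime import datetime, timedelta
from statistics import mean, pstdev


def window_average(samples, now, window):
    """Average the values whose timestamp falls within `window` before `now`."""
    recent = [value for ts, value in samples if now - window <= ts <= now]
    return mean(recent) if recent else None


def flag_unusual(samples, threshold=2.0):
    """Return samples more than `threshold` standard deviations from the mean."""
    values = [value for _, value in samples]
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # all readings identical, nothing stands out
    return [(ts, v) for ts, v in samples if abs(v - mu) / sigma > threshold]


# Hypothetical adapter throughput in Mbps, as (hours ago, value) pairs.
now = datetime(2011, 1, 10, 12, 0)
history = [(0, 80.0), (6, 82.0), (12, 79.0), (18, 81.0),
           (30, 80.0), (48, 20.0), (72, 78.0), (96, 83.0)]
samples = [(now - timedelta(hours=h), v) for h, v in history]

last_day = window_average(samples, now, timedelta(days=1))
last_week = window_average(samples, now, timedelta(weeks=1))
unusual = flag_unusual(samples)
```

In this sample data the reading of 20.0 Mbps from two days earlier is the only one flagged. A production system would likely precompute the common time frames so that dashboards do not rescan raw samples on every query.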

In addition to near-term data handling challenges come issues with analyzing large data sets. An example is running queries extending across an entire set of nodes in a system to obtain insight on aggregate behavior and device performance. There could be thousands of sensors, each with hundreds of thousands of data points. This is not unlike data processing queries within social network sites or other large consumer websites with millions or hundreds of millions of users.

Just as NoSQL storage options are largely derived from consumer Web needs, big-data architectures and analytical methods are also emerging from these same efforts. Hadoop, for example, is a set of frameworks used for distributed data analysis. One of its projects, MapReduce, is based on an approach pioneered by Google to process data in parallel across many nodes and then propagate the findings up from the nodes. As the Hadoop documentation describes it:

Hadoop MapReduce is a software framework for easily writing applications which process vast amounts of data (multi-terabyte data-sets) in-parallel on large clusters (thousands of nodes) of commodity hardware in a reliable, fault-tolerant manner.

A MapReduce job usually splits the input data-set into independent chunks which are processed by the map tasks in a completely parallel manner. The framework sorts the outputs of the maps, which are then input to the reduce tasks. Typically both the input and the output of the job are stored in a file-system. The framework takes care of scheduling tasks, monitoring them and re-executes the failed tasks.
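The flow described above (independent chunks, a map step, a sort/group step, then a reduce step) can be mimicked in a single Python process. The sketch below is not the Hadoop API and does nothing distributed; the function names and the sample readings are assumptions, and the job it runs (maximum reading per sensor) is just one example of a reduction.

```python
from collections import defaultdict
from itertools import chain


def map_reading(record):
    """Map step: emit a (sensor_id, value) pair for one raw record."""
    sensor_id, value = record
    yield sensor_id, value


def reduce_readings(sensor_id, values):
    """Reduce step: collapse all values for one sensor into their maximum."""
    return sensor_id, max(values)


def run_job(chunks):
    # Map: each chunk is independent, so a real framework could process
    # the chunks in parallel on different machines.
    mapped = chain.from_iterable(
        map_reading(record) for chunk in chunks for record in chunk
    )
    # Shuffle/sort: group the intermediate values by key.
    grouped = defaultdict(list)
    for key, value in mapped:
        grouped[key].append(value)
    # Reduce: one call per key, visited in sorted key order.
    return [reduce_readings(key, grouped[key]) for key in sorted(grouped)]


# Three hypothetical input chunks of (sensor_id, value) records.
chunks = [[("a", 3), ("b", 7)], [("a", 9), ("c", 1)], [("b", 2)]]
result = run_job(chunks)
```

What a framework such as Hadoop adds on top of this shape is the scheduling, monitoring and re-execution of failed tasks across a cluster.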

Using these big data approaches, large sets of M2M data can be analyzed across an elastic and scalable cloud infrastructure. The virtual nature of the cloud along with distributed data models means that jobs can be queued up by the thousands and run in parallel if needed.
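As a small-scale stand-in for queuing many jobs and running them in parallel, the following Python sketch submits one summary job per sensor to a worker pool. The thread pool, the summarize function and the synthetic data are assumptions for illustration; in a cloud deployment the workers would be separate machines fed by a job queue, not threads in one process.

```python
from concurrent.futures import ThreadPoolExecutor


def summarize(sensor_id, values):
    """One independent job: compute the average reading for a sensor."""
    return sensor_id, sum(values) / len(values)


# Synthetic data: 1,000 hypothetical sensors with five readings each.
datasets = {sid: [sid + n for n in range(5)] for sid in range(1000)}

# Queue every job at once; the pool runs up to eight at a time.
with ThreadPoolExecutor(max_workers=8) as pool:
    futures = [pool.submit(summarize, sid, vals) for sid, vals in datasets.items()]
    summaries = dict(future.result() for future in futures)
```

Because each job touches only its own sensor's data, adding workers raises throughput without any change to the job code.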

Once results have been reduced and answers assembled, they'll need effective visualization tools in order to have meaning. This requires charts, graphs, tables and maps and might require a similar level of optimization as the analysis at the start of the process. The good part here is that the elastic nature of cloud processing and the flexibility and agility of cloud development mean that these interfaces can be created and extended, and control options added, in the same manner and with the same speed as other parts of applications.

Summary

M2M applications across many devices and industries will share many of the same patterns for data collection, processing, visualization and control. Just as there are patterns to social applications and Web 2.0 cloud services, there are patterns to M2M applications. Smart devices for building efficiency have needs similar to those of medical devices, mining and agriculture sensors, truck and automobile diagnostics and most other electronic devices and machines.

The particular types of sensors used, the data collected, the way it's transmitted to the cloud, the views and reports generated and the actions triggered may be different, but many core application needs, process flows and data approaches will be the same.

The advantages of the cloud -- in data storage and application development alone -- present a significant inflection point in the cost of operation and the speed of development for M2M applications. Sensor, device and equipment makers are only just beginning to leverage these capabilities and are nowhere near the levels where they could be.

And the opportunities aren't just limited to M2M applications surrounding devices. There are big opportunities to create new M2M platforms and platform services. Just as smart phone service companies and cloud industries have formed, overtaking established mobile and IT leaders, new M2M companies will arise to supply important cloud-based components to this emerging set of smart-enabled devices, vehicles and machines.

The mobile and cloud industries have made people acutely aware of the power that open platforms, in combination with rich development tools and a vibrant support community, have with regard to technology adoption (see Apple iPhone and Google Android vs. RIM, Palm, Symbian and Microsoft). These same insights - and the strategic value of creating and leveraging cloud-based platforms - can be employed to serve makers of devices and machines.

Great products can no longer be great products in isolation. They'll need to be cloud aware in order to be viable in this changing ecosystem, one where data matters and monitoring and control of devices can exist anywhere. Cloud computing and the way it impacts M2M applications will touch and transform every industry in the same manner as have the Internet, email and the Web.