IoT business models hinge on data analysis. From responding to emergencies to discerning patterns in vast troves of historical data, IoT businesses are making possible a variety of ways to derive insights. Despite the critical role that data plays in creating new business value, enterprises often lack robust strategies for managing it.

Problems with Data Growth and IoT

Data growth has been explosive in recent years, with 63% of enterprises managing 50PB or more. Compounding the data management challenge is the frenetic pace of annual data growth, sometimes as high as 40-50%. This growth makes it hard to control costs and to manage the data effectively. In recent surveys, 51% of businesses admit their backup infrastructure (e.g., tape backup) cannot keep up with data growth and find that migrating data is painful. Lacking an effective backup and growth management strategy can be very expensive. Downtime costs range from $50K to $5M per hour and can lead to incalculable reputational damage. More importantly, the opportunity cost of data that is unrecoverable or unavailable for analytics can lead to significant competitive disadvantages.
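To make the growth figures above concrete, here is a minimal Python sketch of how capacity compounds year over year. The 50PB starting point and the 40-50% rates come from the paragraph above; the function name and the five-year horizon are assumptions chosen purely for illustration.

```python
def project_capacity(start_pb: float, annual_growth: float, years: int) -> list[float]:
    """Return projected capacity in PB at the end of each year, compounding annually."""
    capacities = []
    current = start_pb
    for _ in range(years):
        current *= 1.0 + annual_growth  # each year grows on top of the previous total
        capacities.append(current)
    return capacities


if __name__ == "__main__":
    # Illustrative horizon of five years (an assumption, not a figure from the article).
    for rate in (0.40, 0.50):
        projected = project_capacity(50.0, rate, 5)
        print(f"{rate:.0%} annual growth:", [round(pb, 1) for pb in projected])
```

Under these assumptions, an estate that starts at 50PB more than quintuples within five years at 40% growth, which is why backup capacity planned for today's volume tends to fall behind quickly.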

How Data Flows in an IoT Application

Before an effective data management strategy is devised, it is useful to understand how the collected data will be used. In a typical scenario (Figure 1), sensors and control devices at the edge of the network continuously collect data, and gateways transmit the data to on-premises or public clouds. The collected data needs to be quickly assessed and categorized by frequency of access. For example, critical alarm data will be accessed frequently to ensure fixes are in place and site-level actions are performed, while log and routine data collected for regulatory compliance needs to be archived and will be accessed only by exception.

In recent times, with the emergence of machine learning and AI algorithms, archived data, often termed “cold data,” has become invaluable for training algorithms and deriving new insights. Data management stakeholders must balance the current and future utility of the data collected and decide which data will sit in “cold” storage and which will be readily available.
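The access-frequency categorization described above can be sketched as a simple rule. This is only an illustration: the `Record` fields, the "alarm" label, and the thresholds (10 reads in 30 days, 7 days since last access) are hypothetical values that each business would set for itself.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Record:
    kind: str                # hypothetical label, e.g. "alarm" or "log"
    last_accessed: datetime  # when the data was last read
    reads_last_30d: int      # how often it was read in the past 30 days


def choose_tier(record: Record, now: datetime,
                hot_reads: int = 10, recent_days: int = 7) -> str:
    """Return "hot" for readily available storage, "cold" for archive storage."""
    if record.kind == "alarm":
        return "hot"   # critical alarm data stays readily available
    if record.reads_last_30d >= hot_reads:
        return "hot"   # frequently accessed data
    if now - record.last_accessed <= timedelta(days=recent_days):
        return "hot"   # recently touched data
    return "cold"      # everything else is archived, accessed only by exception


if __name__ == "__main__":
    now = datetime(2024, 1, 1)
    compliance_log = Record("log", now - timedelta(days=200), 0)
    print(choose_tier(compliance_log, now))
```

A rule like this only covers current utility. Data earmarked for future model training may sit in the cold tier for long stretches and then need to be read in bulk, so retrieval cost and speed matter as well as storage price.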


Real World Usage of IoT Data

The type of data being collected is also an important factor in determining a storage strategy. Here, the IoT application and industry play a big role. Table 1 captures a few scenarios of data collection in four industry verticals and the kind of insights that can be driven by the data.

Solutions You Should Leverage for Data Storage Needs

Several data storage paradigms need to be accounted for as decisions on data strategy are made: blobs for unstructured and binary data, data lakes for big data analytics, files for sharing, tables for schema-less NoSQL data, and so on.

Data Management Solutions

If the application generates mostly binary and unstructured data, finding the most effective way to save, archive and back up data blobs becomes critical. Similarly, if analytics correlate structured and unstructured data, the right approach to data lakes is needed.

Another key element in determining a data management strategy is the rate of data acquisition and, consequently, the rate of growth of the collected data. Highly frequent small additions (e.g., text data) can be as problematic as sporadic large-volume additions. Further, HD image feeds or graphics data may be particularly expensive to store and constantly archive. Weighing the business need against the frequency of data collection, the amount of historical data needed, and the cost of data storage is essential to arrive at a sustainable data management strategy.
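A rough back-of-the-envelope calculation shows why small, frequent additions can rival sporadic large ones. The device counts, payload sizes, and message rates below are made-up illustrative values, not measurements from any real deployment.

```python
def daily_volume_gb(devices: int, payload_bytes: float, messages_per_day: float) -> float:
    """Estimate raw data ingested per day in gigabytes (1 GB = 1e9 bytes)."""
    return devices * payload_bytes * messages_per_day / 1e9


def retained_volume_tb(daily_gb: float, retention_days: int) -> float:
    """Estimate stored volume in terabytes if every day of data is kept for the retention period."""
    return daily_gb * retention_days / 1000.0


if __name__ == "__main__":
    # Hypothetical fleet A: many sensors sending a small text reading every second.
    sensors = daily_volume_gb(devices=10_000, payload_bytes=200, messages_per_day=86_400)
    # Hypothetical fleet B: a few cameras uploading a large video file four times a day.
    cameras = daily_volume_gb(devices=50, payload_bytes=2e9, messages_per_day=4)
    print(f"sensors: {sensors:.1f} GB/day, cameras: {cameras:.1f} GB/day")
    print(f"sensors kept for one year: {retained_volume_tb(sensors, 365):.1f} TB")
```

With these assumed numbers, the two fleets land in the same order of magnitude per day, and the retention period multiplies either one. Reducing the sampling rate or the history kept is often the cheapest lever.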

Data Security Solutions

With the nature and use of the data determined, security comes to the forefront. As the value and compliance aspects of data become more prominent, it is essential to ensure that the data is secure. Whether choosing a DIY option or a commercial option, securing data is paramount, as unauthorized access can have costly consequences, as witnessed by several high-profile breaches at FedEx, Target, etc. Similarly, not losing data when breakdowns or outages happen is also crucial. Regular testing of backup and recovery processes is, of course, best practice.
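One simple form of recovery testing is to restore a file from backup and confirm it is byte-for-byte identical to the original. The sketch below does this by comparing SHA-256 checksums with Python's standard library. The function names and file paths are hypothetical, and a real recovery drill would also cover restore time, permissions, and application-level checks.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches(original: Path, restored: Path) -> bool:
    """Return True when the restored copy has the same checksum as the original."""
    return sha256_of(original) == sha256_of(restored)


if __name__ == "__main__":
    # Hypothetical paths: replace with a source file and its restored copy.
    source = Path("sensor_archive.bin")
    restored = Path("restore_test/sensor_archive.bin")
    if source.exists() and restored.exists():
        print("restore verified" if restore_matches(source, restored) else "restore MISMATCH")
```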

On-Site Storage vs. Public Cloud

Perhaps the most important consideration in the cloud era is the decision to host the data on-premises or in a public cloud. While advantages exist for both, only a true accounting of the complexity, control premiums, total cost of ownership and any inherent legacy switching costs can yield an objective assessment to drive decisions. On-premises infrastructure gives organizations more control but requires skills, resources, and strong processes for success. More importantly, the breadth of capability required to run a modern enterprise datacenter can be significant. Facility management, electrical and power infrastructure, IT procurement, management personnel and 24/7 support for users can be costly. Nevertheless, there may be situations where infrastructure control, security or proprietary considerations tilt the economics in favor of on-premises infrastructure.

Off-premises public clouds or co-located Infrastructure-as-a-Service (IaaS) models treat compute and storage as a service that organizations can procure on demand. This creates great flexibility, focuses IT resources on value-add work, and takes away the hassle of managing infrastructure. Robust regimes for backup, restore and on-demand scalability provide a growing enterprise with scalable data management infrastructure.

Increasingly, hybrid environments that combine on-premises and public cloud infrastructure are popular because they address stakeholder concerns about control and proprietary issues while offering flexibility and on-demand scale. Stakeholders also like the security and backup options that cloud vendors provide.

For an effective decision, a thorough total cost of ownership exercise is invaluable. As stakeholders evaluate options, it is important to outline the various aspects of infrastructure management along with the opportunity costs related to scale and flexibility.
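A total cost of ownership comparison can start as a very small model and be refined from there. The sketch below is a deliberately simplified illustration: every cost figure in it is a placeholder, not a vendor price or a benchmark, and the model ignores items such as egress fees, staffing changes, and hardware refresh cycles.

```python
def on_prem_tco(capex: float, annual_opex: float, years: int) -> float:
    """Upfront investment plus flat yearly running costs (facilities, power, staff, support)."""
    return capex + annual_opex * years


def cloud_tco(start_tb: float, annual_growth: float,
              price_per_tb_month: float, years: int) -> float:
    """Pay-as-you-go storage cost, billed on the capacity held at the start of each year."""
    total = 0.0
    capacity_tb = start_tb
    for _ in range(years):
        total += capacity_tb * price_per_tb_month * 12
        capacity_tb *= 1.0 + annual_growth  # stored data grows before the next year is billed
    return total


if __name__ == "__main__":
    # All inputs are placeholder assumptions for illustration only.
    years = 5
    on_prem = on_prem_tco(capex=1_000_000, annual_opex=200_000, years=years)
    cloud = cloud_tco(start_tb=1_000, annual_growth=0.40, price_per_tb_month=20.0, years=years)
    print(f"on-premises over {years} years: ${on_prem:,.0f}")
    print(f"public cloud over {years} years: ${cloud:,.0f}")
```

Which option comes out ahead depends entirely on the inputs, so the value of the exercise lies in making each assumption explicit and testing how sensitive the result is to growth rate and time horizon.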

Summary

An effective IoT cloud data storage strategy requires consideration of the following elements:

1. How the collected data will be used and how frequently the data will be accessed

2. The type, nature, volume, and velocity of data being collected

3. How the data will be secured, backed-up and recovered

4. The most efficient storage strategy: on-premise, cloud or hybrid infrastructure

5. The most effective ways to scale storage as needs inevitably will change

6. The Total Cost of Ownership (TCO)

For further insights and frameworks to help you determine the right cloud storage strategy for your business, attend the “Make Your Cloud Strategy Your Competitive Advantage” webinar, hosted by experts at Arrow, Microsoft and ReadWrite Labs.

Scott Chmiel

Microsoft Business Development Manager at Arrow Electronics

22+ years of experience in Embedded Hardware & Solutions, with 13 years at Arrow spanning multiple roles. Scott has a BSME degree from San Diego State University. He is based in Houston, Texas.