Guest author Robert J. Moore is the CEO of RJMetrics, a provider of business intelligence to online companies.
Much of the growth in software-as-a-service companies is being driven by a “land and expand” strategy: Tools first get used by individuals, then by small teams, and so on up the organizational hierarchy.
Eventually the company finds itself signing a large-scale deal, all without a competitive process or RFP. Companies like Slack and Dropbox craft their pricing models specifically to encourage this behavior. Slack knew exactly what it was doing when it decided to allow accounts to grow to unlimited users for free, and it was certainly an effective strategy to penetrate my company, RJMetrics. First some engineers were using it, then all the engineers, then the rest of the company. Eventually we reached a point of no return, and paying for it was pretty much our only option.
Enter The Shadow
This phenomenon is called Shadow IT: technology decisions getting made without input from, and sometimes without even awareness of, traditional IT organizations. Many IT leaders I talk to seem to think that the inmates are running the asylum, and are pushing back hard to regain control. Typically, this is being done in the name of security and compliance.
Users see it the other way around: they feel like they are finally throwing off the yoke of IT departments that have failed to innovate on their behalf.
Ultimately, it doesn’t matter whether this trend is good or bad; either way, it’s happening. Technology no longer lives only in the workplace, and enterprise software is no longer controlled solely by the IT hierarchy. Pandora’s box is not going to close again, so whether this state of the world is desirable just isn’t particularly relevant.
We didn’t mind paying; I think Slack is worth every penny. What’s interesting, though, is the position it put us in with regard to our data. All of a sudden, there’s this company that knows more about the way my team communicates than I do. That data seems like it could be quite useful; maybe I could even use it to understand and improve my business. But right now, Slack has that data and we don’t.
The Data Dilemma
Take this problem and multiply it by the number of SaaS tools your company is currently using: CRM, helpdesk, productivity, file storage, and on and on. They all know about your employees, your customers, your products (everything about your business), but you don’t have access to that data. Why is that?
Traditionally, software ran on-premises, and all of the data stores sitting behind the applications were directly accessible to the IT organizations maintaining them. Accessing the data was as easy as opening up a SQL terminal (which is to say, very easy).
But with applications deployed in the cloud, the data stores behind them became inaccessible. The only way to access data living within cloud applications is via the APIs exposed by application developers. And unlike SQL, a universal language for accessing data, the API for every application is different. This means that if your organization uses ten SaaS products, you need to integrate with ten different APIs to get that data, a nontrivial task.
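To make the integration burden concrete, here is a minimal sketch in Python. The two payload shapes, the field names, and the adapter functions are all hypothetical (they are not taken from any real vendor's API); the point is only that each source needs its own adapter before the records can share one schema.

```python
from typing import Any, Callable, Dict, List, Tuple

Record = Dict[str, str]

# Hypothetical payloads: neither shape comes from a real vendor API.
helpdesk_payload = {
    "tickets": [
        {"id": 7, "requester": {"email": "ann@example.com"}, "status": "open"},
    ]
}
crm_payload = {
    "data": {
        "contacts": [
            {"contact_id": "c-19", "email_address": "ann@example.com", "stage": "customer"},
        ]
    }
}


def from_helpdesk(payload: Dict[str, Any]) -> List[Record]:
    """Map the hypothetical helpdesk shape onto the common schema."""
    return [
        {
            "source": "helpdesk",
            "record_id": str(t["id"]),
            "email": t["requester"]["email"],
            "state": t["status"],
        }
        for t in payload["tickets"]
    ]


def from_crm(payload: Dict[str, Any]) -> List[Record]:
    """Map the hypothetical CRM shape onto the common schema."""
    return [
        {
            "source": "crm",
            "record_id": c["contact_id"],
            "email": c["email_address"],
            "state": c["stage"],
        }
        for c in payload["data"]["contacts"]
    ]


Adapter = Callable[[Dict[str, Any]], List[Record]]


def normalize_all(sources: List[Tuple[Adapter, Dict[str, Any]]]) -> List[Record]:
    """Run each source's adapter and collect the rows in one list."""
    rows: List[Record] = []
    for adapter, payload in sources:
        rows.extend(adapter(payload))
    return rows


rows = normalize_all([(from_helpdesk, helpdesk_payload), (from_crm, crm_payload)])
```

With ten SaaS products you would be writing and maintaining ten such adapters, plus the authentication, paging, and error handling that this sketch leaves out.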
What organizations really need is to get all of that data into a single, high-performance, SQL-based analytical database. Fortunately, there are plenty of great choices as to what that database should look like. HP Vertica, Amazon Redshift, Snowflake, and others are innovating aggressively in this space. In fact, Redshift is Amazon Web Services’ fastest-growing product in history.
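As a rough illustration of why a single SQL store matters, the sketch below uses Python's built-in sqlite3 module as a small stand-in for an analytical database (it is not Redshift, Vertica, or Snowflake). The table name, columns, and sample rows are assumptions made up for the example.

```python
import sqlite3
from typing import List

# In-memory SQLite database standing in for a real analytical warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE saas_records (source TEXT, email TEXT, state TEXT)")

# Made-up rows, as if already extracted from two different SaaS tools.
conn.executemany(
    "INSERT INTO saas_records VALUES (?, ?, ?)",
    [
        ("helpdesk", "ann@example.com", "open"),
        ("helpdesk", "bob@example.com", "closed"),
        ("crm", "ann@example.com", "customer"),
    ],
)


def customers_with_open_tickets(db: sqlite3.Connection) -> List[str]:
    """Join helpdesk rows to CRM rows on email in one SQL query."""
    query = """
        SELECT h.email
        FROM saas_records AS h
        JOIN saas_records AS c ON c.email = h.email
        WHERE h.source = 'helpdesk' AND h.state = 'open'
          AND c.source = 'crm' AND c.state = 'customer'
        ORDER BY h.email
    """
    return [row[0] for row in db.execute(query)]


result = customers_with_open_tickets(conn)
```

Once the data sits in one place, a question that spans two tools becomes a single join rather than two API integrations stitched together in application code.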
Business owners of the past would shudder at the thought of their data warehouse living in the cloud, but today this has become less of a concern. The advantages of cloud deployment—managed services with low upfront investment—extend to data warehouses as well. And because of the larger trend towards cloud services, much of the modern company’s data is already there in the first place.
But that still leaves an open question: How do you move the data from all of the sources where it lives into this analytical database? Today this is an unsolved problem, but it’s one that’s being actively worked on.
Today’s solutions fall into a few buckets:
- Legacy ETL tools (feature-rich but heavyweight, hard to use, and expensive)
- Home-grown solutions (unreliable, difficult to scale, and costly to maintain)
- Open source libraries (promising but immature; still require technical investment)
- SaaS products (still early in development)
This is a problem that businesses today are just waking up to. The transition to cloud services is happening in real time, and businesses are just beginning to realize how in the dark they are when it comes to analyzing this data.
Five years ago there were very few companies actively pursuing answers to these questions. Companies like LinkedIn, Facebook, and Spotify built their own data-processing engines, which have matured into significant competitive advantages. Today, many more online businesses are following in their footsteps but need an answer that doesn’t require dozens of full-time engineers to build and maintain.
The transition to SaaS is a massive paradigm shift and companies are just learning to adapt. I have very little doubt that these companies will rely on SaaS products to build the infrastructure that ties together other SaaS products.
And thus does the snake eat its own tail.
Photo by Hamed Saber