The world’s best cloud and big data software isn’t for sale. Instead you download it for free.
You won’t see Oracle, IBM, HP, or any of the erstwhile enterprise IT giants developing it, either. In fact, this incredibly rich treasure trove of software isn’t being developed by software vendors at all. It’s the Googles and Facebooks that are releasing it.
Well, add Yelp to that list.
Yelp, a quiet hero in open source, just released its internal PaaS (“platform as a service”), cleverly called PaaSTA, to the open source community. The coding genius behind PaaSTA at Yelp is Kyle Anderson, a site reliability engineer for the company who has been tinkering with servers for more than a decade. He and his team at Yelp worked on PaaSTA for 18 months, and today it runs more than 100 production applications at Yelp.
I sat down with Anderson to plumb the details of this impressive contribution.
Not Just Any Old PaaS
Open source is nothing new for Yelp. Indeed, it’s already a leading contributor to more than 58 other open-source projects.
But this is different.
This is Yelp giving away the secret sauce that powers its computing infrastructure, which has to scale to support user-generated reviews of more than 50 million local businesses in 32 countries. PaaSTA is Yelp’s internal platform for automating the deployment and management of services running inside Docker containers.
PaaSTA relies on three core components of Mesosphere’s Datacenter Operating System (DCOS), all of which are open source: Apache Mesos, Marathon and Chronos. Mesos handles the work of actually deploying containers onto servers, while Marathon (which was developed by Mesosphere) makes sure long-running PaaSTA services re-launch should something crash. Chronos schedules containers to launch at preordained times for recurring tasks or batch processing.
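To make that division of labor concrete, here is a toy sketch in Python. It does not use the real Mesos, Marathon, or Chronos APIs; the names `ToyCluster`, `reconcile`, and `due_jobs` are hypothetical and exist only to illustrate the three roles described above, under the simplifying assumption that a recurring job is just "run every N minutes."

```python
from dataclasses import dataclass, field


@dataclass
class ToyCluster:
    """Stand-in for the layer that actually launches containers (Mesos's role)."""
    running: dict = field(default_factory=dict)  # task_id -> service name
    next_id: int = 0

    def launch(self, service: str) -> int:
        task_id = self.next_id
        self.next_id += 1
        self.running[task_id] = service
        return task_id

    def crash(self, task_id: int) -> None:
        # Simulate a container dying.
        self.running.pop(task_id, None)


def reconcile(cluster: ToyCluster, desired: dict) -> list:
    """Marathon-like step: launch tasks until each service has its desired count."""
    launched = []
    for service, want in desired.items():
        have = sum(1 for s in cluster.running.values() if s == service)
        for _ in range(want - have):
            launched.append(cluster.launch(service))
    return launched


def due_jobs(schedule: dict, minute: int) -> list:
    """Chronos-like step: return jobs whose interval divides the current minute."""
    return sorted(job for job, every in schedule.items() if minute % every == 0)


if __name__ == "__main__":
    cluster = ToyCluster()
    reconcile(cluster, {"web": 2})         # initial launch of two instances
    cluster.crash(0)                       # one instance dies
    print(reconcile(cluster, {"web": 2}))  # the missing instance is relaunched
    print(due_jobs({"nightly": 60, "every5": 5}, minute=10))
```

The point of the sketch is only that the long-running supervisor and the timed scheduler are separate concerns sitting on top of one shared launch layer.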
ReadWrite: What is PaaSTA? Why did you go this route instead of using a commercial PaaS? Why are you giving it to the community as open source?
Kyle Anderson: PaaSTA is an opinionated PaaS built on existing, opinionated, open-source tools like Mesos and Marathon. It gives developers a coherent workflow for going from a raw idea, to a git repo, to a monitored service in production.
Commercial PaaS solutions are often less flexible than in-house solutions. In some sense an in-house built PaaS organically grows to meet an organization’s particular needs. Sometimes this is good, and can lead to good cohesion with your particular environment. Sometimes this is bad, and leads to crufty unmaintained spaghetti.
With PaaSTA, we needed something that was flexible enough to allow developers to make the transition from our legacy platform, and give us room to grow and remain flexible in the long term. PaaSTA is the outcome of this effort.
We are sharing PaaSTA with the community because we think it’s pretty cool, and we are proud of it! We want others to be able to benefit from what we’ve worked hard to create. We were only able to build such a cool PaaS by standing on the shoulders of some open-source giants.
PaaSTA In Action
RW: Describe how Yelp uses PaaSTA: scale, services, workloads.
KA: Yelp uses PaaSTA as the default platform for all new services, and for legacy services that are moving over at a rapid pace. Scale on this platform is a very tractable problem, thanks to the hard work that has already been put into scaling Mesos. If we need to get more hardware, we can do that, or if we need to burst we can scale up our ASGs in AWS.
But at the same time, scaling the “number” of services is also easy in PaaSTA.
This is because in PaaSTA, a service is just a git repo and a couple of config files describing how it should be run and monitored. This is super powerful for organizations with lots of teams. No team needs to be “blocked” on getting a new service running—the barrier to entry is very low for developers.
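As a rough illustration of "a git repo and a couple of config files," the Python sketch below merges one run config and one monitoring config into a single service definition. Every field name, default value, and the example git URL are assumptions made up for this sketch; it is not PaaSTA's actual file format or schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceDefinition:
    # All field names here are hypothetical, not PaaSTA's real schema.
    name: str
    git_url: str
    instances: int
    cpus: float
    mem_mb: int
    alert_team: str


def load_service(name: str, run_config: dict, monitoring_config: dict) -> ServiceDefinition:
    """Merge a 'how to run it' config and a 'how to monitor it' config."""
    if run_config["instances"] < 1:
        raise ValueError("a service needs at least one instance")
    return ServiceDefinition(
        name=name,
        git_url=run_config["git_url"],
        instances=run_config["instances"],
        cpus=float(run_config.get("cpus", 0.25)),   # assumed default
        mem_mb=int(run_config.get("mem_mb", 512)),  # assumed default
        alert_team=monitoring_config["team"],
    )


if __name__ == "__main__":
    svc = load_service(
        "example_service",
        {"git_url": "git@example.com:services/example_service.git", "instances": 3},
        {"team": "search"},
    )
    print(svc)
```

Because the whole definition is declarative data, adding a service in a setup like this means adding files rather than filing a provisioning request, which is the low barrier to entry Anderson describes.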
Splitting out applications into individual services is not a new technique, despite the “microservices” hype. It is the natural progression of things as organizations grow up and scale as an org. The important part is having a platform that can grow up with you. For Yelp, PaaSTA is that “growing up.”
PaaSTA right now just handles “stateless” workloads, either long running or scheduled. In our opinion, the storage primitives are a tad immature for running production stateful workloads (like MySQL, Cassandra, etc.). For those workloads, we use traditional infrastructure building tools. Luckily, that works great for Yelp! Most services that our developers develop are stateless.
On The Shoulders Of Mesos
RW: How does Apache Mesos fit into your strategy? Why did you elect to run your PaaS on Mesos?
KA: We chose Mesos as a proven technology for building scalable distributed systems in real-life production settings.
We also chose to build on it because of its “opinionated” nature. Mesos “does one thing well,” and that is the resource management of clusters. It doesn’t actually do the scheduling and deciding what to run; it leaves that up to frameworks.
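The split Anderson describes, in which the cluster manager hands out resources and a framework decides what runs on them, can be sketched in a few lines of Python. This is a simplified model and not the Mesos API: `Offer`, `Task`, and `framework_schedule` are hypothetical names, and the first-fit placement policy is an assumption chosen for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """Resources the cluster manager makes available on one machine."""
    agent: str
    cpus: float
    mem_mb: int


@dataclass(frozen=True)
class Task:
    """Work the framework wants to run, with its resource needs."""
    name: str
    cpus: float
    mem_mb: int


def framework_schedule(offers, pending):
    """Framework side: choose which pending tasks go on which offers (first fit)."""
    placements = []
    remaining = list(pending)
    for offer in offers:
        cpus, mem = offer.cpus, offer.mem_mb
        still_waiting = []
        for task in remaining:
            if task.cpus <= cpus and task.mem_mb <= mem:
                placements.append((task.name, offer.agent))
                cpus -= task.cpus
                mem -= task.mem_mb
            else:
                still_waiting.append(task)
        remaining = still_waiting
    return placements, [t.name for t in remaining]


if __name__ == "__main__":
    offers = [Offer("agent-1", 2.0, 2048), Offer("agent-2", 1.0, 1024)]
    pending = [Task("web", 1.0, 1024), Task("api", 1.5, 512), Task("worker", 1.0, 1024)]
    placed, unplaced = framework_schedule(offers, pending)
    print(placed, unplaced)
```

Swapping in a different placement policy only means replacing `framework_schedule`; the offer side stays untouched, which is the kind of flexibility the custom-framework approach depends on.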
We really like this model; it means we can start with relatively simple frameworks like Marathon and Chronos, but we can expand with our own custom frameworks. For example, we already use a custom Mesos framework called “Seagull” to handle running large test suites across a large number of Mesos slaves using Amazon Spot instances.
Another reason we believe that Mesos is a good foundation upon which to build a PaaS is the fact that it has pluggable executors and containerizers. That means that we are not locked into say, Docker. Docker is cool, but we don’t want to be locked into one particular container implementation.
I’m really excited for pluggable “containerizers.” So far, we limit things to particular CPU shares and memory, but wouldn’t it be cool if we could go up a level and start talking about costs per hour to run your service? Mesos doesn’t care what the metrics are; it just sees ints, floats, and sets.
I look forward to that day when we can see compute as a raw utility, and I believe that Mesos is going to help empower that shift in thinking.
Go Custom Or Commercial?
RW: What has been the payoff? The ROI? What are the benefits of your approach over how you did this before or if you had gone with an off-the-shelf PaaS from a vendor? How is your PaaS better for Yelp than alternatives you considered?
KA: The payoff is huge. The most obvious return comes from the lower barrier to writing new services.
It used to be that developers would have “a dream deferred” due to the large overhead of provisioning a new service, getting resources, adding monitoring, etc. Now it is so easy to launch services, we find that developers use the platform for launching experiments during our frequent hackathons.
It is a good sign that we built something good when developers choose to use it, even when they are free to do anything (during the hackathon).
Longer term, we expect to see better resource utilization out of our hardware thanks to real automated scheduling (as opposed to manual partitioning). We also look forward to fine-grained auto-scaling and taking advantage of opportunistic low cost servers (AWS Spot Fleet / Spot instances). We are already leveraging this in dev environments for bursty workloads.
From an organizational perspective, the biggest gain from building your own PaaS is the chance to reuse your existing engineering work on the components. For Yelp, this means we can reuse our existing service discovery mechanisms, monitoring tools, and Docker images, to allow service authors to incrementally grow into PaaSTA.
Often with other commercial PaaSes it is a much more significant migration.
By using existing open-source components with your PaaS, you can use them for auxiliary applications. For example, in PaaSTA we use Sensu to do the monitoring and alerting for services, but we can also use Sensu in a more generic way to monitor more conventional things like switches and routers.
A more turnkey PaaS from a vendor may have a tightly integrated monitoring solution with their product, but it is unlikely you can use that same monitoring tool for other things. With PaaSTA, the value of the whole is truly greater than the sum of its parts.
Lead photo by George Thomas