This post was written by Derrick Harris and Michael Hausenblas
After years of IT-industry debate over its merits, a viable, enterprise-ready private cloud architecture is finally here. And unlike other approaches and technologies introduced over the past decade, this one has already proven itself inside some of the world’s largest companies and most cutting-edge adopters of technology.
Importantly, we’re not talking about Infrastructure as a Service technology. That approach has been tried too many times to count by now, and it has yet to really pan out. Startups have come and gone, and large vendors have chased their tails — including on projects such as OpenStack — trying to no avail to make private IaaS a scalable business.
The problem is that IaaS isn’t the end game for most users of cloud computing — at least if they have their choice. Operational efficiency and scalable infrastructure are just means to the end that is developer productivity and business agility. For CIOs, it can be difficult to see the payoff in a heavy engineering project that only gets them halfway there.
That’s why the future of private cloud computing is built on another open source platform, Apache Mesos, and looks a lot more like Platform as a Service. It still delivers (and then some) on the operational efficiency often touted as a reason for deploying private clouds, but this Mesos-centric style of private cloud computing really works because it delivers on the faster, simpler and more flexible developer experience that has always been at the core of the cloud.
But don’t take our word for it. Take Gartner’s word. And take the words of Twitter, Apple, Yelp, HubSpot, Autodesk, eBay, Ericsson, Capgemini and more large companies that have built their own fully functional and mission-critical private PaaS systems on top of Mesos.
It’s arguable that focusing on replicating Infrastructure-as-a-Service cloud platforms such as AWS was the wrong idea all along. After all, AWS initially caught on because it was accessible in minutes with a credit card, not because it presented the best or easiest way to deploy applications.
Here’s what Gartner VP and Distinguished Analyst Thomas Bittman had to say about the idea of private PaaS in an October 2014 report about the biggest mistakes companies make when adopting private clouds:
“Although the vast majority of private clouds are IaaS, using virtual machines (VMs) as the unit of work, the value of raw IaaS is very limited. Even public cloud IaaS providers already include significant features on top of their IaaS offerings, including tools for developers, tools to provision and manage what’s ‘inside’ the VM, and more and more platform as a service (PaaS) services.”
…
“Some applications might be better served by being rewritten to a PaaS layer, by requiring interoperability with a public cloud PaaS offering, or by being obtained in a SaaS model from an external provider. Although private PaaS is still relatively uncommon, technologies to enable private PaaS will mature — especially for hybrid modes of cloud.”
In fact, they’re maturing right now; it was just a matter of time, because it has always been developers who drove adoption of cloud computing. They were the first users of AWS because it helped them skirt IT, the first users of Platform as a Service (early ones such as Heroku, for example) because it helped them avoid the complexities of AWS, and the first users of Software as a Service tools like New Relic because those tools helped them monitor their newly launched cloud apps.
And as Mårten Mickos, the former CEO of Eucalyptus Systems and, before that, MySQL, aptly (and succinctly) put it earlier this year:
Developers don’t ask for a server any more. They don’t even ask for a LAMP stack. They ask for APIs.
— Mårten Mickos (@martenmickos)
May 29, 2015
Probably some containers, too.
Essentially, developers want to be able to create and deploy new apps as part of a rapid code-deploy-test cycle. Continuous delivery, continuous integration and microservices don’t work when you’re waiting for IT to provision golden images. And, frankly, developers probably don’t care much where they’re deploying their apps or services, just as long as doing so is relatively easy.
This is where IT and operations come into the picture and can actually make a very big difference. By choosing the right software stack (let’s say Mesos and Docker, at least) smart CIOs can meet business-level requirements such as higher resource utilization, lower power bills and less downtime, while still providing the fast, flexible platform that developers demand.
For many Mesos users, including the publicly traded companies listed above, private PaaS is more than an emerging technology; it’s already here. Mesos provides the server-level scheduling and general resource-management capabilities and abstractions, while higher-level tools such as Marathon, Docker and some homemade (and often open source) tooling provide the developer experience.
Almost to a company, the PaaS-on-Mesos architecture has allowed users to dramatically increase the ease and speed at which developers deploy apps. Many users have been able to embrace microservices and even experiment with new big data frameworks thanks to Mesos, which schedules workloads based on the actual resources they need and supports nearly any type of workload on the same cluster.
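To make the idea of scheduling by declared resources concrete, here is a minimal sketch in Python. It is not how Mesos itself is implemented: the `Node`, `Task` and `place` names are hypothetical, and the first-fit rule is a deliberately simple stand-in for a real scheduler. It only illustrates how different kinds of workloads can share one cluster when each states the CPU and memory it needs.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A machine advertising its free resources (hypothetical model)."""
    name: str
    cpus: float
    mem_mb: int
    tasks: List[str] = field(default_factory=list)


@dataclass
class Task:
    """A unit of work that declares the resources it needs (hypothetical model)."""
    name: str
    cpus: float
    mem_mb: int


def place(task: Task, nodes: List[Node]) -> Optional[str]:
    """First-fit placement: use the first node with enough free CPU and memory."""
    for node in nodes:
        if node.cpus >= task.cpus and node.mem_mb >= task.mem_mb:
            # Reserve the resources so later tasks see the reduced capacity.
            node.cpus -= task.cpus
            node.mem_mb -= task.mem_mb
            node.tasks.append(task.name)
            return node.name
    return None  # no node can currently fit the task


# Illustrative cluster and workload mix; all names and sizes are made up.
nodes = [
    Node("node-1", cpus=4.0, mem_mb=8192),
    Node("node-2", cpus=8.0, mem_mb=16384),
]
workload = [
    Task("web-service", cpus=1.0, mem_mb=1024),     # long-running service
    Task("spark-executor", cpus=4.0, mem_mb=8192),  # analytics job
    Task("ci-build", cpus=2.0, mem_mb=2048),        # short-lived batch task
]
placements = {task.name: place(task, nodes) for task in workload}
print(placements)
```

In this toy run the service and the CI build end up sharing one machine while the analytics job lands on the other, which is the kind of mixed-workload packing the paragraph above describes.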
Several PaaS frameworks that have been built by large companies in order to run on Mesos (and, by extension, the DCOS) have since been open sourced. These include:
Marathon: Created and supported by Mesosphere and pre-installed as part of our Datacenter Operating System (DCOS) product, Marathon was designed to run long-running services and often serves as the deployment environment for Docker containers in PaaS environments. Marathon handles resource allocation and availability for services running on it.
Apache Aurora: Aurora was originally developed at Twitter (probably the world’s largest Mesos user, at tens of thousands of nodes within its datacenters) as a PaaS-type layer, and now manages resources for many of the company’s core services. Like Marathon, Aurora is responsible for ensuring jobs keep running even in the face of server failures.
Singularity: Singularity was developed by HubSpot after it re-architected its large footprint of AWS images to be managed by Mesos. HubSpot calls Singularity a “PaaS in a box,” meaning it provides enough abstraction that users need not even be familiar with Mesos in order to launch jobs.
DEIS: Engine Yard has been a leading public PaaS provider for years, and it recently revamped its core platform to provide strong support for private Docker-based platforms via DEIS. It lets users deploy private PaaS environments that mirror their public ones. The DEIS project began integrating the technology with Mesos earlier this year.
Apollo: This is a particularly interesting project, as it was developed by leading consulting firm and systems integrator Capgemini to serve some of the company’s large-enterprise clients. Apollo utilizes a number of additional components, including Terraform and Packer, to let users build both private IaaS and private PaaS environments.
Ochothon: Computer-aided design specialist Autodesk originally created a container-orchestration layer called Ochopod in order to simplify its internal IT processes, and Ochothon is a version designed specifically to run on top of Marathon. As the company moves toward a Mesos-centric infrastructure, Ochothon provides a set of high-level capabilities for automating how containers within a cluster interact with one another.
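Several of the frameworks above, Marathon and Aurora in particular, are described as keeping services running when servers fail. The sketch below shows the general idea as a single reconciliation pass, under heavy simplification: the `reconcile` function and its data shapes are hypothetical and do not reflect the internals of either project.

```python
from typing import Dict, List, Set


def reconcile(
    desired: Dict[str, int],
    running: Dict[str, List[str]],
    healthy_nodes: Set[str],
) -> Dict[str, int]:
    """Compare desired instance counts with instances on healthy nodes.

    Returns how many new instances of each service need to be launched.
    `running` maps a service name to the nodes hosting its instances and
    is updated in place to drop instances lost with failed nodes.
    """
    to_launch: Dict[str, int] = {}
    for service, want in desired.items():
        alive = [node for node in running.get(service, []) if node in healthy_nodes]
        running[service] = alive
        missing = want - len(alive)
        if missing > 0:
            to_launch[service] = missing
    return to_launch


# Illustrative state: node-2 has failed and taken two instances with it.
desired = {"api": 3, "worker": 2}
running = {
    "api": ["node-1", "node-2", "node-3"],
    "worker": ["node-2", "node-4"],
}
healthy = {"node-1", "node-3", "node-4"}

to_launch = reconcile(desired, running, healthy)
print(to_launch)
```

A real framework would run a loop like this continuously and hand the missing instances back to the scheduler for placement; here the result is simply reported.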
Mesosphere has taken open source support a step further by adding DCOS integration with additional container-orchestration and PaaS systems that weren’t developed with Mesos in mind but still provide a lot of functionality. These include the Google-led Kubernetes project, Docker’s Swarm, Red Hat’s OpenShift and, eventually, Cloud Foundry.
A whole other collection of homegrown PaaS-on-Mesos systems has been created over the past couple of years but not open sourced. Among the companies discussing theirs publicly are:
Yelp: Yelp constructed a Docker-based microservices architecture, called PaaSTA, on top of Marathon. It allows automatic deployment of Docker containers and migration of services across both in-house hardware and AWS machine images. PaaSTA and related efforts have been critical to Yelp’s continuous deployment environment, and the company currently launches more than a million containers per day as part of its code-testing process.
Apple: Apple built a custom Mesos scheduler called J.A.R.V.I.S. (Just A Rather Very Intelligent Scheduler) as a backend to power its entire Siri application. The Mesos cluster spans thousands of nodes and J.A.R.V.I.S. makes it easier for developers to deploy the services that comprise Siri.
eBay: In eBay’s case, the goal was to migrate from the existing (dedicated VM-based) continuous integration solution to a Mesos-based one. In its setup, eBay provides each developer a Jenkins instance, using Mesos and Marathon, and Mesos actually runs on top of OpenStack instances.
Ericsson: The telecommunications giant is using Mesos and Marathon as building blocks for a PaaS system that can power data analytics and enforce SLAs across its thousands of datacenters globally.
However, while all of the aforementioned examples show the promise of what’s possible with Mesos, the reality is that not every company will have the resources or the desire to build mission-critical systems using pure open source — much less build them from scratch.
The Mesosphere Datacenter Operating System (DCOS) makes it relatively easy to build a private PaaS, either on-premises or in the public cloud, by providing all the necessary components and primitives. The DCOS provides all the functionality of open source Mesos, plus major improvements in terms of UI/UX, SDKs and commercial support.
At a high level, the architecture consists of three layers: IaaS at the bottom, the DCOS in the middle and PaaS on top.
The IaaS layer in this case is strictly about provisioning and managing machines. They could be physical machines, virtual machines or public cloud instances, as long as they’re running Linux. It’s the DCOS layer that provides resource abstraction by aggregating and managing memory, CPU cycles, networking resources and storage across the cluster of machines. The DCOS delivers the principal building blocks for building distributed applications in a fault-tolerant and scalable manner.
The PaaS layer introduces the notion of applications, application groups and services, building on the resource abstraction of the DCOS layer. The default PaaS service in the DCOS is called Marathon, which is an open source technology developed by Mesosphere. As Yelp and others have proven, however, Marathon can also serve as the basis for an even more customized layer, often involving specific methods for configuring the containers that will run on Marathon.
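For a sense of what an application looks like at the PaaS layer, here is a sketch in Python that builds a Marathon-style app definition for a Docker container. The field names follow the JSON app format Marathon has used for Docker containers, but the exact schema varies between Marathon versions, so treat this as an assumption to check against the documentation for your release. The `marathon_app` helper, the app ID and the image name are all hypothetical.

```python
import json


def marathon_app(
    app_id: str,
    image: str,
    instances: int,
    cpus: float,
    mem_mb: int,
    container_port: int,
) -> dict:
    """Build a Marathon-style app definition for a Docker container.

    Hypothetical helper; field names assume the Docker container format
    used by Marathon's JSON app definitions.
    """
    return {
        "id": app_id,            # a path-like ID places the app in a group
        "instances": instances,  # how many copies to keep running
        "cpus": cpus,            # CPU share requested per instance
        "mem": mem_mb,           # memory in MB requested per instance
        "container": {
            "type": "DOCKER",
            "docker": {
                "image": image,
                "network": "BRIDGE",
                # hostPort 0 asks for a dynamically assigned host port
                "portMappings": [{"containerPort": container_port, "hostPort": 0}],
            },
        },
    }


# Illustrative values only; the group, app name and image are made up.
app = marathon_app(
    app_id="/shop/frontend",
    image="example/frontend:1.0",
    instances=3,
    cpus=0.5,
    mem_mb=256,
    container_port=8080,
)
payload = json.dumps(app, indent=2)
print(payload)
```

A definition like this would typically be submitted to Marathon’s REST API as a JSON request body. The developer states what to run and how much of it, and the layers below decide where it runs.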
In addition to the general advantages of PaaS, the DCOS also makes it simple to deploy hybrid cloud architectures — meaning your private PaaS can run on the public cloud. Workload portability is a core guarantee of the DCOS, so moving parts or all of the application from on-premises environments into public cloud environments (or the other way round) is straightforward. Resources are abstracted the same way, the user experience remains the same and no code changes are required.
The reality of business today is that business demands are changing fast, which means the demands on IT infrastructure and developers are changing fast, too. Companies have been looking to private cloud as the answer to helping the latter keep up with the former, and the private cloud is finally starting to respond. It might not look like we all thought it would six years ago, but that’s just fine.
Because this time, it works.