Apache Mesos is a cluster manager that simplifies running tasks on a shared pool of servers. Docker is a lightweight container technology for deploying packaged services, similar in concept to a virtual machine, but without the overhead.
Mesos added support for Docker in the 0.20.0 release and subsequently fixed some fairly large limitations in the following 0.20.1 patch release. The combination of Mesos + Docker provides a very powerful platform for deploying applications and services in a clustered environment.
This tutorial will explain how to use Mesos 0.20.1 and Docker 1.2.0 to write a simple Mesos framework in Java that will start some containers.
A typical Mesos deployment consists of one or more servers running the mesos-master (one live instance and one or more standby instances) and a cluster of servers running the mesos-slave component. The slaves register with the master and offer “resources”, i.e. capacity to run tasks. The master passes those resource offers on to the deployed frameworks, receives their instructions to run tasks, and delegates those instructions back to the slaves.
Multiple frameworks can be deployed concurrently and share the resources available in the cluster. For example, Apache Spark and Cassandra both have Mesos frameworks available, allowing them both to be deployed on the same cluster.
A framework consists of a scheduler and optionally one or more executors. The scheduler connects to the mesos-master and accepts or rejects resource offers from slaves and then provides instructions on what tasks to execute on those slaves.
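The accept-or-reject decision described above can be illustrated with a toy model that does not use the Mesos API at all. This is only a sketch under simplifying assumptions: OfferCycleSketch and ToyScheduler are hypothetical names, an offer is reduced to the ID of the slave that made it, and the scheduler simply accepts offers until it has launched the number of tasks it wants.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of one offer cycle; none of these types are Mesos classes. */
public class OfferCycleSketch {

    /** Scheduler side: decides, offer by offer, whether to launch a task. */
    public static class ToyScheduler {
        private final int desiredTasks;
        private final List<String> launchedOn = new ArrayList<>();
        private final List<String> declined = new ArrayList<>();

        public ToyScheduler(int desiredTasks) {
            this.desiredTasks = desiredTasks;
        }

        /** Called by the "master" with the IDs of slaves that have spare capacity. */
        public void resourceOffers(List<String> offeringSlaves) {
            for (String slaveId : offeringSlaves) {
                if (launchedOn.size() < desiredTasks) {
                    launchedOn.add(slaveId);   // accept: run a task on this slave
                } else {
                    declined.add(slaveId);     // reject: nothing more to run
                }
            }
        }

        public List<String> launchedOn() { return launchedOn; }

        public List<String> declined() { return declined; }
    }

    public static void main(String[] args) {
        ToyScheduler scheduler = new ToyScheduler(2);
        scheduler.resourceOffers(Arrays.asList("slave-1", "slave-2", "slave-3"));
        System.out.println("launched on " + scheduler.launchedOn()); // prints launched on [slave-1, slave-2]
        System.out.println("declined " + scheduler.declined());      // prints declined [slave-3]
    }
}
```

The real scheduler developed in this tutorial follows the same shape, with Mesos protobuf objects in place of the plain strings.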
Mesos has default executors for running shell scripts and, since the 0.20.0 release, for launching Docker containers. It is also possible to write executors in Java and other languages. In this case, the executor binaries (jar files in the case of Java) must be available from a central location such as HDFS so that the slaves can download them. With the introduction of Docker, there is now the possibility of packaging the executors directly inside a Docker image, making the deployment process much simpler.
This tutorial will demonstrate how to develop a framework with a scheduler that will start Docker containers on one or more slaves. There is no need to develop an executor in this case since the default Mesos executor will be used.
The full source code is available on GitHub: https://github.com/codefutures/mesos-docker-tutorial
The main class to implement is ExampleScheduler, which implements the Scheduler interface. The two methods this tutorial focuses on are:
    void resourceOffers(org.apache.mesos.SchedulerDriver schedulerDriver,
                        java.util.List<org.apache.mesos.Protos.Offer> list);

    void statusUpdate(org.apache.mesos.SchedulerDriver schedulerDriver,
                      org.apache.mesos.Protos.TaskStatus taskStatus);
The resourceOffers() method is called whenever slaves have available resources (capacity to run tasks). The scheduler can then decide whether to accept any of these offers and schedule tasks to run.
The statusUpdate() method will be called to notify the scheduler of the status of the tasks that were scheduled. The example code will look at the task status (TASK_RUNNING, TASK_FAILED, TASK_FINISHED) to keep track of how many containers are running.
In this tutorial the scheduler will attempt to maintain a certain number of running tasks. The following instance variables are used to keep track of pending task IDs and running task IDs.
    /** List of pending tasks. */
    private final List<String> pendingInstances = new ArrayList<>();

    /** List of running tasks. */
    private final List<String> runningInstances = new ArrayList<>();

An AtomicInteger is used to generate sequential task IDs:

    /** Task ID generator. */
    private final AtomicInteger taskIDGenerator = new AtomicInteger();

The resourceOffers() method iterates over the list of offers received and decides whether to launch any tasks:
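The bookkeeping built from these instance variables can be isolated from the Mesos API and exercised on its own. The following is a minimal sketch rather than the tutorial's actual code: the class InstanceTracker and its method names are hypothetical, and the methods are synchronized purely as a precaution in case the counts are read from more than one thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper (not part of the tutorial repo) that tracks task IDs. */
public class InstanceTracker {

    private final List<String> pendingInstances = new ArrayList<>();
    private final List<String> runningInstances = new ArrayList<>();
    private final AtomicInteger taskIDGenerator = new AtomicInteger();
    private final int desiredInstances;

    public InstanceTracker(int desiredInstances) {
        this.desiredInstances = desiredInstances;
    }

    /** True while fewer tasks are pending or running than desired. */
    public synchronized boolean needsMoreInstances() {
        return runningInstances.size() + pendingInstances.size() < desiredInstances;
    }

    /** Generates the next sequential task ID and records it as pending. */
    public synchronized String newPendingTask() {
        String taskId = Integer.toString(taskIDGenerator.incrementAndGet());
        pendingInstances.add(taskId);
        return taskId;
    }

    /** Moves a task from pending to running. */
    public synchronized void markRunning(String taskId) {
        pendingInstances.remove(taskId);
        runningInstances.add(taskId);
    }

    /** Forgets a task that has stopped, whichever list it was in. */
    public synchronized void markTerminated(String taskId) {
        pendingInstances.remove(taskId);
        runningInstances.remove(taskId);
    }

    public synchronized int pendingCount() { return pendingInstances.size(); }

    public synchronized int runningCount() { return runningInstances.size(); }

    public static void main(String[] args) {
        InstanceTracker tracker = new InstanceTracker(2);
        while (tracker.needsMoreInstances()) {
            System.out.println("Would launch task " + tracker.newPendingTask());
        }
        tracker.markRunning("1");
        System.out.println("pending=" + tracker.pendingCount()
                + ", running=" + tracker.runningCount()); // prints pending=1, running=1
    }
}
```

Counting pending tasks as well as running ones is what stops the scheduler from launching extra containers while earlier launches are still starting up.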
    @Override
    public void resourceOffers(SchedulerDriver schedulerDriver, List<Protos.Offer> offers) {

        logger.info("resourceOffers() with {} offers", offers.size());

        for (Protos.Offer offer : offers) {

            List<Protos.TaskInfo> tasks = new ArrayList<>();
            if (runningInstances.size() + pendingInstances.size() < desiredInstances) {

                // generate a unique task ID
                Protos.TaskID taskId = Protos.TaskID.newBuilder()
                        .setValue(Integer.toString(taskIDGenerator.incrementAndGet())).build();

                logger.info("Launching task {}", taskId.getValue());
                pendingInstances.add(taskId.getValue());

                // docker image info
                Protos.ContainerInfo.DockerInfo.Builder dockerInfoBuilder =
                        Protos.ContainerInfo.DockerInfo.newBuilder();
                dockerInfoBuilder.setImage(imageName);
                dockerInfoBuilder.setNetwork(Protos.ContainerInfo.DockerInfo.Network.BRIDGE);

                // container info
                Protos.ContainerInfo.Builder containerInfoBuilder = Protos.ContainerInfo.newBuilder();
                containerInfoBuilder.setType(Protos.ContainerInfo.Type.DOCKER);
                containerInfoBuilder.setDocker(dockerInfoBuilder.build());

                // create task to run
                Protos.TaskInfo task = Protos.TaskInfo.newBuilder()
                        .setName("task " + taskId.getValue())
                        .setTaskId(taskId)
                        .setSlaveId(offer.getSlaveId())
                        .addResources(Protos.Resource.newBuilder()
                                .setName("cpus")
                                .setType(Protos.Value.Type.SCALAR)
                                .setScalar(Protos.Value.Scalar.newBuilder().setValue(1)))
                        .addResources(Protos.Resource.newBuilder()
                                .setName("mem")
                                .setType(Protos.Value.Type.SCALAR)
                                .setScalar(Protos.Value.Scalar.newBuilder().setValue(128)))
                        .setContainer(containerInfoBuilder)
                        .setCommand(Protos.CommandInfo.newBuilder().setShell(false))
                        .build();

                tasks.add(task);
            }

            // launching with an empty task list declines the offer
            Protos.Filters filters = Protos.Filters.newBuilder().setRefuseSeconds(1).build();
            schedulerDriver.launchTasks(offer.getId(), tasks, filters);
        }
    }
As you can see, this method builds up a task definition. Let’s walk through this in more detail.
Since the scheduler is running Docker tasks, the first step is to define the Docker image to be used. The docker image name must be specified, and optional configuration items include network configuration and port mappings. In this example, bridge networking is used, so that each container gets its own IP address, which is the default behaviour when using Docker.
    Protos.ContainerInfo.DockerInfo.Builder dockerInfoBuilder =
            Protos.ContainerInfo.DockerInfo.newBuilder();
    dockerInfoBuilder.setImage(imageName);
    dockerInfoBuilder.setNetwork(Protos.ContainerInfo.DockerInfo.Network.BRIDGE);
Next, the container information must be specified, mainly providing a reference to the Docker image definition.
    Protos.ContainerInfo.Builder containerInfoBuilder = Protos.ContainerInfo.newBuilder();
    containerInfoBuilder.setType(Protos.ContainerInfo.Type.DOCKER);
    containerInfoBuilder.setDocker(dockerInfoBuilder.build());
Finally, the task must be defined.
    // create task to run
    Protos.TaskInfo task = Protos.TaskInfo.newBuilder()
            .setName("task " + taskId.getValue())
            .setTaskId(taskId)
            .setSlaveId(offer.getSlaveId())
            .addResources(Protos.Resource.newBuilder()
                    .setName("cpus")
                    .setType(Protos.Value.Type.SCALAR)
                    .setScalar(Protos.Value.Scalar.newBuilder().setValue(1)))
            .addResources(Protos.Resource.newBuilder()
                    .setName("mem")
                    .setType(Protos.Value.Type.SCALAR)
                    .setScalar(Protos.Value.Scalar.newBuilder().setValue(128)))
            .setContainer(containerInfoBuilder)
            .setCommand(Protos.CommandInfo.newBuilder().setShell(false))
            .build();
The task definition specifies the resources needed (1 CPU and 128 MB RAM), a reference to the Docker container information, and the command to run. In this case the command is left empty with shell set to false, so that the default Docker image entry point will be used.
    .setContainer(containerInfoBuilder)
    .setCommand(Protos.CommandInfo.newBuilder().setShell(false))
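Note that the tutorial code requests 1 CPU and 128 MB for every task without first checking how much capacity the offer actually contains. A scheduler would usually make that comparison before accepting an offer. The sketch below shows the arithmetic only, with a plain Java map standing in for the offer's resource list; OfferCheck, tasksThatFit and hasEnough are hypothetical names and not Mesos APIs.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical capacity check; the map stands in for an offer's scalar resources. */
public class OfferCheck {

    // Per-task requirements used in this tutorial's task definition.
    public static final double REQUIRED_CPUS = 1;
    public static final double REQUIRED_MEM_MB = 128;

    /** Number of tasks of the above size that fit into the offered resources. */
    public static int tasksThatFit(Map<String, Double> offered) {
        double cpus = offered.getOrDefault("cpus", 0.0);
        double mem = offered.getOrDefault("mem", 0.0);
        int byCpu = (int) Math.floor(cpus / REQUIRED_CPUS);
        int byMem = (int) Math.floor(mem / REQUIRED_MEM_MB);
        return Math.min(byCpu, byMem);
    }

    /** True if at least one task fits. */
    public static boolean hasEnough(Map<String, Double> offered) {
        return tasksThatFit(offered) > 0;
    }

    public static void main(String[] args) {
        Map<String, Double> big = new HashMap<>();
        big.put("cpus", 4.0);
        big.put("mem", 2048.0);

        Map<String, Double> small = new HashMap<>();
        small.put("cpus", 0.5);
        small.put("mem", 2048.0);

        System.out.println(tasksThatFit(big));  // prints 4
        System.out.println(hasEnough(small));   // prints false
    }
}
```

In the real scheduler the same values would be read from the resources attached to each Protos.Offer rather than from a map.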
The statusUpdate() method will simply update the pendingInstances and runningInstances lists based on the task status.
    @Override
    public void statusUpdate(SchedulerDriver driver, Protos.TaskStatus taskStatus) {

        final String taskId = taskStatus.getTaskId().getValue();

        logger.info("statusUpdate() task {} is in state {}", taskId, taskStatus.getState());

        switch (taskStatus.getState()) {
            case TASK_RUNNING:
                pendingInstances.remove(taskId);
                runningInstances.add(taskId);
                break;
            case TASK_FAILED:
            case TASK_FINISHED:
                pendingInstances.remove(taskId);
                runningInstances.remove(taskId);
                break;
        }

        logger.info("Number of instances: pending={}, running={}",
                pendingInstances.size(), runningInstances.size());
    }
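The switch above reacts to TASK_RUNNING, TASK_FAILED and TASK_FINISHED only. A task can also end in other states such as TASK_KILLED or TASK_LOST, in which case its ID would stay in the lists and the scheduler would not replace it. One way to handle this is to classify states as terminal or not. The sketch below assumes the set of states listed in its enum, which is a local stand-in mirroring names from Protos.TaskState so that the example runs without the Mesos jar; TaskStates and isTerminal are hypothetical names.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical classification of task states into terminal and non-terminal. */
public class TaskStates {

    /** Local stand-in for the state names in Protos.TaskState (assumed set). */
    public enum TaskState {
        TASK_STAGING, TASK_STARTING, TASK_RUNNING,
        TASK_FINISHED, TASK_FAILED, TASK_KILLED, TASK_LOST
    }

    // States after which the task will never run again.
    private static final Set<TaskState> TERMINAL = EnumSet.of(
            TaskState.TASK_FINISHED, TaskState.TASK_FAILED,
            TaskState.TASK_KILLED, TaskState.TASK_LOST);

    /** True if a task in this state should be removed from both lists. */
    public static boolean isTerminal(TaskState state) {
        return TERMINAL.contains(state);
    }

    public static void main(String[] args) {
        for (TaskState state : TaskState.values()) {
            System.out.println(state + " terminal=" + isTerminal(state));
        }
    }
}
```

With a helper like this, statusUpdate() could remove the task ID from both lists whenever the reported state is terminal, instead of enumerating individual cases.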
The scheduler will be created by the framework, which is basically just the main() method that is invoked from the command line. The framework creates the scheduler and registers it with Mesos:
    FrameworkInfo.Builder frameworkBuilder = FrameworkInfo.newBuilder()
            .setName("CodeFuturesExampleFramework")
            .setUser("") // Have Mesos fill in the current user.
            .setFailoverTimeout(frameworkFailoverTimeout); // timeout in seconds

    final Scheduler scheduler = new ExampleScheduler(imageName, totalTasks);

    MesosSchedulerDriver driver = new MesosSchedulerDriver(scheduler, frameworkBuilder.build(), masterIpAndPort);
    driver.run();
These instructions are for running everything on a single node, but can easily be adapted to run on multiple nodes.
Running Mesos master and slave:
    nohup mesos-master --ip=127.0.0.1 --work_dir=/tmp >mesos-master.log 2>&1 &
    nohup mesos-slave --master=127.0.0.1:5050 --containerizers=docker,mesos >mesos-slave.log 2>&1 &

Check out the code from the GitHub repo:

    git clone https://github.com/codefutures/mesos-docker-tutorial.git
    cd mesos-docker-tutorial
    mvn package
Run the code:
Launch the framework to run 2 instances of the fedora/apache image.
java -classpath target/cf-tutorial-mesos-docker-1.0-SNAPSHOT-jar-with-dependencies.jar com.codefutures.tutorial.mesos.docker.ExampleFramework 127.0.0.1:5050 fedora/apache 2
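The command above passes three positional arguments: the master address, the Docker image name and the number of instances. A sketch of how a main() method might validate them before creating the scheduler is shown below. FrameworkArgs is a hypothetical class, and the ExampleFramework class in the repository may parse its arguments differently.

```java
/** Hypothetical holder for the three command-line arguments. */
public class FrameworkArgs {

    public final String masterIpAndPort;
    public final String imageName;
    public final int totalTasks;

    private FrameworkArgs(String masterIpAndPort, String imageName, int totalTasks) {
        this.masterIpAndPort = masterIpAndPort;
        this.imageName = imageName;
        this.totalTasks = totalTasks;
    }

    /** Parses "master-ip:port image-name task-count" or throws IllegalArgumentException. */
    public static FrameworkArgs parse(String[] args) {
        if (args.length != 3) {
            throw new IllegalArgumentException(
                    "Usage: ExampleFramework <master-ip:port> <docker-image> <task-count>");
        }
        int totalTasks;
        try {
            totalTasks = Integer.parseInt(args[2]);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Task count must be a number: " + args[2], e);
        }
        if (totalTasks < 1) {
            throw new IllegalArgumentException("Task count must be at least 1");
        }
        return new FrameworkArgs(args[0], args[1], totalTasks);
    }

    public static void main(String[] args) {
        FrameworkArgs parsed = parse(new String[] {"127.0.0.1:5050", "fedora/apache", "2"});
        System.out.println(parsed.masterIpAndPort + " " + parsed.imageName + " " + parsed.totalTasks);
    }
}
```

Failing fast on a bad argument list avoids registering a framework with Mesos that can never launch anything useful.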
The framework should output logging like this:
Running ‘docker ps’ should confirm that the containers have been launched:
1. Write a Mesos Framework
2. Use the new Docker support in Mesos
Image by Damien Gabrielson