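Kubernetes is an open source project that:
1. manages a cluster of containers as if it were a single system
2. manages and runs Docker containers across multiple hosts, offering co-location of containers, service discovery and replication control
Kubernetes was started by Google and is supported by Microsoft, Red Hat, IBM and Docker.
Google has been using container technology for over 10 years and starts 2 billion containers per week. This is probably the direct reason Kubernetes came about.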
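Kubernetes mainly tries to solve two problems:
1. Once we have containers, how to replicate and launch them in bulk across multiple Docker hosts, with proper load balancing.
2. How to group them: Kubernetes wraps a high level API to define how containers are grouped, allowing the definition of container pools, load balancing and so on.
Kubernetes is still at a fairly early stage.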
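A Kubernetes cluster is defined by one master and multiple minions.
The command line tool connects to the API endpoint on the master. This API endpoint manages and orchestrates all the minions, the Docker hosts that receive instructions from the master and run the containers.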
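Master: the server that provides the Kubernetes API service. Currently only one master is supported; multiple masters are not implemented yet.
Minion: each Docker host that runs the kubelet service, which receives commands from the master and manages the host to run containers.
Pod: defines a group of related containers deployed on the same minion. For example, an application's database container and its application container can be treated as one pod, so that both are deployed on the same minion.
Replication controller: defines how many pods or containers need to be running. The containers are deployed across multiple minions.
Service: defines the services and ports exposed by containers so that they can be reached from outside. A service usually maps to a port of the containers in pods running on different minions.
kubecfg: the command line client that connects to the master to administer Kubernetes.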
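Kubernetes is defined by states, not processes. If you define a pod, Kubernetes tries to ensure that it is running. If a container is killed, a new one is started. If a replication controller is defined with 3 replicas, Kubernetes keeps trying to have 3 containers running.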
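The state-based behaviour can be illustrated with a toy reconciliation pass. This is only a sketch of the idea, not Kubernetes code: the reconcile function, the container ids and the in-memory set are all hypothetical.

```python
import itertools

_ids = itertools.count(1)  # hypothetical source of fresh container ids


def reconcile(running, desired_replicas):
    """Return the set of running containers after one control pass.

    Missing replicas are started and surplus ones are stopped, so the
    result always has exactly `desired_replicas` members.
    """
    running = set(running)
    while len(running) < desired_replicas:
        running.add(f"container-{next(_ids)}")  # start a replacement
    while len(running) > desired_replicas:
        running.remove(max(running))            # stop a surplus container
    return running


state = reconcile(set(), desired_replicas=3)    # initial creation
state.discard(next(iter(state)))                # simulate a killed container
state = reconcile(state, desired_replicas=3)    # the next pass restores 3
print(len(state))
```

The caller only ever states the desired number; the same function handles first start, recovery after a kill and scaling down.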
Let us take a Jenkins CI server as an example, in a typical master and slaves setup to distribute the jobs.
Jenkins is configured with the Jenkins swarm plugin to run a Jenkins master and multiple Jenkins slaves, all of them running as Docker containers across multiple hosts. The swarm slaves connect to the Jenkins master on startup and become available to run Jenkins jobs. The configuration files used in the example are available in GitHub, and the Docker images are available as csanchez/jenkins-swarm for the Jenkins master, which extends the official Jenkins image with the swarm plugin, and csanchez/jenkins-swarm-slave for each of the slaves, which just runs the slave service on a JVM container.
Creating a Kubernetes cluster
Kubernetes provides scripts to create a cluster with several
operating systems and cloud/virtual providers: Vagrant (useful for local
testing), Google Compute Engine, Azure, Rackspace, etc.
The examples will use a local cluster running on Vagrant, using Fedora as OS, as detailed in the getting started instructions, and have been tested on Kubernetes 0.5.4. Instead of the default three minions (Docker hosts) we are going to run just two, which is enough to show the Kubernetes capabilities without requiring a more powerful machine.
Once you have downloaded Kubernetes and extracted it, the examples can be run from that directory. In order to create the cluster from scratch the only command needed is ./cluster/kube-up.sh.
$ export KUBERNETES_PROVIDER=vagrant
$ export KUBERNETES_NUM_MINIONS=2
$ ./cluster/kube-up.sh
Get the example configuration files:
$ git clone https://github.com/carlossg/kubernetes-jenkins.git
The cluster creation will take a while depending on machine power and
internet bandwidth, but should eventually finish without errors, and it
only needs to be run once.
Command line tool
The command line tool to interact with Kubernetes is called kubecfg, with a convenience script in cluster/kubecfg.sh.
In order to check that our cluster is up and running with two
minions, just run the kubecfg list minions command and it should display
the two virtual machines in the Vagrant configuration.
$ ./cluster/kubecfg.sh list minions
Minion identifier
----------
10.245.2.2
10.245.2.3
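When this check is scripted, the listing can be parsed and compared with the expected count. The sketch below assumes the output layout shown above (one header line, one dashed separator, then one identifier per line); parse_minions is a hypothetical helper, not part of kubecfg.

```python
def parse_minions(output):
    """Extract minion identifiers from `kubecfg list minions` style output."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    # Skip the header line, then drop any separator made only of dashes.
    return [line for line in lines[1:] if not set(line) <= {"-"}]


sample = """Minion identifier
----------
10.245.2.2
10.245.2.3
"""

minions = parse_minions(sample)
expected = 2  # the value exported as KUBERNETES_NUM_MINIONS
print(minions, len(minions) == expected)
```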
Pods
The Jenkins master server is defined as a pod in Kubernetes terminology. Multiple containers can be specified in a pod and are deployed on the same Docker host, with the advantage that containers in a pod can share resources, such as storage volumes, and use the same network namespace and IP. Volumes are by default empty directories, type emptyDir, that live for the lifespan of the pod, not the specific container, so if the container fails the stored data lives on. Another volume type is hostDir, which mounts a directory from the host server into the container.
In this Jenkins specific example we could have a pod with two
containers, the Jenkins server and, for instance, a MySQL container to
use as database, although we will only focus on a standalone Jenkins
master container.
In order to create a Jenkins pod we run kubecfg with the Jenkins container pod definition, using Docker image csanchez/jenkins-swarm, ports 8080 and 50000 mapped to the container in order to have access to the Jenkins web UI and the slave API, and a volume mounted in /var/jenkins_home. You can find the example code in GitHub as well.
The Jenkins web UI pod (pod.json) is defined as follows:
{
  "id": "jenkins",
  "kind": "Pod",
  "apiVersion": "v1beta1",
  "desiredState": {
    "manifest": {
      "version": "v1beta1",
      "id": "jenkins",
      "containers": [
        {
          "name": "jenkins",
          "image": "csanchez/jenkins-swarm:1.565.3.3",
          "ports": [
            {
              "containerPort": 8080,
              "hostPort": 8080
            },
            {
              "containerPort": 50000,
              "hostPort": 50000
            }
          ],
          "volumeMounts": [
            {
              "name": "jenkins-data",
              "mountPath": "/var/jenkins_home"
            }
          ]
        }
      ],
      "volumes": [
        {
          "name": "jenkins-data",
          "source": {
            "emptyDir": {}
          }
        }
      ]
    }
  },
  "labels": {
    "name": "jenkins"
  }
}
And create it with:
$ ./cluster/kubecfg.sh -c kubernetes-jenkins/pod.json create pods
Name Image(s) Host Labels Status
---------- ---------- ---------- ---------- ----------
jenkins csanchez/jenkins-swarm:1.565.3.3
After some time, depending on your internet connection, as it has to
download the Docker image to the minion, we can check its status and
which minion it was started on.
$ ./cluster/kubecfg.sh list pods
Name Image(s) Host Labels Status
---------- ---------- ---------- ---------- ----------
jenkins csanchez/jenkins-swarm:1.565.3.3 10.0.29.247/10.0.29.247 name=jenkins Running
If we ssh into the minion that the pod was assigned to, minion-1 or
minion-2, we can see how Docker started the container defined, amongst
other containers used by Kubernetes for internal management
(kubernetes/pause and google/cadvisor).
$ vagrant ssh minion-2 -c "docker ps"
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
7f6825a80c8a google/cadvisor:0.6.2 "/usr/bin/cadvisor" 3 minutes ago Up 3 minutes k8s_cadvisor.b0dae998_cadvisormanifes12uqn2ohido76855gdecd9roadm7l0.default.file_cadvisormanifes12uqn2ohido76855gdecd9roadm7l0_28df406a
5c02249c0b3c csanchez/jenkins-swarm:1.565.3.3 "/usr/local/bin/jenk 3 minutes ago Up 3 minutes k8s_jenkins.f87be3b0_jenkins.default.etcd_901e8027-759b-11e4-bfd0-0800279696e1_bf8db75a
ce51fda15f55 kubernetes/pause:go "/pause" 10 minutes ago Up 10 minutes k8s_net.dbcb7509_0d38f5b2-759c-11e4-bfd0-0800279696e1.default.etcd_0d38fa52-759c-11e4-bfd0-0800279696e1_e4e3a40f
e6f00165d7d3 kubernetes/pause:go "/pause" 13 minutes ago Up 13 minutes 0.0.0.0:8080->8080/tcp, 0.0.0.0:50000->50000/tcp k8s_net.9eb4a781_jenkins.default.etcd_901e8027-759b-11e4-bfd0-0800279696e1_7bd4d24e
7129fa5dccab kubernetes/pause:go "/pause" 13 minutes ago Up 13 minutes 0.0.0.0:4194->8080/tcp k8s_net.a0f18f6e_cadvisormanifes12uqn2ohido76855gdecd9roadm7l0.default.file_cadvisormanifes12uqn2ohido76855gdecd9roadm7l0_659a7a52
And, once we know the container id, we can check the container logs with:
$ vagrant ssh minion-1 -c "docker logs cec3eab3f4d3"
We should also see the Jenkins web UI at http://10.245.2.2:8080/ or
http://10.0.29.247:8080/, depending on what minion it was started in.
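A pod definition like the one above has two easy mistakes to make: a volumeMounts entry naming a volume that is not declared, and the same hostPort mapped twice. The following sketch checks a manifest for both before it is submitted. check_manifest is a hypothetical helper written for this article's JSON layout; it is not the validation Kubernetes itself performs.

```python
def check_manifest(manifest):
    """Return a list of problems found in a pod manifest dictionary."""
    problems = []
    declared = {volume["name"] for volume in manifest.get("volumes", [])}
    host_ports = []
    for container in manifest.get("containers", []):
        for mount in container.get("volumeMounts", []):
            if mount["name"] not in declared:
                problems.append(
                    f"{container['name']}: unknown volume {mount['name']}")
        for port in container.get("ports", []):
            if "hostPort" in port:
                host_ports.append(port["hostPort"])
    # A host port can only be bound once on a given Docker host.
    for port in sorted({p for p in host_ports if host_ports.count(p) > 1}):
        problems.append(f"hostPort {port} is mapped more than once")
    return problems


# The manifest section of pod.json, written as a Python dictionary.
manifest = {
    "version": "v1beta1",
    "id": "jenkins",
    "containers": [{
        "name": "jenkins",
        "image": "csanchez/jenkins-swarm:1.565.3.3",
        "ports": [{"containerPort": 8080, "hostPort": 8080},
                  {"containerPort": 50000, "hostPort": 50000}],
        "volumeMounts": [{"name": "jenkins-data",
                          "mountPath": "/var/jenkins_home"}],
    }],
    "volumes": [{"name": "jenkins-data", "source": {"emptyDir": {}}}],
}

print(check_manifest(manifest))
```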
Service discovery
Kubernetes allows defining services, a way for containers to use
discovery and proxy requests to the appropriate minion. With this
definition in service-http.json we are creating a service with id
jenkins pointing to the pod with the label name=jenkins, as declared in
the pod definition, and forwarding the port 8888 to the container's
8080.
{
  "id": "jenkins",
  "kind": "Service",
  "apiVersion": "v1beta1",
  "port": 8888,
  "containerPort": 8080,
  "selector": {
    "name": "jenkins"
  }
}
Creating the service with kubecfg:
$ ./cluster/kubecfg.sh -c kubernetes-jenkins/service-http.json create services
Name Labels Selector IP Port
---------- ---------- ---------- ---------- ----------
jenkins name=jenkins 10.0.29.247 8888
Each service is assigned a unique IP address tied to the lifespan of
the Service. If we had multiple pods matching the service definition the
service would load balance the traffic across all of them.
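The selector logic can be pictured as a label match followed by a choice among the matching pods. The sketch below is a toy model only: the second master pod is invented for illustration, and round robin is an assumed balancing policy, since the text does not say which one Kubernetes applies.

```python
import itertools


def matches(selector, labels):
    """A pod matches when every selector key has the same value in its labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def backends(selector, pods):
    """Hosts of all pods selected by the service."""
    return [pod["host"] for pod in pods if matches(selector, pod["labels"])]


pods = [
    {"id": "jenkins", "host": "10.245.2.3", "labels": {"name": "jenkins"}},
    {"id": "slave-1", "host": "10.245.2.2", "labels": {"name": "jenkins-slave"}},
    # Hypothetical second pod carrying the same label as the master.
    {"id": "jenkins-2", "host": "10.245.2.2", "labels": {"name": "jenkins"}},
]

targets = backends({"name": "jenkins"}, pods)
round_robin = itertools.cycle(targets)            # assumed policy
picks = [next(round_robin) for _ in range(4)]
print(targets, picks)
```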
Another feature of services is that a number of environment variables are made available to any containers subsequently run by Kubernetes, providing the ability to connect to the service container, in a similar way to running linked Docker containers. This will prove useful for finding the Jenkins master server from any of the slaves.
JENKINS_PORT='tcp://10.0.29.247:8888'
JENKINS_PORT_8080_TCP='tcp://10.0.29.247:8888'
JENKINS_PORT_8080_TCP_ADDR='10.0.29.247'
JENKINS_PORT_8080_TCP_PORT='8888'
JENKINS_PORT_8080_TCP_PROTO='tcp'
JENKINS_SERVICE_PORT='8888'
SERVICE_HOST='10.0.29.247'
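The per-service variable names used later in the slave command, such as JENKINS_SERVICE_HOST and JENKINS_SLAVE_SERVICE_PORT, appear to follow from the service id by upper-casing it and replacing dashes with underscores. The sketch below applies that rule as inferred from the names in this article; service_env is a hypothetical helper and only produces the two variables that the slave command relies on.

```python
def service_env(service_id, host, port):
    """Derive the <ID>_SERVICE_HOST and <ID>_SERVICE_PORT variables."""
    prefix = service_id.upper().replace("-", "_")
    return {
        f"{prefix}_SERVICE_HOST": host,
        f"{prefix}_SERVICE_PORT": str(port),
    }


env = {}
env.update(service_env("jenkins", "10.0.29.247", 8888))
env.update(service_env("jenkins-slave", "10.0.86.28", 50000))
for name in sorted(env):
    print(f"{name}={env[name]}")
```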
Another tweak we need to make is to open port 50000, needed by the
Jenkins swarm plugin. This can be achieved by creating another service,
service-slave.json, so that Kubernetes forwards traffic on that port to
the Jenkins server container.
{
  "id": "jenkins-slave",
  "kind": "Service",
  "apiVersion": "v1beta1",
  "port": 50000,
  "containerPort": 50000,
  "selector": {
    "name": "jenkins"
  }
}
The service is created with kubecfg again.
$ ./cluster/kubecfg.sh -c kubernetes-jenkins/service-slave.json create services
Name Labels Selector IP Port
---------- ---------- ---------- ---------- ----------
jenkins-slave name=jenkins 10.0.86.28 50000
And all the defined services are available now, including some Kubernetes internal ones:
$ ./cluster/kubecfg.sh list services
Name Labels Selector IP Port
---------- ---------- ---------- ---------- ----------
kubernetes-ro component=apiserver,provider=kubernetes 10.0.22.155 80
kubernetes component=apiserver,provider=kubernetes 10.0.72.49 443
jenkins name=jenkins 10.0.29.247 8888
jenkins-slave name=jenkins 10.0.86.28 50000
Replication controllers
Replication controllers allow running multiple pods in multiple
minions. Jenkins slaves can be run this way to ensure there is always a
pool of slaves ready to run Jenkins jobs.
In a replication.json definition:
{
  "id": "jenkins-slave",
  "apiVersion": "v1beta1",
  "kind": "ReplicationController",
  "desiredState": {
    "replicas": 1,
    "replicaSelector": {
      "name": "jenkins-slave"
    },
    "podTemplate": {
      "desiredState": {
        "manifest": {
          "version": "v1beta1",
          "id": "jenkins-slave",
          "containers": [
            {
              "name": "jenkins-slave",
              "image": "csanchez/jenkins-swarm-slave:1.21",
              "command": [
                "sh", "-c", "/usr/local/bin/jenkins-slave.sh -master http://$JENKINS_SERVICE_HOST:$JENKINS_SERVICE_PORT -tunnel $JENKINS_SLAVE_SERVICE_HOST:$JENKINS_SLAVE_SERVICE_PORT -username jenkins -password jenkins -executors 1"
              ]
            }
          ]
        }
      },
      "labels": {
        "name": "jenkins-slave"
      }
    }
  },
  "labels": {
    "name": "jenkins-slave"
  }
}
The podTemplate section allows the same configuration options as a
pod definition. In this case we want to make the Jenkins slave connect
automatically to our Jenkins master, instead of relying on Jenkins
multicast discovery. To do so we execute the jenkins-slave.sh command
with -master parameter to point the slave to the Jenkins master running
in Kubernetes. Note that we use the Kubernetes provided environment
variables for the Jenkins service definition (JENKINS_SERVICE_HOST and
JENKINS_SERVICE_PORT). The image command is overridden to configure the
container this way, useful to reuse existing images while taking
advantage of the service environment variables. It can be done in pod
definitions too.
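To see what the slave ends up executing, the variable expansion that sh -c performs can be imitated with Python's string.Template. The command text is taken from replication.json; the values in env are assumptions based on the service IPs and ports listed earlier, so the real values depend on the cluster.

```python
from string import Template

# The command string from replication.json, split only for readability.
COMMAND = (
    "/usr/local/bin/jenkins-slave.sh "
    "-master http://$JENKINS_SERVICE_HOST:$JENKINS_SERVICE_PORT "
    "-tunnel $JENKINS_SLAVE_SERVICE_HOST:$JENKINS_SLAVE_SERVICE_PORT "
    "-username jenkins -password jenkins -executors 1"
)

# Assumed values, matching the jenkins and jenkins-slave services above.
env = {
    "JENKINS_SERVICE_HOST": "10.0.29.247",
    "JENKINS_SERVICE_PORT": "8888",
    "JENKINS_SLAVE_SERVICE_HOST": "10.0.86.28",
    "JENKINS_SLAVE_SERVICE_PORT": "50000",
}

# substitute() raises KeyError if a variable is missing, which makes a
# forgotten service definition visible immediately.
expanded = Template(COMMAND).substitute(env)
print(expanded)
```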
Create the replicas with kubecfg:
$ ./cluster/kubecfg.sh -c kubernetes-jenkins/replication.json create replicationControllers
Name Image(s) Selector Replicas
---------- ---------- ---------- ----------
jenkins-slave csanchez/jenkins-swarm-slave:1.21 name=jenkins-slave 1
Listing the pods now would show new ones being created, up to the number of replicas defined in the replication controller.
$ ./cluster/kubecfg.sh list pods
Name Image(s) Host Labels Status
---------- ---------- ---------- ---------- ----------
jenkins csanchez/jenkins-swarm:1.565.3.3 10.245.2.3/10.245.2.3 name=jenkins Running
07651754-4f88-11e4-b01e-0800279696e1 csanchez/jenkins-swarm-slave:1.21 10.245.2.2/10.245.2.2 name=jenkins-slave Pending
The first time the jenkins-swarm-slave image is run the minion has to
download it from the Docker repository, but after a while, depending on
your internet connection, the slaves should automatically connect to the
Jenkins server. On the server where the slave was started,
docker ps should show the container running, and docker logs is useful to
debug any problems on container startup.
$ vagrant ssh minion-1 -c "docker ps"
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
870665d50f68 csanchez/jenkins-swarm-slave:1.21 "/usr/local/bin/jenk About a minute ago Up About a minute k8s_jenkins-slave.74f1dda1_07651754-4f88-11e4-b01e-0800279696e1.default.etcd_11cac207-759f-11e4-bfd0-0800279696e1_9495d10e
cc44aa8743f0 kubernetes/pause:go "/pause" About a minute ago Up About a minute k8s_net.dbcb7509_07651754-4f88-11e4-b01e-0800279696e1.default.etcd_11cac207-759f-11e4-bfd0-0800279696e1_4bf086ee
edff0e535a84 google/cadvisor:0.6.2 "/usr/bin/cadvisor" 27 minutes ago Up 27 minutes k8s_cadvisor.b0dae998_cadvisormanifes12uqn2ohido76855gdecd9roadm7l0.default.file_cadvisormanifes12uqn2ohido76855gdecd9roadm7l0_588941b0
b7e23a7b68d0 kubernetes/pause:go "/pause" 27 minutes ago Up 27 minutes 0.0.0.0:4194->8080/tcp k8s_net.a0f18f6e_cadvisormanifes12uqn2ohido76855gdecd9roadm7l0.default.file_cadvisormanifes12uqn2ohido76855gdecd9roadm7l0_57a2b4de
The replication controller can automatically be resized to any number of desired replicas:
$ ./cluster/kubecfg.sh resize jenkins-slave 2
And again the pods are updated to show where each replica is running.
$ ./cluster/kubecfg.sh list pods
Name Image(s) Host Labels Status
---------- ---------- ---------- ---------- ----------
07651754-4f88-11e4-b01e-0800279696e1 csanchez/jenkins-swarm-slave:1.21 10.245.2.2/10.245.2.2 name=jenkins-slave Running
a22e0d59-4f88-11e4-b01e-0800279696e1 csanchez/jenkins-swarm-slave:1.21 10.245.2.3/10.245.2.3 name=jenkins-slave Pending
jenkins csanchez/jenkins-swarm:1.565.3.3 10.245.2.3/10.245.2.3 name=jenkins Running
Scheduling
Right now the default scheduler is random, but resource based scheduling will be implemented soon. At the time of writing there are several issues opened to add scheduling based on memory and CPU usage. There is also work in progress on an Apache Mesos based scheduler. Apache Mesos is a framework for distributed systems providing APIs for resource management and scheduling across entire datacenter and cloud environments.
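The difference between the two approaches can be sketched in a few lines. Both functions below are hypothetical illustrations, not Kubernetes code, and the free memory figures are invented.

```python
import random


def schedule_random(minions, rng=random):
    """Pick any minion, ignoring how loaded it is."""
    return rng.choice(sorted(minions))


def schedule_by_free_memory(minions):
    """Pick the minion reporting the most free memory (in MB)."""
    return max(sorted(minions), key=lambda host: minions[host])


# Hypothetical free memory per minion, in MB.
free_mb = {"10.245.2.2": 512, "10.245.2.3": 1024}

print(schedule_random(free_mb, random.Random(0)))
print(schedule_by_free_memory(free_mb))
```

With random placement a pod may land on the busier host; the resource based variant always prefers the host with more headroom.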
Self healing
One of the benefits of using Kubernetes is the automated management and recovery of containers.
If the container running the Jenkins server dies for any reason, for
instance because the process being run crashes, Kubernetes will notice
and will create a new container after a few seconds.
$ vagrant ssh minion-2 -c 'docker kill `docker ps | grep csanchez/jenkins-swarm: | sed -e "s/ .*//"`'
51ba3687f4ee
$ ./cluster/kubecfg.sh list pods
Name Image(s) Host Labels Status
---------- ---------- ---------- ---------- ----------
jenkins csanchez/jenkins-swarm:1.565.3.3 10.245.2.3/10.245.2.3 name=jenkins Failed
07651754-4f88-11e4-b01e-0800279696e1 csanchez/jenkins-swarm-slave:1.21 10.245.2.2/10.245.2.2 name=jenkins-slave Running
a22e0d59-4f88-11e4-b01e-0800279696e1 csanchez/jenkins-swarm-slave:1.21 10.245.2.3/10.245.2.3 name=jenkins-slave Running
And some time later, typically no more than a minute...
Name Image(s) Host Labels Status
---------- ---------- ---------- ---------- ----------
jenkins csanchez/jenkins-swarm:1.565.3.3 10.245.2.3/10.245.2.3 name=jenkins Running
07651754-4f88-11e4-b01e-0800279696e1 csanchez/jenkins-swarm-slave:1.21 10.245.2.2/10.245.2.2 name=jenkins-slave Running
a22e0d59-4f88-11e4-b01e-0800279696e1 csanchez/jenkins-swarm-slave:1.21 10.245.2.3/10.245.2.3 name=jenkins-slave Running
By keeping the Jenkins data dir in a volume we guarantee that the data
is kept even after the container dies, so we do not lose any Jenkins
jobs or data created. And because Kubernetes is proxying the services in
each minion, the slaves will reconnect to the new Jenkins server
automagically no matter where they run! Exactly the same will happen
if any of the slave containers dies: the system will automatically
create a new container and, thanks to the service discovery, it will
automatically join the Jenkins server pool.
If something more drastic happens, like a minion dying, Kubernetes
does not yet offer the ability to reschedule the containers on the other
existing minions; it just shows the pods as Failed.
$ vagrant halt minion-2
==> minion-2: Attempting graceful shutdown of VM...
$ ./cluster/kubecfg.sh list pods
Name Image(s) Host Labels Status
---------- ---------- ---------- ---------- ----------
jenkins csanchez/jenkins-swarm:1.565.3.3 10.245.2.3/10.245.2.3 name=jenkins Failed
07651754-4f88-11e4-b01e-0800279696e1 csanchez/jenkins-swarm-slave:1.21 10.245.2.2/10.245.2.2 name=jenkins-slave Running
a22e0d59-4f88-11e4-b01e-0800279696e1 csanchez/jenkins-swarm-slave:1.21 10.245.2.3/10.245.2.3 name=jenkins-slave Failed
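The listing above follows a simple rule in this version: a pod stays bound to its minion, so its status depends only on whether that minion is still alive. The toy model below reproduces the listing under that assumption; pod_status is a hypothetical helper and the pod names are shortened.

```python
def pod_status(pods, live_minions):
    """Map each pod name to Running or Failed, with no rescheduling."""
    return {
        name: "Running" if host in live_minions else "Failed"
        for name, host in pods.items()
    }


pods = {
    "jenkins": "10.245.2.3",
    "07651754": "10.245.2.2",   # first slave replica
    "a22e0d59": "10.245.2.3",   # second slave replica
}

# minion-2 (10.245.2.3) has been halted; only 10.245.2.2 remains.
status = pod_status(pods, live_minions={"10.245.2.2"})
print(status)
```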
Tearing down
kubecfg offers several commands to stop and delete the replication controller, pod and service definitions.
To stop the replication controller, which sets the number of replicas to
0 and terminates all the Jenkins slave containers:
$ ./cluster/kubecfg.sh stop jenkins-slave
To delete it:
$ ./cluster/kubecfg.sh rm jenkins-slave
To delete the jenkins server pod, causing the termination of the Jenkins master container:
$ ./cluster/kubecfg.sh delete pods/jenkins
To delete the services:
$ ./cluster/kubecfg.sh delete services/jenkins
$ ./cluster/kubecfg.sh delete services/jenkins-slave
Conclusion
Kubernetes is still a very young project, but highly promising to
manage Docker deployments across multiple servers and simplify the
execution of long running and distributed Docker containers. By
abstracting infrastructure concepts and working on states instead of
processes, it provides easy definition of clusters, including self
healing capabilities out of the box. In short, Kubernetes makes
management of Docker fleets easier.
About the Author
Carlos Sanchez has been working on automation and quality of software development, QA and operations processes for over 10 years, from build tools and continuous integration to deployment automation, DevOps best practices and continuous delivery. He has delivered solutions to Fortune 500 companies, working at several US based startups, most recently MaestroDev, a company he cofounded. Carlos has been a speaker at several conferences around the world, including JavaOne, EclipseCON, ApacheCON, JavaZone, Fosdem and PuppetConf. Very involved in open source, he is a member of the Apache Software Foundation amongst other open source groups, contributing to several projects, such as Apache Maven, Fog and Puppet.