Docker如何完成它需要完成的东西?
两个词:cgroups和union文件系统。Docker使用cgroup来提供容器隔离,而union文件系统用于保存镜像并使容器变得短暂。
Cgroups
这是Linux内核功能,它让两件事情变成可能:
- 限制Linux进程组的资源占用(内存、CPU)
- 为进程组制作 PID、UTS、IPC、网络、用户及装载命名空间
这里的关键词是命名空间。比如说,一个PID命名空间允许它里面的进程使用隔离的PID,并与主PID命名空间独立开来,因此你可以在一个PID命名空间里拥有自己的PID为1的初始化进程。其他命名空间与此类似。然后你可以使用cgroup创建一个环境,进程可以在其中运行,并与操作系统的其他进程隔离开,但这里的关键点是这个环境上的进程使用的是已经加载和运行的内核,因此额外开销与运行其他进程几乎是一样的。Chroot之于cgroup就好像我之于绿巨人(The Hulk)、贝恩(Bane)和毒液(Venom)的组合(译者注:本文作者非常瘦弱,后三者都非常强壮)。
Union文件系统
Union文件系统允许通过union装载变化的分层叠加。在union文件系统里,文件系统可以被装载在其他文件系统之上,其结果就是一个变化的分层集合。每个装载的文件系统表示前一个文件系统之后的变化集合,就像是一个diff。
当你下载一个镜像,修改它,然后保存成新版本,你只是创建了加载在包裹基础镜像的初始层上的一个新的union文件系统。这使得Docker镜像非常轻,比如:你的DB、Nginx和Syslog镜像都可以共享同一个Ubuntu基础,每一个镜像保存的只是在基础之上工作需要的变化。
截至2015年1月4日,Docker允许在union文件系统中使用aufs、btrfs或设备映射(device mapper)。
镜像
我们来看一下postgresql的一个镜像:
[{"AppArmorProfile": "","Args": [ "postgres"],"Config": { "AttachStderr": true, "AttachStdin": false, "AttachStdout": true, "Cmd": [ "postgres" ], "CpuShares": 0, "Cpuset": "", "Domainname": "", "Entrypoint": [ "/docker-entrypoint.sh" ], "Env": [ "PATH=/usr/lib/postgresql/9.3/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin", "LANG=en_US.utf8", "PG_MAJOR=9.3", "PG_VERSION=9.3.5-1.pgdg70 1", "PGDATA=/var/lib/postgresql/data" ], "ExposedPorts": { "5432/tcp": {} }, "Hostname": "6334a2022f21", "Image": "postgres", "MacAddress": "", "Memory": 0, "MemorySwap": 0, "NetworkDisabled": false, "OnBuild": null, "OpenStdin": false, "PortSpecs": null, "StdinOnce": false, "Tty": false, "User": "", "Volumes": { "/var/lib/postgresql/data": {} }, "WorkingDir": ""},"Created": "2015-01-03T23:56:12.354896658Z","Driver": "devicemapper","ExecDriver": "native-0.2","HostConfig": { "Binds": null, "CapAdd": null, "CapDrop": null, "ContainerIDFile": "", "Devices": null, "Dns": null, "DnsSearch": null, "ExtraHosts": null, "IpcMode": "", "Links": null, "LxcConf": null, "NetworkMode": "", "PortBindings": null, "Privileged": false, "PublishAllPorts": false, "RestartPolicy": { "MaximumRetryCount": 0, "Name": "" }, "SecurityOpt": null, "VolumesFrom": [ "bestwebappever.dev.db-data" ]},"HostnamePath": "/mnt/docker/containers/6334a2022f213f9534b45df33c64437081a38d50c7f462692b019185b8cbc6da/hostname","HostsPath": "/mnt/docker/containers/6334a2022f213f9534b45df33c64437081a38d50c7f462692b019185b8cbc6da/hosts","Id": "6334a2022f213f9534b45df33c64437081a38d50c7f462692b019185b8cbc6da","Image": "aaab661c1e3e8da2d9fc6872986cbd7b9ec835dcd3886d37722f1133baa3d2db","MountLabel": "","Name": "/bestwebappever.dev.db","NetworkSettings": { "Bridge": "docker0", "Gateway": "172.17.42.1", "IPAddress": "172.17.0.176", "IPPrefixLen": 16, "MacAddress": "02:42:ac:11:00:b0", "PortMapping": null, "Ports": { "5432/tcp": null }},"Path": "/docker-entrypoint.sh","ProcessLabel": "","ResolvConfPath": "/mnt/docker/containers/6334a2022f213f9534b45df33c64437081a38d50c7f462692b019185b8cbc6da/resolv.conf","State": { "Error": "", "ExitCode": 0, "FinishedAt": "0001-01-01T00:00:00Z", "OOMKilled": false, "Paused": false, "Pid": 21654, "Restarting": false, "Running": true, "StartedAt": "2015-01-03T23:56:42.003405983Z"},"Volumes": { "/var/lib/postgresql/data": "/mnt/docker/vfs/dir/5ac73c52ca86600a82e61279346dac0cb3e173b067ba9b219ea044023ca67561", "postgresql_data": "/mnt/docker/vfs/dir/abace588b890e9f4adb604f633c280b9b5bed7d20285aac9cc81a84a2f556034"},"VolumesRW": { "/var/lib/postgresql/data": true, "postgresql_data": true}}]
就是这样,镜像只是一个json,它指定了从该镜像运行的容器的特性,union装载点保存在哪里,要暴露什么端口等等。每个镜像与一个union文件系统相关联,每个Docker上的union文件系统都有一个上层,就像是计算机科技树(不像其他树有一大堆的家族)。如果它看起来有点吓人或有些东西串不起来,不要担心,这只是出于教学目的,你并不需要直接处理这些文件。
容器
容器之所以是短暂的,是因为当你从镜像上创建一个容器,Docker会创建一个空白的union文件系统加载在与该镜像关联的union文件系统之上。
由于union文件系统是空白的,这意味着没有变化会被应用到镜像的文件系统上,你创建的变化会得到体现,但是当容器停止,该容器的union文件系统会被丢弃,留下的是你启动时的原始镜像文件系统。除非你创建一个新的镜像,或制作一个卷,你所做的变化在容器停止时都会消失。
卷所做的是在容器内指定一个目录,以便在union文件系统之外保存它。
这是一个bestwebappever的容器:
[{"AppArmorProfile": "","Args": [],"Config": { "AttachStderr": true, "AttachStdin": false, "AttachStdout": true, "Cmd": [ "/sbin/my_init" ], "CpuShares": 0, "Cpuset": "", "Domainname": "", "Entrypoint": null, "Env": [ "DJANGO_CONFIGURATION=Local", "HOME=/root", "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin", "TALPOR_ENVIRONMENT=local", "TALPOR_DIR=/opt/bestwebappever" ], "ExposedPorts": { "80/tcp": {} }, "Hostname": "44a87fdaf870", "Image": "talpor/bestwebappever:dev", "MacAddress": "", "Memory": 0, "MemorySwap": 0, "NetworkDisabled": false, "OnBuild": null, "OpenStdin": false, "PortSpecs": null, "StdinOnce": false, "Tty": false, "User": "", "Volumes": { "/opt/bestwebappever": {} }, "WorkingDir": "/opt/bestwebappever"},"Created": "2015-01-03T23:56:15.378511619Z","Driver": "devicemapper","ExecDriver": "native-0.2","HostConfig": { "Binds": [ "/home/german/bestwebappever/:/opt/bestwebappever:rw" ], "CapAdd": null, "CapDrop": null, "ContainerIDFile": "", "Devices": null, "Dns": null, "DnsSearch": null, "ExtraHosts": null, "IpcMode": "", "Links": [ "/bestwebappever.dev.db:/bestwebappever.dev.app/db", "/bestwebappever.dev.redis:/bestwebappever.dev.app/redis" ], "LxcConf": null, "NetworkMode": "", "PortBindings": { "80/tcp": [ { "HostIp": "", "HostPort": "8887" } ] }, "Privileged": false, "PublishAllPorts": false, "RestartPolicy": { "MaximumRetryCount": 0, "Name": "" }, "SecurityOpt": null, "VolumesFrom": [ "bestwebappever.dev.requirements-data" ]},"HostnamePath": "/mnt/docker/containers/44a87fdaf870281e86160e9e844b8987cfefd771448887675fed99460de491c4/hostname","HostsPath": "/mnt/docker/containers/44a87fdaf870281e86160e9e844b8987cfefd771448887675fed99460de491c4/hosts","Id": "44a87fdaf870281e86160e9e844b8987cfefd771448887675fed99460de491c4","Image": "b84804fac17b61fe8f344359285186f1a63cd8c0017930897a078cd09d61bb60","MountLabel": "","Name": "/bestwebappever.dev.app","NetworkSettings": { "Bridge": "docker0", "Gateway": "172.17.42.1", "IPAddress": "172.17.0.179", "IPPrefixLen": 16, "MacAddress": "02:42:ac:11:00:b3", "PortMapping": null, "Ports": { "80/tcp": [ { "HostIp": "0.0.0.0", "HostPort": "8887" } ] }},"Path": "/sbin/my_init","ProcessLabel": "","ResolvConfPath": "/mnt/docker/containers/44a87fdaf870281e86160e9e844b8987cfefd771448887675fed99460de491c4/resolv.conf","State": { "Error": "", "ExitCode": 0, "FinishedAt": "0001-01-01T00:00:00Z", "OOMKilled": false, "Paused": false, "Pid": 21796, "Restarting": false, "Running": true, "StartedAt": "2015-01-03T23:56:47.537259546Z"},"Volumes": { "/opt/bestwebappever": "/home/german/bestwebappever", "requirements_data": "/mnt/docker/vfs/dir/bc14bec26ca311d5ed9f2a83eebef872a879c9e2f1d932470e0fd853fe8be336"},"VolumesRW": { "/opt/bestwebappever": true, "requirements_data": true}}]
基本上与镜像相同,不过现在还指定了一些暴露给宿主的端口,也声明了卷位于宿主的位置,容器状态是从现在直到结束,等等。与前面一样,如果它看起来让人生畏,不要担心,你不需要直接处理这些json。