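1: Preface
Before version 1.12, Docker itself could only run on a single host, and clustering depended on solutions such as Mesos, Kubernetes, or Swarm. Swarm was Docker Inc.'s own container cluster management tool, but at the time it was less popular than the other two. With the 1.12.0 release, Docker Inc. made a strategic move and integrated Swarm into docker-engine, giving Docker a built-in clustering solution. As a result, the standing of Swarm, Docker's "own child", rose quickly: it now forms a three-way contest with Mesos and k8s among Docker clustering options, and looks well placed to overtake them.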
2: Planning
IP            Hostname  Role
10.10.32.245  swarm1    swarm manager
10.10.32.246  swarm2    worker node
10.10.32.247  swarm3    worker node
10.10.32.248  swarm4    worker node
3: How the cluster communicates
Enable management port 2375 in /usr/lib/systemd/system/docker.service:
ExecStart=/usr/bin/dockerd -s overlay --insecure-registry registry.cntv.net -H tcp://0.0.0.0:2375 -H unix:///va
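Every node opens remote management port 2375, and the swarm cluster is managed remotely through it (see section 9). Note that the join command in section 5 connects to the manager on a different port, 2377.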
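4: How swarm schedules
Scheduling in swarm is handled mainly by the scheduler module, which consists of Filter and Strategy.
Filter: filters nodes according to the resource requirements of the submitted task, picking out of all cluster nodes those that satisfy the conditions (enough resources, node healthy, and so on).
Strategy: picks the best node among the filtered ones according to a strategy (for example, comparing the candidates and choosing the node with the most resources).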
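The two stages described above (filter first, then strategy) can be sketched in a few lines of shell. This is only a toy model and not Swarm's actual scheduler: the node list, the free-memory figures, and the function name pick_node are all made up for illustration.

```shell
#!/bin/sh
# Toy filter + strategy scheduler. All node data below is hypothetical.
# Columns: hostname  state  free_memory_mb
NODES='swarm2 Ready 2048
swarm3 Ready 4096
swarm4 Down 8192'

# pick_node REQUIRED_MB
# Filter:   keep nodes that are Ready and have at least REQUIRED_MB free.
# Strategy: among the survivors, choose the one with the most free memory.
pick_node() {
    printf '%s\n' "$NODES" | awk -v need="$1" '
        $2 == "Ready" && $3 + 0 >= need + 0 {                 # filter stage
            if ($3 + 0 > best + 0) { best = $3; name = $1 }   # strategy stage
        }
        END { if (name != "") print name; else exit 1 }'
}

pick_node 1024   # swarm3: Ready and has the most free memory
pick_node 3000   # swarm3: the only Ready node with at least 3000 MB free
pick_node 5000 || echo "no node satisfies the request"
```

swarm4 has the most memory in this made-up list but is never chosen, because the filter stage removes it before the strategy stage compares anything.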
5: Creating the swarm cluster
(1) Initialize the manager node
10.10.32.245
$ docker swarm init --advertise-addr 10.10.32.245
Swarm initialized: current node (b73yii7s7rn321ejz8n3ch7ay) is now a manager.

To add a worker to this swarm, run the following command:
    docker swarm join \
    --token SWMTKN-1-2z3obrzww2zcnwaiq5dzocpmvz9kty25usroy3gh3xux0l32uo-8dxurdbms8yli2crdxbmyk7y2 \
    10.10.32.245:2377

To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
(2) Check the node's swarm status on the manager
10.10.32.245
$ docker info
Swarm: active
 NodeID: ax13h89zlb5dqw980ddo1ox4f
 Is Manager: true
 ClusterID: d5nam0mink4wdk506o1ybbs9e
 Managers: 1
 Nodes: 1
(3) List the swarm nodes on the manager
10.10.32.245
$ docker node ls
ID                           HOSTNAME  STATUS  AVAILABILITY  MANAGER STATUS
ax13h89zlb5dqw980ddo1ox4f *  swarm1    Ready   Active        Leader
(4) Join the worker nodes to the swarm cluster
10.10.32.246 10.10.32.247 10.10.32.248
$ docker swarm join-token worker    # on the manager: look up the join command and token
To add a worker to this swarm, run the following command:
    docker swarm join \
    --token SWMTKN-1-2z3obrzww2zcnwaiq5dzocpmvz9kty25usroy3gh3xux0l32uo-8dxurdbms8yli2crdxbmyk7y2 \
    10.10.32.245:2377

$ docker swarm join \
    --token SWMTKN-1-2z3obrzww2zcnwaiq5dzocpmvz9kty25usroy3gh3xux0l32uo-8dxurdbms8yli2crdxbmyk7y2 \
    10.10.32.245:2377
This node joined a swarm as a worker
(5) List the swarm nodes on the manager
$ docker node ls
ID                           HOSTNAME  STATUS  AVAILABILITY  MANAGER STATUS
03b5w49n7uz663xwjuof3dsk0    swarm4    Ready   Active
0g039jomhpad819a39yvkow6y    swarm2    Ready   Active
4kos85v5z6jh6lahwv8yjpku8    swarm3    Ready   Active
ax13h89zlb5dqw980ddo1ox4f *  swarm1    Ready   Active        Leader
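When the join step is scripted, it can help to confirm that every expected node shows up as Ready. The following is a minimal sketch that assumes the column layout of the listing above; the function name count_ready is hypothetical, and the sample text is copied from this section rather than produced by a live call.

```shell
#!/bin/sh
# count_ready: read `docker node ls` output on stdin and print how many
# nodes report STATUS "Ready". The leader row carries an extra "*" field
# after the ID, so the status is matched as a whole field anywhere on the row.
count_ready() {
    awk 'NR > 1 {                       # skip the header row
        for (i = 2; i <= NF; i++)
            if ($i == "Ready") { n++; break }
    }
    END { print n + 0 }'
}

# Sample taken from the listing above; on a manager you would instead run:
#   docker node ls | count_ready
count_ready <<'EOF'
ID                           HOSTNAME  STATUS  AVAILABILITY  MANAGER STATUS
03b5w49n7uz663xwjuof3dsk0    swarm4    Ready   Active
0g039jomhpad819a39yvkow6y    swarm2    Ready   Active
4kos85v5z6jh6lahwv8yjpku8    swarm3    Ready   Active
ax13h89zlb5dqw980ddo1ox4f *  swarm1    Ready   Active        Leader
EOF
```

For the four-node plan in section 2, the printed count should be 4 once all workers have joined.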
(6) Remove a node from the cluster
$ docker swarm leave
Node left the swarm.
6: Creating services on the swarm cluster
(1) Create a service
$ docker service create --network net3 --replicas 4 -p 8099:80 --name time-php registry.cntv.net/heqin/tvtime-php:v0.85dongsi
bdkwsgchcydz5ixatrbiyr1fx
(2) Inspect the service
$ docker service ls    # list services
ID            NAME      REPLICAS  IMAGE                                           COMMAND
bdkwsgchcydz  time-php  4/4       registry.cntv.net/heqin/tvtime-php:v0.85dongsi
$ docker service ps time-php    # list the tasks (instances) of the service
ID                         NAME        IMAGE                                           NODE    DESIRED STATE  CURRENT STATE          ERROR
d15ambrdd7pyi22951vi497xa  time-php.1  registry.cntv.net/heqin/tvtime-php:v0.85dongsi  swarm4  Running        Running 5 minutes ago
3wtmmggo65cmyutg7zwikb94s  time-php.2  registry.cntv.net/heqin/tvtime-php:v0.85dongsi  swarm3  Running        Running 5 minutes ago
5ulbhndwgswpicx8g62af8k5z  time-php.3  registry.cntv.net/heqin/tvtime-php:v0.85dongsi  swarm2  Running        Running 5 minutes ago
4b9rh8aiwfi88xrdgmyss9ili  time-php.4  registry.cntv.net/heqin/tvtime-php:v0.85dongsi  swarm2  Running        Running 5 minutes ago

$ docker service inspect time-php    # show detailed information about the service
[
    {
        "ID": "bdkwsgchcydz5ixatrbiyr1fx",
        "Version": {
            "Index": 3074
        },
        "CreatedAt": "2016-08-31T01:41:18.892483521Z",
        "UpdatedAt": "2016-08-31T01:41:18.909164422Z",
        "Spec": {
            "Name": "time-php",
            "TaskTemplate": {
                "ContainerSpec": {
                    "Image": "registry.cntv.net/heqin/tvtime-php:v0.85dongsi"
                },
                "Resources": {
                    "Limits": {},
                    "Reservations": {}
                },
                "RestartPolicy": {
                    "Condition": "any",
                    "MaxAttempts": 0
                },
                "Placement": {}
            },
            "Mode": {
                "Replicated": {
                    "Replicas": 4
                }
            },
            "UpdateConfig": {
                "Parallelism": 1,
                "FailureAction": "pause"
            },
            "Networks": [
                {
                    "Target": "d83qc9rgkkj1ws8kvmod4x759"
                }
            ],
            "EndpointSpec": {
                "Mode": "vip",
                "Ports": [
                    {
                        "Protocol": "tcp",
                        "TargetPort": 80,
                        "PublishedPort": 8099
                    }
                ]
            }
        },
        "Endpoint": {
            "Spec": {
                "Mode": "vip",
                "Ports": [
                    {
                        "Protocol": "tcp",
                        "TargetPort": 80,
                        "PublishedPort": 8099
                    }
                ]
            },
            "Ports": [
                {
                    "Protocol": "tcp",
                    "TargetPort": 80,
                    "PublishedPort": 8099
                }
            ],
            "VirtualIPs": [
                {
                    "NetworkID": "biu7m9hi8fgbbihfzg48whxqi",
                    "Addr": "10.255.0.19/16"
                },
                {
                    "NetworkID": "d83qc9rgkkj1ws8kvmod4x759",
                    "Addr": "10.88.0.2/24"
                }
            ]
        },
        "UpdateStatus": {
            "StartedAt": "0001-01-01T00:00:00Z",
            "CompletedAt": "0001-01-01T00:00:00Z"
        }
    }
]
(3) Scale the number of service instances
$ docker service scale time-php=10
time-php scaled to 10
$ docker service ls
ID            NAME      REPLICAS  IMAGE
bdkwsgchcydz  time-php  6/6       registry.cntv.net/heqin/tvtime-php:v0.85dongsi
(4) Update the service
$ docker service update --image registry.cntv.net/heqin/tvtime-php:v0.84xidan --log-driver=syslog time-php
time-php
$ docker service ls
ID            NAME      REPLICAS  IMAGE
bdkwsgchcydz  time-php  6/6       registry.cntv.net/heqin/tvtime-php:v0.84xidan
(5) Remove the service
$ docker service rm time-php
time-php
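7: Networking
(1) List the container networks
$ docker network ls
NETWORK ID    NAME     DRIVER   SCOPE
a17f16650bf9  bridge   bridge   local   # container gets its own network namespace and attaches to the docker0 virtual bridge (default mode)
f1c102babcf8  host     host     local   # container shares the host's network namespace and sees the same NICs as the host
b6a2efce65ef  none     null     local   # container has no NIC; suitable for containers that need no external communication
asa2hv41mtci  ingress  overlay  swarm   # the swarm cluster's overlay network; containers can communicate across hosts
Note: the first three networks are created by default when Docker is installed. The fourth, an overlay network, is created by default once swarm is started.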
(2) Inspect the overlay network
$ docker network inspect ingress
[
    {
        "Name": "ingress",
        "Id": "asa2hv41mtci2qzjkaonusnc8",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.255.0.0/16",
                    "Gateway": "10.255.0.1"
                }
            ]
        },
        "Internal": false,
        "Containers": null,
        "Options": {
            "com.docker.network.driver.overlay.vxlanid_list": "258"
        },
        "Labels": null
    }
]
(3) Create a custom overlay network
$ docker network create --driver=overlay --subnet=10.88.0.0/24 --gateway=10.88.0.1 net3
c654pb76q9jnni5bpdvd34rg4
$ docker network ls
NETWORK ID    NAME  DRIVER   SCOPE
c654pb76q9jn  net3  overlay  swarm
$ docker network inspect net3
[
    {
        "Name": "net3",
        "Id": "c654pb76q9jnni5bpdvd34rg4",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.88.0.0/24",
                    "Gateway": "10.88.0.1"
                }
            ]
        },
        "Internal": false,
        "Containers": null,
        "Options": {
            "com.docker.network.driver.overlay.vxlanid_list": "258"
        },
        "Labels": null
    }
]
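Note: after an overlay network is created on the manager, it is not created on the other nodes right away. It is created automatically on a node only when a container that uses the overlay network runs there, and it is removed from that node again once the container is deleted.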
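The --subnet and --gateway values passed to docker network create in this step are expected to agree, with the gateway address lying inside the subnet. The sketch below checks that with plain shell arithmetic; the helper names ip_to_int and in_subnet are hypothetical, and the addresses are the ones that appear in this article.

```shell
#!/bin/sh
# ip_to_int A.B.C.D -> print the address as a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS; IFS=.
    set -- $1            # split the dotted quad into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# in_subnet IP NETWORK PREFIXLEN -> succeed if IP lies inside NETWORK/PREFIXLEN.
in_subnet() {
    mask=$(( (4294967295 << (32 - $3)) & 4294967295 ))
    [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$2") & mask )) ]
}

in_subnet 10.88.0.1 10.88.0.0 24 && echo "gateway 10.88.0.1 is inside 10.88.0.0/24"
in_subnet 10.255.0.19 10.88.0.0 24 || echo "10.255.0.19 is outside 10.88.0.0/24"
```

The second call uses the ingress virtual IP from the service inspect output in section 6, which sits in 10.255.0.0/16 and therefore fails the check against net3.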
(4) Attach containers to the custom network
$ docker service create --network net3 --replicas 4 --name time-php registry.cntv.net/heqin/tvtime-php:v0.85dongsi
5qiv4hhv7ra5p65fpq2v6ok0x
$ docker service ls
5qiv4hhv7ra5  time-php  4/4  registry.cntv.net/heqin/tvtime-php:v0.85dongsi
$ docker service ps time-php
ID                         NAME        IMAGE                                           NODE    DESIRED STATE  CURRENT STATE           ERROR
529kuyazgxxl8l0eds4f740nu  time-php.1  registry.cntv.net/heqin/tvtime-php:v0.85dongsi  swarm3  Running        Running 32 seconds ago
brdelpbuwm3p9cujy6k78cpzo  time-php.2  registry.cntv.net/heqin/tvtime-php:v0.85dongsi  swarm3  Running        Running 33 seconds ago
3csp6fmrsg5cvski270732q98  time-php.3  registry.cntv.net/heqin/tvtime-php:v0.85dongsi  swarm2  Running        Running 34 seconds ago
2ismj5zt9o3t2hfe1j1lx7dcw  time-php.4  registry.cntv.net/heqin/tvtime-php:v0.85dongsi  swarm4  Running        Running 33 seconds ago
On swarm3, you can see that two containers are attached to the net3 network.
$ docker network inspect net3
[
    {
        "Name": "net3",
        "Id": "d83qc9rgkkj1ws8kvmod4x759",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.88.0.0/24",
                    "Gateway": "10.88.0.1"
                }
            ]
        },
        "Internal": false,
        "Containers": {
            "d465611683b69f20c28ec9eb276a9e4088867e35508ecf2bf28c07c5d3e1b33a": {
                "Name": "time-php.1.529kuyazgxxl8l0eds4f740nu",
                "EndpointID": "ca667c89550f88eabf54eeaaaa3c55c678ec8184f121fc845bbaab1cd4be20ff",
                "MacAddress": "02:42:0a:58:00:05",
                "IPv4Address": "10.88.0.5/24",
                "IPv6Address": ""
            },
            "d8fa64c4f8357418304ed1f19a049e4aac196c94ef50a27a0686ebe4ee734225": {
                "Name": "time-php.2.brdelpbuwm3p9cujy6k78cpzo",
                "EndpointID": "ab45bcf8e73d114e9bdf97cac128750a6c7a073149783746a9962c6cd4c599a8",
                "MacAddress": "02:42:0a:58:00:06",
                "IPv4Address": "10.88.0.6/24",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.driver.overlay.vxlanid_list": "257"
        },
        "Labels": {}
    }
]
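In the output above, each MacAddress looks like the fixed prefix 02:42 followed by the four bytes of the container's IPv4 address written in hex (10.88.0.5 gives 0a:58:00:05). The sketch below reproduces that mapping. It is inferred from this listing only, and the function name ip_to_mac is hypothetical.

```shell
#!/bin/sh
# ip_to_mac A.B.C.D -> print 02:42 followed by the four address bytes in hex.
ip_to_mac() {
    old_ifs=$IFS; IFS=.
    set -- $1            # split the dotted quad into $1..$4
    IFS=$old_ifs
    printf '02:42:%02x:%02x:%02x:%02x\n' "$1" "$2" "$3" "$4"
}

ip_to_mac 10.88.0.5   # 02:42:0a:58:00:05, as listed for time-php.1
ip_to_mac 10.88.0.6   # 02:42:0a:58:00:06, as listed for time-php.2
```

This makes it easy to match a MAC address seen in a packet capture back to a container IP on the overlay network.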
A container on swarm3 can successfully ping a container running on the swarm2 host:
$ docker exec -it d8fa64c4f835 ping 10.88.0.3
PING 10.88.0.3 (10.88.0.3) 56(84) bytes of data.
64 bytes from 10.88.0.3: icmp_seq=1 ttl=64 time=0.611 ms
64 bytes from 10.88.0.3: icmp_seq=2 ttl=64 time=5.45 ms
64 bytes from 10.88.0.3: icmp_seq=3 ttl=64 time=0.439 ms
64 bytes from 10.88.0.3: icmp_seq=4 ttl=64 time=0.586 ms
64 bytes from 10.88.0.3: icmp_seq=5 ttl=64 time=0.603 ms
8: Basic swarm commands
Creating and managing the swarm cluster
docker swarm Command
Commands:
init Initialize a swarm
join Join a swarm as a node and/or manager
join-token Manage join tokens
update Update the swarm
leave Leave a swarm
Creating and managing swarm services
docker service Command
Commands:
create Create a new service
inspect Display detailed information on one or more services
ps List the tasks of a service
ls List services
rm Remove one or more services
scale Scale one or multiple services
update Update a service
Options for creating a swarm service
docker service create [OPTIONS] IMAGE
Options:
--constraint value Placement constraints (default [])
--container-label value Container labels (default [])
--endpoint-mode string Endpoint mode (vip or dnsrr)
-e, --env value Set environment variables (default [])
--help Print usage
-l, --label value Service labels (default [])
--limit-cpu value Limit CPUs (default 0.000)
--limit-memory value Limit Memory (default 0 B)
--log-driver string Logging driver for service
--log-opt value Logging driver options (default [])
--mode string Service mode (replicated or global) (default "replicated")
--mount value Attach a mount to the service
--name string Service name
--network value Network attachments (default [])
-p, --publish value Publish a port as a node port (default [])
--replicas value Number of tasks (default none)
--reserve-cpu value Reserve CPUs (default 0.000)
--reserve-memory value Reserve Memory (default 0 B)
--restart-condition string Restart when condition is met (none, on-failure, or any)
--restart-delay value Delay between restart attempts (default none)
--restart-max-attempts value Maximum number of restarts before giving up (default none)
--restart-window value Window used to evaluate the restart policy (default none)
--stop-grace-period value Time to wait before force killing a container (default none)
--update-delay duration Delay between updates
--update-failure-action string Action on update failure (pause|continue) (default "pause")
--update-parallelism uint Maximum number of tasks updated simultaneously (0 to update all at once) (default 1)
-u, --user string Username or UID
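As an illustration of how several of these options combine, the sketch below assembles one docker service create call. The run helper is a hypothetical dry-run wrapper that only prints the command unless DRY_RUN=0 is set. The memory limit, restart condition, and update settings are arbitrary example values, while the image, network, and port mapping are reused from the earlier sections.

```shell
#!/bin/sh
# run: print the command in dry-run mode (the default), execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# create_time_php: one service definition built only from flags in the list above.
create_time_php() {
    run docker service create \
        --name time-php \
        --replicas 4 \
        --network net3 \
        --publish 8099:80 \
        --limit-memory 256m \
        --restart-condition on-failure \
        --update-parallelism 2 \
        --update-delay 10s \
        registry.cntv.net/heqin/tvtime-php:v0.85dongsi
}

create_time_php   # prints the full command; set DRY_RUN=0 to really run it
```

Printing the command first makes it possible to review the option set before it is applied to a live cluster.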
9: Remote management through port 2375
Note: -H specifies the remote address and port, so the swarm cluster can be operated remotely from any machine that has a Docker client.
$ docker -H 10.10.32.245:2375 service ls
ID            NAME            REPLICAS  IMAGE                                                  COMMAND
3e0ihi2lnnaa  test_api        1/1       registry.cntv.net/heqin/api-web:1.1
5mdqw6r53tom  test_apimem     1/1       registry.cntv.net/heqin/memcache-1.4
5ucxo9wqywo9  test_tomcatmem  1/1       registry.cntv.net/heqin/memcache-1.4
cjd1z14ug28g  HQtest_tvtime   5/5       registry.cntv.net/heqin/tvtime-php:v0.85dongsi
ex4mahzt5k8u  test_tomcat     1/1       registry.cntv.net/heqin/jdk7-tomcat7.0.52:201608011wq
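The same -H form can be pointed at every node in turn, for example to check that each daemon's remote API answers. The sketch below uses the host plan from section 2. The remote_cmd helper is hypothetical and only prints the commands, so they can be reviewed before being run by hand.

```shell
#!/bin/sh
# Hosts from the planning table in section 2.
HOSTS="10.10.32.245 10.10.32.246 10.10.32.247 10.10.32.248"
PORT=2375

# remote_cmd HOST ARGS... -> print the docker command that targets HOST.
remote_cmd() {
    host=$1; shift
    echo "docker -H tcp://$host:$PORT $*"
}

for h in $HOSTS; do
    remote_cmd "$h" info
done

# Alternatively, the DOCKER_HOST environment variable sets the target once:
#   export DOCKER_HOST=tcp://10.10.32.245:2375
#   docker service ls
```

Service-level commands such as service ls only work against a manager, so in this plan they would be aimed at 10.10.32.245.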
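10: Swarm compared with Mesos
(1): Deployment and configuration are simple and cluster management is convenient, but there is no web management UI.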
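(2): No agent has to be installed by hand on each node. Nodes join with a single docker swarm join command against the manager (port 2377 in section 5), and exposing tcp port 2375 is only needed for the remote management described in section 9.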
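(3): No bamboo is needed for load balancing; the swarm cluster uses IPVS load balancing.
(4): No scheduling components such as Marathon or Chronos are needed; swarm has a built-in scheduler module.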