Prepare three virtual machines with the following network configuration:
IP address | Hostname | Role |
---|---|---|
172.18.74.26 | manager | manager node |
172.18.74.29 | g160402 | worker |
172.18.74.25 | u180402 | worker |
Set each machine's hostname as listed above, and add entries for the other two nodes to /etc/hosts.
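The name resolution step can be sketched with two small helpers. The function names `swarm_hosts` and `other_hosts` are illustrative only; the addresses come from the table above.

```shell
# Print /etc/hosts entries for the three nodes in the table above.
swarm_hosts() {
  cat <<'EOF'
172.18.74.26 manager
172.18.74.29 g160402
172.18.74.25 u180402
EOF
}

# Print the entries for the other nodes only, by dropping the line
# that contains the hostname given as $1.
other_hosts() {
  swarm_hosts | grep -v -w -- "$1"
}
```

On each node, something like `other_hosts "$(hostname)" | sudo tee -a /etc/hosts` would then append the two missing entries.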
On every node, change the Docker daemon to listen on 0.0.0.0:2375.
Option 1
# Change the ExecStart line in the [Service] section as follows
example@manager:~$ sudo vi /lib/systemd/system/docker.service
ExecStart=/usr/bin/dockerd -H 0.0.0.0:2375 -H unix:///var/run/docker.sock
example@manager:~$ sudo systemctl daemon-reload
example@manager:~$ sudo systemctl restart docker
Option 2
example@u180402:~$ cat /etc/docker/daemon.json
{
  "registry-mirrors": [
    "https://reg-mirror.qiniu.com",
    "https://hub-mirror.c.163.com",
    "https://registry.aliyuncs.com"
  ],
  "hosts": ["unix:///var/run/docker.sock", "tcp://0.0.0.0:2375"]
}
example@manager:~$ sudo vi /lib/systemd/system/docker.service
ExecStart=/usr/bin/dockerd
example@manager:~$ sudo systemctl daemon-reload
example@manager:~$ sudo systemctl restart docker
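With option 2 the `-H` flags have to be removed from ExecStart, because dockerd refuses to start when the listen addresses are given both as flags and in daemon.json. As an alternative to editing the packaged unit file, a systemd drop-in can override ExecStart. The sketch below assumes the conventional drop-in directory `/etc/systemd/system/docker.service.d`; the helper name `write_docker_override` is hypothetical.

```shell
# Write a drop-in that resets ExecStart so that the listen addresses
# come only from the "hosts" key in /etc/docker/daemon.json.
# $1 is the drop-in directory (assumed: /etc/systemd/system/docker.service.d).
write_docker_override() {
  dir="$1"
  mkdir -p "$dir" || return 1
  cat > "$dir/override.conf" <<'EOF'
[Service]
# The empty assignment clears the ExecStart inherited from the packaged unit.
ExecStart=
ExecStart=/usr/bin/dockerd
EOF
}
```

Run as root, `write_docker_override /etc/systemd/system/docker.service.d` followed by the same `daemon-reload` and `restart docker` commands shown above would apply the change.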
Initialize the swarm
Create the manager node
example@manager:~$ docker swarm init --advertise-addr 172.18.74.26
Swarm initialized: current node (w78pv2cxmucv2vca3v5r069wt) is now a manager.
To add a worker to this swarm, run the following command:
docker swarm join --token SWMTKN-1-1fffxrlpybn1oz0qsff9ywxuz7ef1o7v6c4qqf6kwvckt6bphi-6t9lfyat23n99do5y9mpdtdkg 172.18.74.26:2377
To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
Initializing the manager node creates two new networks, docker_gwbridge and ingress:
example@manager:~$ docker network ls
NETWORK ID NAME DRIVER SCOPE
6b3877ce1c6f bridge bridge local
6f5af407c445 docker_gwbridge bridge local
25066e8c0d9e host host local
p5dq2m8snezx ingress overlay swarm
b512147e5000 none null local
Join the worker nodes to the swarm
#g160402
example@g160402:~$ docker swarm join --token SWMTKN-1-1fffxrlpybn1oz0qsff9ywxuz7ef1o7v6c4qqf6kwvckt6bphi-6t9lfyat23n99do5y9mpdtdkg 172.18.74.26:2377
This node joined a swarm as a worker.
#u180402
example@u180402:~$ docker swarm join --token SWMTKN-1-1fffxrlpybn1oz0qsff9ywxuz7ef1o7v6c4qqf6kwvckt6bphi-6t9lfyat23n99do5y9mpdtdkg 172.18.74.26:2377
This node joined a swarm as a worker.
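Check node status from the manager node
The AVAILABILITY column takes one of three values:
Active: the scheduler can assign tasks to the node.
Pause: the scheduler does not assign new tasks to the node, but existing tasks keep running.
Drain: the scheduler does not assign new tasks to the node; it also stops the existing tasks and reschedules them on other Active nodes.
example@manager:~$ docker node ls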
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
vrfif1jr3v0gl29o8okhdlc4l g160402 Ready Active 18.06.1-ce
w78pv2cxmucv2vca3v5r069wt * manager Ready Active Leader 18.09.5
7jjv186tvj8hscubg6me026vq u180402 Ready Active 18.06.1-ce
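The availability of a node can be changed with `docker node update --availability`. The wrapper below is a minimal sketch: the function name `set_availability` and the `DOCKER` override variable are assumptions added for illustration, not part of the Docker CLI.

```shell
# DOCKER can be overridden (for example DOCKER=echo) to get a dry run.
DOCKER="${DOCKER:-docker}"

# set_availability NODE STATE
# STATE must be one of: active, pause, drain.
set_availability() {
  case "$2" in
    active|pause|drain)
      "$DOCKER" node update --availability "$2" "$1" ;;
    *)
      echo "unknown availability: $2" >&2
      return 1 ;;
  esac
}
```

For example, `set_availability g160402 drain` run on the manager would move the tasks off g160402 before maintenance, and `set_availability g160402 active` would make it schedulable again.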
Leave the swarm
example@u180402:~$ docker swarm leave
Node left the swarm.
example@g160402:~$ docker swarm leave
Node left the swarm.
Back on the manager, both workers are now listed as Down:
example@manager:~$ docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
vrfif1jr3v0gl29o8okhdlc4l g160402 Down Active 18.06.1-ce
w78pv2cxmucv2vca3v5r069wt * manager Ready Active Leader 18.09.5
7jjv186tvj8hscubg6me026vq u180402 Down Active 18.06.1-ce
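A node that has left stays in the list as Down until it is removed on the manager with `docker node rm`. The filter below is a small sketch for finding such nodes; the function name `down_nodes` is hypothetical, and it expects `HOSTNAME STATUS` pairs on standard input.

```shell
# Read "HOSTNAME STATUS" pairs on stdin and print the hostnames
# whose status is Down.
down_nodes() {
  awk '$2 == "Down" { print $1 }'
}
```

On the manager, `docker node ls --format '{{.Hostname}} {{.Status}}' | down_nodes` would print the candidates, and each one can then be removed with `docker node rm <hostname>`.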
# Force the manager to leave the swarm
example@manager:~$ docker swarm leave --force
Node left the swarm.
Run a service in the swarm
Create an httpd service with two replicas
example@manager:~$ docker service create --replicas 2 --name hello-swarm httpd:latest
01voy53c0ygxb5w7ncocxwfvp
overall progress: 2 out of 2 tasks
1/2: running [==================================================>]
2/2: running [==================================================>]
verify: Service converged
example@manager:~$ docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
01voy53c0ygx hello-swarm replicated 2/2 httpd:latest
example@manager:~$ docker service ps hello-swarm
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
qw0rfrbhgk5v hello-swarm.1 httpd:latest manager Running Running about a minute ago
byhnp23chffg hello-swarm.2 httpd:latest g160402 Running Running about a minute ago
example@g160402:~$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
f9f928c906e4 httpd:latest "httpd-foreground" 2 minutes ago Up 2 minutes 80/tcp hello-swarm.2.byhnp23chffg59hbnpdndgp69
Update the service configuration
example@manager:~$ docker service update --publish-add 8080:80 hello-swarm
hello-swarm
overall progress: 2 out of 2 tasks
1/2: running [==================================================>]
2/2: running [==================================================>]
verify: Service converged
example@g160402:~$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
8ed735b92841 httpd:latest "httpd-foreground" 13 seconds ago Up 11 seconds 80/tcp hello-swarm.2.0v51ok3f424iaziisc51tfq00
The port is published through the swarm routing mesh, so opening port 8080 on any of the servers in a browser now shows httpd's "It works!" page.
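The same check can be made from a shell instead of a browser. The helper below is only a sketch: `node_urls` is a hypothetical name, and the addresses are the three node IPs from the table at the top.

```shell
# Print the published URL of the service on each of the three nodes.
node_urls() {
  for ip in 172.18.74.26 172.18.74.29 172.18.74.25; do
    echo "http://$ip:8080/"
  done
}
```

Assuming curl is installed, `node_urls | xargs -n 1 curl -s` requests the page from every node in turn, including a node that runs no httpd container itself.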
Scale the service
example@manager:~$ docker service scale hello-swarm=4
hello-swarm scaled to 4
overall progress: 4 out of 4 tasks
1/4: running [==================================================>]
2/4: running [==================================================>]
3/4: running [==================================================>]
4/4: running [==================================================>]
example@manager:~$ docker service ps hello-swarm
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
3o6rzluek155 hello-swarm.1 httpd:latest u180402 Running Running 5 minutes ago
qw0rfrbhgk5v \_ hello-swarm.1 httpd:latest manager Shutdown Shutdown 6 minutes ago
0v51ok3f424i hello-swarm.2 httpd:latest g160402 Running Running 6 minutes ago
byhnp23chffg \_ hello-swarm.2 httpd:latest g160402 Shutdown Shutdown 6 minutes ago
faitccodd7vq hello-swarm.3 httpd:latest manager Running Running 27 seconds ago
biqpebevezkj hello-swarm.4 httpd:latest manager Running Running 26 seconds ago
The manager now runs two httpd tasks, while u180402 and g160402 run one each.
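Counting tasks per node by eye gets tedious as the service grows. The filter below is a minimal sketch of doing it in the shell; `tasks_per_node` is a hypothetical helper that expects one node name per line on standard input.

```shell
# Read one node name per line on stdin and print "NODE COUNT",
# sorted by node name.
tasks_per_node() {
  awk '{ count[$1]++ } END { for (n in count) print n, count[n] }' | sort
}
```

It can be fed from `docker service ps --filter desired-state=running --format '{{.Node}}' hello-swarm`, which lists only the running tasks and leaves out the Shutdown entries.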
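Add a bind mount to the service (the source directory must exist on every node that runs a task). After the update, refreshing the page several times shows that requests are spread across the running containers.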
example@manager:~$ docker service update --mount-add type=bind,source=/home/example/temp/,destination=/usr/local/apache2/htdocs/ hello-swarm
hello-swarm
overall progress: 2 out of 2 tasks
1/2: running [==================================================>]
2/2: running [==================================================>]
verify: Service converged
Restart the service without changing any configuration
example@g160402:~$ docker service update --force hello-swarm
hello-swarm
overall progress: 2 out of 2 tasks
1/2: running [==================================================>]
2/2: running [==================================================>]
verify: Service converged
example@g160402:~$ docker service ps hello-swarm
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
x0j4ow0jozso hello-swarm.1 httpd:latest g160402 Running Running 2 minutes ago
b8g0xoo53w4a \_ hello-swarm.1 httpd:latest g160402 Shutdown Shutdown 2 minutes ago
q8l75pkn9r3x hello-swarm.2 httpd:latest g160402 Running Running 2 minutes ago
q28kvehhdcun \_ hello-swarm.2 httpd:latest g160402 Shutdown Shutdown 2 minutes ago
6nvq8ntrfs04 \_ hello-swarm.2 httpd:latest g160402 Shutdown Failed 20 minutes ago "task: non-zero exit (137)"
example@g160402:~$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
36fd1a6c3b28 httpd:latest "httpd-foreground" About a minute ago Up About a minute 80/tcp hello-swarm.1.x0j4ow0jozsomdxnnw5vkcv6s
6c4501017beb httpd:latest "httpd-foreground" About a minute ago Up About a minute 80/tcp hello-swarm.2.q8l75pkn9r3xy33g28llgzui5
Remove the service
example@manager:~$ docker service rm hello-swarm
hello-swarm
Run a service on specific nodes
Add labels to the nodes
Add and remove labels from the command line
example@manager:~$ docker node update --label-add role=manager manager
manager
example@manager:~$ docker node update --label-add role=worker1 g160402
g160402
example@manager:~$ docker node update --label-add role=worker2 u180402
example@manager:~$ docker node inspect g160402
......
"Spec": {
    "Labels": {
        "role": "worker1"
    },
......
# Remove a node label
example@manager:~$ docker node update --label-rm role g160402
g160402
Add a label through the Docker daemon
example@manager:~$ sudo vi /lib/systemd/system/docker.service
ExecStart=/usr/bin/dockerd -H 0.0.0.0:2375 -H unix:///var/run/docker.sock --label hostname=manage
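Labels set on the daemon are engine labels, so a placement constraint refers to them as `engine.labels.<key>` rather than `node.labels.<key>`, and they take effect only after the daemon is restarted. The wrapper below is a sketch under those assumptions; the function name `create_on_engine_label` and the `DOCKER` override variable are hypothetical.

```shell
# DOCKER can be overridden (for example DOCKER=echo) to get a dry run.
DOCKER="${DOCKER:-docker}"

# create_on_engine_label NAME KEY VALUE IMAGE
# Create a service that may only run on nodes whose daemon
# was started with --label KEY=VALUE.
create_on_engine_label() {
  "$DOCKER" service create --name "$1" \
    --constraint "engine.labels.$2 == $3" "$4"
}
```

With the unit file above, `create_on_engine_label hello-swarm hostname manage httpd:latest` would restrict the service to the node carrying the `hostname=manage` engine label.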
Constrain the service to specific nodes
example@manager:~$ docker service create --replicas 2 --constraint 'node.labels.role == worker1' --name hello-swarm httpd:latest
rfz6aocpi9bh4foq4wzw1bl3x
overall progress: 2 out of 2 tasks
1/2: running [==================================================>]
2/2: running [==================================================>]
verify: Service converged
example@g160402:~$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
b098a29fc83b httpd:latest "httpd-foreground" 7 seconds ago Up 6 seconds 80/tcp hello-swarm.1.b8g0xoo53w4adyvf9mdl1hozd
d2cfd7a650c3 httpd:latest "httpd-foreground" 7 seconds ago Up 6 seconds 80/tcp hello-swarm.2.6nvq8ntrfs04i1mx0wiy5f92h
If a container exits abnormally or is removed, the manager starts a replacement task and records the failure:
example@manager:~$ docker service ps hello-swarm
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
b8g0xoo53w4a hello-swarm.1 httpd:latest g160402 Running Running 4 minutes ago
6nvq8ntrfs04 hello-swarm.2 httpd:latest g160402 Running Running 4 minutes ago
example@g160402:~$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
b098a29fc83b httpd:latest "httpd-foreground" 3 minutes ago Up 3 minutes 80/tcp hello-swarm.1.b8g0xoo53w4adyvf9mdl1hozd
d2cfd7a650c3 httpd:latest "httpd-foreground" 3 minutes ago Up 3 minutes 80/tcp hello-swarm.2.6nvq8ntrfs04i1mx0wiy5f92h
example@g160402:~$ docker rm -f d2cfd7a650c3
d2cfd7a650c3
example@g160402:~$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
b098a29fc83b httpd:latest "httpd-foreground" 4 minutes ago Up 4 minutes 80/tcp hello-swarm.1.b8g0xoo53w4adyvf9mdl1hozd
example@manager:~$ docker service ps hello-swarm
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
b8g0xoo53w4a hello-swarm.1 httpd:latest g160402 Running Running 5 minutes ago
q28kvehhdcun hello-swarm.2 httpd:latest g160402 Running Running 7 seconds ago
6nvq8ntrfs04 \_ hello-swarm.2 httpd:latest g160402 Shutdown Failed 13 seconds ago "task: non-zero exit (137)"
example@g160402:~$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
d35b717ddc46 httpd:latest "httpd-foreground" 19 seconds ago Up 14 seconds 80/tcp hello-swarm.2.q28kvehhdcunpi3h5e4a12679
b098a29fc83b httpd:latest "httpd-foreground" 5 minutes ago Up 5 minutes 80/tcp hello-swarm.1.b8g0xoo53w4adyvf9mdl1hozd
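Promoting and demoting nodes
The MANAGER STATUS column takes the following values:
Leader: the primary manager node, which makes all swarm management and orchestration decisions.
Reachable: a manager node that is eligible to be elected as the new Leader if the current Leader becomes unavailable.
Unavailable: a manager node that cannot communicate with the other managers. In this case, add a new manager node to the swarm or promote a worker node to manager.
Promote the worker nodes (here g160402 and u180402). A promoted node can run manager commands, and its MANAGER STATUS becomes Reachable.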
example@manager:~$ docker node promote g160402 u180402
Node g160402 promoted to a manager in the swarm.
Node u180402 promoted to a manager in the swarm.
example@g160402:~$ docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
kl6siwciwca88y6sp8mhku38p * g160402 Ready Active Reachable 18.06.1-ce
uyoiijq9vtdi9f6tvkr4wuqh9 manager Ready Active Leader 18.09.5
ffm3ttsc31l4tiwa4lyu7vol4 u180402 Ready Active 18.06.1-ce
After the original manager node becomes unreachable, a new Leader is elected:
example@g160402:~$ docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
kl6siwciwca88y6sp8mhku38p * g160402 Ready Active Leader 18.06.1-ce
uyoiijq9vtdi9f6tvkr4wuqh9 manager Unknown Active Unreachable 18.09.5
ffm3ttsc31l4tiwa4lyu7vol4 u180402 Ready Active Reachable 18.06.1-ce
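The failover in the listing above works because two of the three managers are still up. Swarm managers use Raft, which needs a majority of managers to agree. The arithmetic can be sketched as follows; the function names are illustrative only.

```shell
# Raft needs a majority of managers: quorum = floor(N / 2) + 1.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# A swarm with N managers therefore tolerates floor((N - 1) / 2)
# manager failures.
tolerated_failures() {
  echo $(( ($1 - 1) / 2 ))
}
```

With three managers the quorum is two, so one manager can fail. With only two managers the quorum is also two, so losing either one blocks management operations.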
Demote a node
example@manager:~$ docker node demote g160402
Manager g160402 demoted in the swarm.
example@manager:~$ docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
kl6siwciwca88y6sp8mhku38p g160402 Ready Active 18.06.1-ce
uyoiijq9vtdi9f6tvkr4wuqh9 * manager Ready Active Leader 18.09.5
ffm3ttsc31l4tiwa4lyu7vol4 u180402 Ready Active 18.06.1-ce
Docker stack
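Command usage
Subcommand | Description |
---|---|
deploy | Deploy a new stack or update an existing one |
ls | List the existing stacks |
ps | List the tasks in a stack |
rm | Remove one or more stacks |
services | List the services in a stack |
Deploy a service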
example@manager:/data/@stack/giot$ pwd
/data/@stack/giot
example@manager:/data/@stack/giot$ ls
docker-compose.yml
# Create a custom network
example@manager:~/docker$ docker network create --driver overlay giot_network
7sfjbimchcmhh1336v075y4d9
example@manager:/data/@stack/giot$ cat docker-compose.yml
version: "3"
services:
  nginx:
    image: nginx:1.15.8-alpine
    deploy:
      replicas: 2
      resources:
        limits:
          cpus: "0.1"
          memory: 50M
      placement:
        constraints:
          - node.labels.role == worker1
      restart_policy:
        condition: on-failure
    ports:
      - 80:80/tcp
    volumes:
      - /data/containers/nginx/etc/nginx/nginx.conf:/etc/nginx/nginx.conf:ro
      - /data/containers/nginx/etc/nginx/conf.d:/etc/nginx/conf.d
      - /dev/log:/dev/log
      - /var/log/nginx:/var/log/nginx
      - /data:/data
      - /etc/localtime:/etc/localtime:ro
    networks:
      - giot_network
networks:
  giot_network:
    external: true
example@manager:/data/@stack/giot$ docker stack deploy -c docker-compose.yml giot
Creating network giot_default
Creating service giot_nginx
example@g160402:/data/containers/nginx$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
b7b8f7d57a24 nginx:1.15.8-alpine "nginx -g 'daemon of…" 9 seconds ago Up 7 seconds 80/tcp test_nginx.1.x9262cydwiwr6au792z3m39xg
be5b8aae70ee nginx:1.15.8-alpine "nginx -g 'daemon of…" 9 seconds ago Up 7 seconds 80/tcp test_nginx.2.uf5s1xi537h6k6qkea5wunu3m
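The other subcommands from the table can be used to inspect and remove the stack. The wrappers below are a minimal sketch; the function names and the `DOCKER` override variable are assumptions made for illustration.

```shell
# DOCKER can be overridden (for example DOCKER=echo) to get a dry run.
DOCKER="${DOCKER:-docker}"

# stack_status STACK: list the services of a stack, then its tasks.
stack_status() {
  "$DOCKER" stack services "$1" && "$DOCKER" stack ps "$1"
}

# stack_remove STACK: remove the stack and the services it created.
stack_remove() {
  "$DOCKER" stack rm "$1"
}
```

On the manager, `stack_status giot` shows the replica count of each service and where its tasks run, and `stack_remove giot` tears the stack down. An external network such as giot_network is not created by the stack, so it is left in place.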
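Appendix 1: docker service options
Short | Flag | Type | Description | Default |
---|---|---|---|---|
 | --config | config | Configs to expose to the service | |
 | --constraint | list | Placement constraints | |
 | --container-label | list | Container labels | |
 | --credential-spec | credential-spec | Credential spec for a managed service account (Windows only) | |
-d | --detach | | Exit immediately instead of waiting for the service to converge | |
 | --dns | list | Set custom DNS servers | |
 | --dns-option | list | Set DNS options | |
 | --dns-search | list | Set custom DNS search domains | |
 | --endpoint-mode | string | Endpoint mode (vip or dnsrr) | vip |
 | --entrypoint | command | Override the image's default ENTRYPOINT | |
-e | --env | list | Set environment variables | |
 | --env-file | list | Read environment variables from a file | |
 | --generic-resource | list | User-defined resources | |
 | --group | list | Set one or more supplementary user groups for the container | |
 | --health-cmd | string | Command to run to check health | |
 | --health-interval | duration | Time between health checks (ms/s/m/h) | |
 | --health-retries | int | Consecutive failures needed to report unhealthy | |
 | --health-start-period | duration | Start period for the container to initialize before retries count towards unstable (ms/s/m/h) | |
 | --health-timeout | duration | Maximum time allowed for one check to run (ms/s/m/h) | |
 | --host | list | Set one or more host-to-IP mappings (host:ip) | |
 | --hostname | string | Container hostname | |
 | --isolation | string | Service container isolation mode | |
-l | --label | list | Service labels | |
 | --limit-cpu | decimal | Limit CPUs | |
 | --limit-memory | bytes | Limit memory | |
 | --log-driver | string | Logging driver for the service | |
 | --log-opt | list | Logging driver options | |
 | --mode | string | Service mode (replicated or global) | replicated |
 | --mount | mount | Attach a filesystem mount to the service | |
 | --name | string | Service name | |
 | --network | network | Network attachments | |
 | --no-healthcheck | | Disable any container-specified health check | |
 | --no-resolve-image | | Do not query the registry to resolve the image digest and supported platforms | |
 | --placement-pref | pref | Add a placement preference | |
-p | --publish | port | Publish a port as a node port | |
-q | --quiet | | Suppress progress output | |
 | --read-only | | Mount the container's root filesystem as read-only | |
 | --replicas | uint | Number of tasks (that is, container replicas) | 1 |
 | --reserve-cpu | decimal | Reserve CPUs | |
 | --reserve-memory | bytes | Reserve memory | |
 | --restart-condition | string | Restart condition ("none", "on-failure", "any") | any |
 | --restart-delay | duration | Delay between restart attempts (ns/us/ms/s/m/h) | 5s |
 | --restart-max-attempts | uint | Maximum number of restarts before giving up | |
 | --restart-window | duration | Window used to evaluate the restart policy (ns/us/ms/s/m/h) | |
 | --rollback-delay | duration | Delay between task rollbacks (ns/us/ms/s/m/h) | 0s |
 | --rollback-failure-action | string | Action on rollback failure ("pause", "continue") | pause |
 | --rollback-max-failure-ratio | float | Failure rate to tolerate during a rollback | 0 |
 | --rollback-monitor | duration | Duration after each task rollback to monitor for failure (ns/us/ms/s/m/h) | 5s |
 | --rollback-order | string | Rollback order ("start-first", "stop-first") | stop-first |
 | --rollback-parallelism | uint | Maximum number of tasks rolled back simultaneously (0 rolls back all at once) | 1 |
 | --secret | secret | Secrets to expose to the service | |
 | --stop-grace-period | duration | Time to wait before force-killing a container (ns/us/ms/s/m/h) | 10s |
 | --stop-signal | string | Signal used to stop the container | |
-t | --tty | | Allocate a pseudo-TTY | |
 | --update-delay | duration | Delay between updates (ns/us/ms/s/m/h) | 0s |
 | --update-failure-action | string | Action on update failure ("pause", "continue", "rollback") | pause |
 | --update-max-failure-ratio | float | Failure rate to tolerate during an update | 0 |
 | --update-monitor | duration | Duration after each task update to monitor for failure (ns/us/ms/s/m/h) | 5s |
 | --update-order | string | Update order ("start-first", "stop-first") | stop-first |
 | --update-parallelism | uint | Maximum number of tasks updated simultaneously (0 updates all at once) | 1 |
-u | --user | string | Username or UID (format: `<name\|uid>[:<group\|gid>]`) | |
 | --with-registry-auth | | Send registry authentication details to swarm agents | |
-w | --workdir | string | Working directory inside the container | |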