no suitable node (insufficient resources on 1 node) docker stack/service

Docker Engine version

Client: Docker Engine - Community
 Version:           19.03.8
 API version:       1.40
 Go version:        go1.12.17
 Git commit:        afacb8b
 Built:             Wed Mar 11 01:27:04 2020
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          19.03.8
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.12.17
  Git commit:       afacb8b
  Built:            Wed Mar 11 01:25:42 2020
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.2.6
  GitCommit:        894b81a4b802e4eb2a91d1ce216b8817763c29fb
 runc:
  Version:          1.0.0-rc8
  GitCommit:        425e105d5a03fabd737a126ad93d62a9eeede87f
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683

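When updating a service with docker stack deploy or docker service update, the command succeeds but reports: no suitable node (insufficient resources on 1 node). The command returns, yet the service is not updated right away: the new task's current state is Pending, and the ERROR column gives the reason, namely that the node does not have enough resources to run the update. (In this example's stack.yaml the update order is start-first: a new task is started first, then the old one is stopped.)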

[root@abc abc]# docker service ps demo_services_hi-srv 
ID                  NAME                              IMAGE               NODE                DESIRED STATE       CURRENT STATE         ERROR                              PORTS
ea97n0dfeggm        demo_services_hi-srv.1       hi-srv:latest                          Running             Pending 4 hours ago   "no suitable node (insufficien…"  

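When a node runs many container services, a service update can stay in the Pending state even though the free command shows memory still available on the host. The task waits and is scheduled once resources are released. The main reason free and the scheduler disagree is that the memory pre-allocated to the containers already running has, from the scheduler's point of view, used up the host's memory, so there is not enough left to start or update the service.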

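To check whether resources are still available: total host memory - (sum of the memory reservations of every task already on the node) = remaining schedulable memory. Note that the Swarm scheduler counts reservations only; a memory limit caps what a running container may use but is not subtracted from the node's capacity. The same check applies to CPU reservations. When the memory reservation of the new or updated service is greater than the remaining schedulable memory, the task's state is Pending and the ERROR column shows: no suitable node (insufficient resources on 1 node). The fix is to add memory to the host, add nodes to the cluster, or lower the reservations of existing services to free up room for the update or the new service.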
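The check above can be sketched as a small calculation. This is an illustration only, assuming the scheduler subtracts memory reservations (not limits) from the node's total memory; the function name and all numbers are hypothetical and are not taken from the cluster in this post.

```python
def fits_on_node(node_memory_mb, existing_reservations_mb, new_reservation_mb):
    """Return (fits, free_mb) for a task that reserves new_reservation_mb.

    node_memory_mb: total memory the node reports, in MB (hypothetical value).
    existing_reservations_mb: memory reservations of tasks already on the node.
    """
    # Memory the scheduler still considers unclaimed on this node.
    free_mb = node_memory_mb - sum(existing_reservations_mb)
    return new_reservation_mb <= free_mb, free_mb


# Hypothetical node: 8192 MB total, fifteen tasks each reserving 512 MB.
fits, free_mb = fits_on_node(8192, [512] * 15, 1024)
print(fits, free_mb)
```

With these made-up numbers only 512 MB remain schedulable, so a task reserving 1024 MB would stay Pending regardless of what free reports on the host.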

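limit-memory: caps the host memory a running container may use (the --limit-memory flag of docker service create/update)

reservation-memory: memory reserved for the container, default 0 (the --reserve-memory flag of docker service create/update)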

services:
  hi-srv:
    # ...omitted...
    deploy:
      # ...omitted...
      update_config:
        parallelism: 1
        delay: 30s
        order: start-first
        failure_action: rollback
      resources:
        limits:
          cpus: '0.9'
          memory: ${limit_mem:-1536M}
        reservations:
          cpus: '0.1'
          memory: ${reserve_mem:-64M}
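Because the update order here is start-first, the new task has to be scheduled while the old one is still running, so for a moment the service holds more reservations than its steady state. A rough sketch of that peak, assuming each task reserves the same amount and that up to parallelism extra tasks exist at once; the function name and the single-replica figure are hypothetical:

```python
def peak_reserved_mb(replicas, reservation_mb, parallelism=1, order="start-first"):
    """Estimate the largest total memory reservation held during a rolling update."""
    # With start-first, up to `parallelism` new tasks run alongside the old ones.
    # With stop-first, old tasks are stopped before their replacements start.
    extra_tasks = parallelism if order == "start-first" else 0
    return (replicas + extra_tasks) * reservation_mb


# One replica reserving 64 MB, updated one task at a time.
print(peak_reserved_mb(1, 64, parallelism=1, order="start-first"))
print(peak_reserved_mb(1, 64, parallelism=1, order="stop-first"))
```

If the node cannot fit the extra reservation, the new task waits in Pending while the old task keeps serving, which matches the behaviour described above.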

Reference: #issues25069

 

 
