Notes on troubleshooting a podman container that fails to start

Error message

[root@localhost ~]# podman start node
ERRO[0000] Error adding network: failed to allocate for range 0: 10.88.0.137 has been allocated to c17c6b3c9abad320bd7009b261e77af9c2474aa16acf5131f8c47f56a81e52c1, duplicate allocation is not allowed
ERRO[0000] Error while adding pod to CNI network "podman": failed to allocate for range 0: 10.88.0.137 has been allocated to c17c6b3c9abad320bd7009b261e77af9c2474aa16acf5131f8c47f56a81e52c1, duplicate allocation is not allowed
Error: unable to start container "node": error configuring network namespace for container c17c6b3c9abad320bd7009b261e77af9c2474aa16acf5131f8c47f56a81e52c1: failed to allocate for range 0: 10.88.0.137 has been allocated to c17c6b3c9abad320bd7009b261e77af9c2474aa16acf5131f8c47f56a81e52c1, duplicate allocation is not allowed

Possible triggers

  • Upgraded podman
[root@localhost ~]# yum history
ID     | 命令行                   | 日期和时间       | 操作           | 更改
-------------------------------------------------------------------------------
     6 | reinstall podman         | 2020-05-10 21:22 | R              |    2
     5 | update podman            | 2020-05-08 00:54 | I, U           |    9 EE
     4 | install telnet           | 2019-12-26 14:34 | Install        |    1
     3 |                          | 2019-11-30 01:27 | Install        |    1
     2 |                          | 2019-11-30 01:26 | I, U           |    2
     1 |                          | 2019-11-30 01:04 | Install        | 1318 EE
[root@localhost ~]# yum history list podman
ID     | 命令行                   | 日期和时间       | 操作           | 更改
-------------------------------------------------------------------------------
     6 | reinstall podman         | 2020-05-10 21:22 | R              |    2
     5 | update podman            | 2020-05-08 00:54 | I, U           |    9 E<
     2 |                          | 2019-11-30 01:26 | I, U           |    2 >
     1 |                          | 2019-11-30 01:04 | Install        | 1318 EE
[root@localhost ~]# yum history info 5
事务 ID: 5
起始时间    : 2020年05月08日 星期五 00时54分47秒
起始 RPM 数据库     : 1306:dd4746f34319aa93d15ffd57fc28167886c5e1ea
结束时间       : 2020年05月08日 星期五 00时54分54秒 (7 秒)
结束 RPM 数据库      : 1309:c49b6de09a868c80c4e4ba4cf4a8e065c52a2702
用户           : root <root>
返回码    : 成功
Releasever     : 8
命令行   : update podman
已改变的包:
    安装     conmon-2:2.0.6-1.module_el8.1.0+298+41f9343a.x86_64                              @AppStream
    安装     podman-manpages-1.6.4-4.module_el8.1.0+298+41f9343a.noarch                       @AppStream
    安装     libvarlink-18-3.el8.x86_64                                                       @BaseOS
    Upgrade  containernetworking-plugins-0.8.3-4.module_el8.1.0+298+41f9343a.x86_64           @AppStream
    Upgraded containernetworking-plugins-0.7.4-3.git9ebe139.module_el8.0.0+58+91b614e7.x86_64 @@System
    Upgrade  podman-1.6.4-4.module_el8.1.0+298+41f9343a.x86_64                                @AppStream
    Upgraded podman-1.0.5-1.gitf604175.module_el8.0.0+194+ac560166.x86_64                     @@System
    Upgrade  podman-docker-1.6.4-4.module_el8.1.0+298+41f9343a.noarch                         @AppStream
    Upgraded podman-docker-1.0.5-1.gitf604175.module_el8.0.0+194+ac560166.noarch              @@System
    Upgrade  runc-1.0.0-64.rc9.module_el8.1.0+298+41f9343a.x86_64                             @AppStream
    Upgraded runc-1.0.0-55.rc5.dev.git2abd837.module_el8.0.0+58+91b614e7.x86_64               @@System
    Upgrade  slirp4netns-0.4.2-3.git21fdece.module_el8.1.0+298+41f9343a.x86_64                @AppStream
    Upgraded slirp4netns-0.1-2.dev.gitc4e1bc5.module_el8.0.0+58+91b614e7.x86_64               @@System
    Upgrade  libseccomp-2.4.1-1.el8.x86_64                                                    @BaseOS
    Upgraded libseccomp-2.3.3-3.el8.x86_64                                                    @@System
Scriptlet 输出:
   1 /var/tmp/rpm-tmp.hW2QKt:行1: /usr/bin/podman: 权限不够
[root@localhost ~]# yum history info 2
事务 ID: 2
起始时间    : 2019年11月30日 星期六 01时26分17秒
起始 RPM 数据库     : 1303:6605b8b7fc9e4d18455acb1314ea9477b94b0439
结束时间       : 2019年11月30日 星期六 01时26分20秒 (3 秒)
结束 RPM 数据库      : 1304:6eb00bd598038a50d03f4350bd30521a7d697fb2
用户           : root <root>
返回码    : 成功
Releasever     :
命令行   :
已改变的包:
    安装     podman-docker-1.0.5-1.gitf604175.module_el8.0.0+194+ac560166.noarch @AppStream
    Upgrade  podman-1.0.5-1.gitf604175.module_el8.0.0+194+ac560166.x86_64        @AppStream
    Upgraded podman-1.0.0-2.git921f98f.module_el8.0.0+58+91b614e7.x86_64         @@System

[root@localhost ~]# yum info podman
上次元数据过期检查:0:17:56 前,执行于 2020年05月10日 星期日 21时21分10秒。
已安装的软件包
名称         : podman
版本         : 1.6.4
发布         : 4.module_el8.1.0+298+41f9343a
架构         : x86_64
大小         : 55 M
源           : podman-1.6.4-4.module_el8.1.0+298+41f9343a.src.rpm
仓库         : @System
来自仓库     : AppStream
小结         : Manage Pods, Containers and Container Images
URL          : https://podman.io/
协议         : ASL 2.0
描述         : podman (Pod Manager) is a fully featured container engine that is a simple daemonless tool.  podman provides a Docker-CLI comparable command line that eases the transition from other container engines and allows the management of pods, containers and
             : images.  Simply put: alias docker=podman.  Most podman commands can be run as a regular user, without requiring additional privileges.
             :
             : podman uses Buildah(1) internally to create container images. Both tools share image (not container) storage, hence each can use or manipulate images (but not containers) created by the other.
             :
             : Manage Pods, Containers and Container Images
             : libpod Simple management tool for pods, containers and images
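  • Stopped and then re-enabled firewalld and opened TCP ports [firewalld should not have been enabled; docker-style container networking goes through iptables]
  • Allowed an SSH port in SELinux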
    semanage port -a -t ssh_port_t -p tcp

podman details

[root@localhost ~]# podman info --debug
debug:
  compiler: gc
  git commit: ""
  go version: go1.13.4
  podman version: 1.6.4
host:
  BuildahVersion: 1.12.0-dev
  CgroupVersion: v1
  Conmon:
    package: conmon-2.0.6-1.module_el8.1.0+298+41f9343a.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.6, commit: 2721f230f94894671f141762bd0d1af2fb263239'
  Distribution:
    distribution: '"centos"'
    version: "8"
  MemFree: 5398528
  MemTotal: 500600832
  OCIRuntime:
    name: runc
    package: runc-1.0.0-64.rc9.module_el8.1.0+298+41f9343a.x86_64
    path: /usr/bin/runc
    version: 'runc version spec: 1.0.1-dev'
  SwapFree: 1979969536
  SwapTotal: 2147479552
  arch: amd64
  cpus: 1
  eventlogger: journald
  hostname: localhost.localdomain
  kernel: 4.18.0-80.el8.x86_64
  os: linux
  rootless: false
  uptime: 9m 28.96s
registries:
  blocked: null
  insecure: null
  search:
  - registry.redhat.io
  - quay.io
  - docker.io
store:
  ConfigFile: /etc/containers/storage.conf
  ContainerStore:
    number: 23
  GraphDriverName: overlay
  GraphOptions: {}
  GraphRoot: /var/lib/containers/storage
  GraphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  ImageStore:
    number: 30
  RunRoot: /var/run/containers/storage
  VolumePath: /var/lib/containers/storage/volumes

Troubleshooting

1) Switch back to the old version

[root@localhost ~]# yum downgrade podman
上次元数据过期检查:0:20:56 前,执行于 2020年05月10日 星期日 21时21分10秒。
软件包 podman 的最低版本已经安装,无法再进行降级。
错误:没有标记要降级的软件包。

An automatic downgrade is not possible.

[root@localhost ~]# yum search --showduplicates podman
上次元数据过期检查:0:23:51 前,执行于 2020年05月10日 星期日 21时21分10秒。
============================================================================================================================ 名称 精准匹配:podman ============================================================================================================================
podman-1.6.4-4.module_el8.1.0+298+41f9343a.x86_64 : Manage Pods, Containers and Container Images
podman-1.6.4-4.module_el8.1.0+298+41f9343a.x86_64 : Manage Pods, Containers and Container Images
========================================================================================================================== 小结 和 名称 匹配:podman ==========================================================================================================================
python-podman-api-1.2.0-0.2.gitd0a45fe.module_el8.1.0+298+41f9343a.noarch : Podman API
podman-tests-1.6.4-4.module_el8.1.0+298+41f9343a.x86_64 : Tests for podman
podman-docker-1.6.4-4.module_el8.1.0+298+41f9343a.noarch : Emulate Docker CLI using podman
podman-docker-1.6.4-4.module_el8.1.0+298+41f9343a.noarch : Emulate Docker CLI using podman
podman-manpages-1.6.4-4.module_el8.1.0+298+41f9343a.noarch : Man pages for the podman commands
podman-manpages-1.6.4-4.module_el8.1.0+298+41f9343a.noarch : Man pages for the podman commands
cockpit-podman-11-1.module_el8.1.0+298+41f9343a.noarch : Cockpit component for Podman containers
pcp-pmda-podman-4.3.2-2.el8.x86_64 : Performance Co-Pilot (PCP) metrics for podman containers
pcp-pmda-podman-4.3.2-3.el8_1.x86_64 : Performance Co-Pilot (PCP) metrics for podman containers
podman-remote-1.6.4-4.module_el8.1.0+298+41f9343a.x86_64 : (Experimental) Remote client for managing podman containers
============================================================================================================================== 小结 匹配:podman ==============================================================================================================================
toolbox-0.0.4-1.module_el8.1.0+293+ad8ef41f.x86_64 : Script to launch privileged container with podman

Roll back with yum history

[root@localhost ~]# yum history podman
ID     | 命令行                   | 日期和时间       | 操作           | 更改
-------------------------------------------------------------------------------
     6 | reinstall podman         | 2020-05-10 21:22 | R              |    2
     5 | update podman            | 2020-05-08 00:54 | I, U           |    9 E<
     2 |                          | 2019-11-30 01:26 | I, U           |    2 >
     1 |                          | 2019-11-30 01:04 | Install        | 1318 EE
[root@localhost ~]# yum history undo 5
上次元数据过期检查:0:30:11 前,执行于 2020年05月10日 星期日 21时21分10秒。
撤销事务 5,从 2020年05月08日 星期五 00时54分47秒
    安装     conmon-2:2.0.6-1.module_el8.1.0+298+41f9343a.x86_64                              @AppStream
    安装     podman-manpages-1.6.4-4.module_el8.1.0+298+41f9343a.noarch                       @AppStream
    安装     libvarlink-18-3.el8.x86_64                                                       @BaseOS
    Upgrade  containernetworking-plugins-0.8.3-4.module_el8.1.0+298+41f9343a.x86_64           @AppStream
    Upgraded containernetworking-plugins-0.7.4-3.git9ebe139.module_el8.0.0+58+91b614e7.x86_64 @@System
    Upgrade  podman-1.6.4-4.module_el8.1.0+298+41f9343a.x86_64                                @AppStream
    Upgraded podman-1.0.5-1.gitf604175.module_el8.0.0+194+ac560166.x86_64                     @@System
    Upgrade  podman-docker-1.6.4-4.module_el8.1.0+298+41f9343a.noarch                         @AppStream
    Upgraded podman-docker-1.0.5-1.gitf604175.module_el8.0.0+194+ac560166.noarch              @@System
    Upgrade  runc-1.0.0-64.rc9.module_el8.1.0+298+41f9343a.x86_64                             @AppStream
    Upgraded runc-1.0.0-55.rc5.dev.git2abd837.module_el8.0.0+58+91b614e7.x86_64               @@System
    Upgrade  slirp4netns-0.4.2-3.git21fdece.module_el8.1.0+298+41f9343a.x86_64                @AppStream
    Upgraded slirp4netns-0.1-2.dev.gitc4e1bc5.module_el8.0.0+58+91b614e7.x86_64               @@System
    Upgrade  libseccomp-2.4.1-1.el8.x86_64                                                    @BaseOS
    Upgraded libseccomp-2.3.3-3.el8.x86_64                                                    @@System
无可用软件包 containernetworking-plugins-0.7.4-3.git9ebe139.module_el8.0.0+58+91b614e7.x86_64。
无可用软件包 libseccomp-2.3.3-3.el8.x86_64。
无可用软件包 podman-1.0.5-1.gitf604175.module_el8.0.0+194+ac560166.x86_64。
无可用软件包 podman-docker-1.0.5-1.gitf604175.module_el8.0.0+194+ac560166.noarch。
无可用软件包 runc-1.0.0-55.rc5.dev.git2abd837.module_el8.0.0+58+91b614e7.x86_64。
无可用软件包 slirp4netns-0.1-2.dev.gitc4e1bc5.module_el8.0.0+58+91b614e7.x86_64。
错误:没有能够与之匹配的软件包

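This failed: the official repositories no longer carry the old RPM packages, so they would have to be tracked down by hand, which did not seem worth it. Searching the official site for the cause of the error turned up nothing.
Taken literally, the error says that the IP is already allocated to this same container ID and so cannot be allocated to it again. This is probably a leftover lock or reservation file from a forced shutdown, but it still looks like a bug: podman ought to be able to repair this state by itself.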
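To check that reading of the message, the IP and the owning container ID can be pulled out of the error text and compared with the ID of the container that failed to start. This is only a minimal sketch: the function names `parse_alloc_error` and `is_self_conflict` are made up for illustration, and the pattern assumes the exact message format shown in the error above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<ip> <owner-id>" extracted from a
# "has been allocated to" error line. Prints nothing if the line does not match.
parse_alloc_error() {
    sed -n -E 's/.*: ([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+) has been allocated to ([0-9a-f]+),.*/\1 \2/p' <<<"$1"
}

# Hypothetical helper: succeed when the IP in the error line is reserved
# by the very container that is being started.
# $1 = error line, $2 = full container ID
is_self_conflict() {
    local ip owner
    read -r ip owner <<<"$(parse_alloc_error "$1")"
    [ -n "$owner" ] && [ "$owner" = "$2" ]
}
```

If `is_self_conflict` succeeds for the failing container, the reservation is a stale one left behind by that same container rather than a clash with a different container.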

Look for leftover files to delete. Removing the container's directory under /run did not fix the problem.

[root@localhost storage]# find / -name c17c6b3c9abad320bd7009b261e77af9c2474aa16acf5131f8c47f56a81e52c1
/run/containers/storage/overlay-containers/c17c6b3c9abad320bd7009b261e77af9c2474aa16acf5131f8c47f56a81e52c1
/var/lib/containers/storage/overlay-containers/c17c6b3c9abad320bd7009b261e77af9c2474aa16acf5131f8c47f56a81e52c1
[root@localhost storage]# cd /run/containers/storage/overlay-containers/c17c6b3c9abad320bd7009b261e77af9c2474aa16acf5131f8c47f56a81e52c1
[root@localhost c17c6b3c9abad320bd7009b261e77af9c2474aa16acf5131f8c47f56a81e52c1]# tree
.
└── userdata

1 directory, 0 files
[root@localhost c17c6b3c9abad320bd7009b261e77af9c2474aa16acf5131f8c47f56a81e52c1]# cd ..
[root@localhost overlay-containers]# rm -rf c17c6b3c9abad320bd7009b261e77af9c2474aa16acf5131f8c47f56a81e52c1/

Inspect the container's configuration

[root@localhost c17c6b3c9abad320bd7009b261e77af9c2474aa16acf5131f8c47f56a81e52c1]# cd /var/lib/containers/storage/overlay-containers/c17c6b3c9abad320bd7009b261e77af9c2474aa16acf5131f8c47f56a81e52c1
[root@localhost c17c6b3c9abad320bd7009b261e77af9c2474aa16acf5131f8c47f56a81e52c1]# cat userdata/config.json |json_reformat |grep -C 5  network
        "namespaces": [
            {
                "type": "pid"
            },
            {
                "type": "network",
                "path": "/var/run/netns/cni-c27fac7a-fe5a-36bb-3312-3dfdd8747633"
            },
            {
                "type": "ipc"
            },
# The configured network namespace path /var/run/netns/cni-c27fac7a-fe5a-36bb-3312-3dfdd8747633 does not exist. Dead end, so on to the source code.
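The lookup above can be sketched without `json_reformat`. The helper names `netns_path` and `check_netns` are hypothetical, and the pattern assumes that the `"type": "network"` entry is immediately followed by its `"path"` key, as in the output above; a config.json laid out differently would need a real JSON parser.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the network namespace path recorded in an OCI
# config.json ($1). Handles both compact and pretty-printed JSON by joining
# all lines first. Prints nothing if no such entry is found.
netns_path() {
    tr -d '\n' <"$1" |
        grep -o '"type":[[:space:]]*"network",[[:space:]]*"path":[[:space:]]*"[^"]*"' |
        sed -E 's/.*"path":[[:space:]]*"([^"]*)"/\1/'
}

# Hypothetical helper: report whether that path still exists on the host.
# Returns 0 if it exists, 1 if it is missing, 2 if no path was found.
check_netns() {
    local path
    path=$(netns_path "$1")
    if [ -z "$path" ]; then
        echo "no network namespace path in $1"
        return 2
    fi
    if [ -e "$path" ]; then
        echo "exists: $path"
    else
        echo "missing: $path"
        return 1
    fi
}
```

Running `check_netns userdata/config.json` from the container's directory would print a `missing:` line in the situation described above.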

Final steps

Workaround (treats the symptom)

[root@localhost ~]# podman start node
ERRO[0000] Error adding network: failed to allocate for range 0: 10.88.0.137 has been allocated to c17c6b3c9abad320bd7009b261e77af9c2474aa16acf5131f8c47f56a81e52c1, duplicate allocation is not allowed
ERRO[0000] Error while adding pod to CNI network "podman": failed to allocate for range 0: 10.88.0.137 has been allocated to c17c6b3c9abad320bd7009b261e77af9c2474aa16acf5131f8c47f56a81e52c1, duplicate allocation is not allowed
Error: unable to start container "node": error configuring network namespace for container c17c6b3c9abad320bd7009b261e77af9c2474aa16acf5131f8c47f56a81e52c1: failed to allocate for range 0: 10.88.0.137 has been allocated to c17c6b3c9abad320bd7009b261e77af9c2474aa16acf5131f8c47f56a81e52c1, duplicate allocation is not allowed
# Look up the IP named in the error message
[root@localhost containers]# cat /var/lib/cni/networks/podman/10.88.0.137
c17c6b3c9abad320bd7009b261e77af9c2474aa16acf5131f8c47f56a81e52c1
# Reset the reservation for that virtual IP
[root@localhost containers]# echo > /var/lib/cni/networks/podman/10.88.0.137
[root@localhost containers]# podman restart c17c6b3c9abad320bd7009b261e77af9c2474aa16acf5131f8c47f56a81e52c1
Error: cannot chown run directory /var/run/containers/storage/overlay-containers/c17c6b3c9abad320bd7009b261e77af9c2474aa16acf5131f8c47f56a81e52c1/userdata: chown /var/run/containers/storage/overlay-containers/c17c6b3c9abad320bd7009b261e77af9c2474aa16acf5131f8c47f56a81e52c1/userdata: no such file or directory
# Recreate the directory
[root@localhost containers]# mkdir  -p /var/run/containers/storage/overlay-containers/c17c6b3c9abad320bd7009b261e77af9c2474aa16acf5131f8c47f56a81e52c1/userdata
[root@localhost containers]# podman restart c17c6b3c9abad320bd7009b261e77af9c2474aa16acf5131f8c47f56a81e52c1
c17c6b3c9abad320bd7009b261e77af9c2474aa16acf5131f8c47f56a81e52c1
# The container started successfully
# This only treats the symptom, not the root cause; a proper fix still needs to be found
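The manual reset above could be wrapped in a small function. The sketch below is a variant rather than the exact steps used here: it deletes a stale reservation file instead of blanking it with `echo >`, and it should only be run while the container is stopped. The function name `release_stale_ips` is hypothetical, and the default directory is simply the path seen in this session.

```shell
#!/usr/bin/env bash
# Hypothetical helper: remove IP reservation files that still name the given
# container ID.
# $1 = full container ID
# $2 = reservation directory (assumed default: /var/lib/cni/networks/podman)
release_stale_ips() {
    local id=$1 dir=${2:-/var/lib/cni/networks/podman} f
    # Reservation files are named after the IPv4 address they reserve.
    for f in "$dir"/*.*.*.*; do
        [ -f "$f" ] || continue
        # The first line holds the owning container ID; strip a trailing
        # carriage return in case the plugin wrote CRLF line endings.
        if [ "$(head -n 1 "$f" | tr -d '\r')" = "$id" ]; then
            echo "releasing ${f##*/}"
            rm -f -- "$f"
        fi
    done
}
```

For the container in this post the call would be `release_stale_ips c17c6b3c9abad320bd7009b261e77af9c2474aa16acf5131f8c47f56a81e52c1`, followed by `podman start node`.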

Root-cause fix

Related documents

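Description | URL | Notes
podman command-line source | https://github.com/containers/libpod | source code
Container networking implementation source | https://github.com/containernetworking/cni | source code
Similar issue | https://github.com/containers/dnsname/issues/19 | Found with the search query org:containers "duplicate allocation". The issue is still open, so this is probably a new bug; no duplicate report was filed.
Where the journey began | long link | Location of this error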

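Final conclusion: firewalld and iptables running side by side caused the problem.
Disabling the firewalld firewall with systemctl disable firewalld resolved it.
I had assumed that going from docker to podman and from CentOS 7 to CentOS 8 would bring some firewalld support, but containers turned out to still be iptables-based software after all.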
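A minimal sketch of that last step, checking first whether firewalld is actually running. The function name `disable_firewalld_if_active` is hypothetical. Unlike the plain `systemctl disable firewalld` used here, it passes `--now` so that the service is also stopped immediately instead of only at the next boot.

```shell
#!/usr/bin/env bash
# Hypothetical helper: stop and disable firewalld only if it is currently
# active. Must be run as root on a systemd host.
disable_firewalld_if_active() {
    if systemctl is-active --quiet firewalld; then
        echo "firewalld is active; disabling it"
        systemctl disable --now firewalld
    else
        echo "firewalld is not active; nothing to do"
    fi
}
```

Calling `disable_firewalld_if_active` and then starting the container again would show whether firewalld was involved.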
