在巡检rancher平台时发现有一个服务运行报错了,查看该服务容器事件时提示连接不到harbor镜像仓库。
发现无法访问时,第一时间是通过浏览器去访问harbor仓库是否能正常访问:
http://harbor.jx.shu.com
发现无法访问,然后登入到对应的harbor服务器上去查看harbor服务是否正常。
通过堡垒机去访问harbor服务器时发现无法登入上去,这时候就需要找硬件基础工程师进行处理了。
硬件工程师处理好harbor服务器无法登入的问题,之后登入到harbor服务器上去,并通过docker-compose命令查看服务运行状态,如下:
root@harbor:/home/service/harbor# docker-compose ps Name Command State Ports ----------------------------------------------------------------------------------------------------------------------------------- harbor-adminserver /harbor/start.sh Up harbor-core /harbor/start.sh Up harbor-db /entrypoint.sh postgres Up 5432/tcp harbor-jobservice /harbor/start.sh Up harbor-log /bin/sh -c /usr/local/bin/ ... Up 127.0.0.1:1514->10514/tcp harbor-portal nginx -g daemon off; Restarting nginx nginx -g daemon off; Up 0.0.0.0:443->443/tcp, 0.0.0.0:4443->4443/tcp, 0.0.0.0:80->80/tcp redis docker-entrypoint.sh redis ... Up 6379/tcp registry /entrypoint.sh /etc/regist ... Up 5000/tcp registryctl /harbor/start.sh Up
根据查询到情况,可以发现harbor-portal容器服务运行异常,然后查看harbor对应的yaml文件内容:
version: '2' services: log: image: goharbor/harbor-log:v1.7.1 container_name: harbor-log restart: always dns_search: . cap_drop: - ALL cap_add: - CHOWN - DAC_OVERRIDE - SETGID - SETUID volumes: - /var/log/harbor/:/var/log/docker/:z - ./common/config/log/:/etc/logrotate.d/:z ports: - 127.0.0.1:1514:10514 networks: - harbor registry: image: goharbor/registry-photon:v2.6.2-v1.7.1 container_name: registry restart: always cap_drop: - ALL cap_add: - CHOWN - SETGID - SETUID volumes: - /data/registry:/storage:z - ./common/config/registry/:/etc/registry/:z - ./common/config/custom-ca-bundle.crt:/harbor_cust_cert/custom-ca-bundle.crt:z networks: - harbor dns_search: . depends_on: - log logging: driver: "syslog" options: syslog-address: "tcp://127.0.0.1:1514" tag: "registry" registryctl: image: goharbor/harbor-registryctl:v1.7.1 container_name: registryctl env_file: - ./common/config/registryctl/env restart: always cap_drop: - ALL cap_add: - CHOWN - SETGID - SETUID volumes: - /data/registry:/storage:z - ./common/config/registry/:/etc/registry/:z - ./common/config/registryctl/config.yml:/etc/registryctl/config.yml:z networks: - harbor dns_search: . depends_on: - log logging: driver: "syslog" options: syslog-address: "tcp://127.0.0.1:1514" tag: "registryctl" postgresql: image: goharbor/harbor-db:v1.7.1 container_name: harbor-db restart: always cap_drop: - ALL cap_add: - CHOWN - DAC_OVERRIDE - SETGID - SETUID volumes: - /data/database:/var/lib/postgresql/data:z networks: - harbor dns_search: . env_file: - ./common/config/db/env depends_on: - log logging: driver: "syslog" options: syslog-address: "tcp://127.0.0.1:1514" tag: "postgresql" adminserver: image: goharbor/harbor-adminserver:v1.7.1 container_name: harbor-adminserver env_file: - ./common/config/adminserver/env restart: always cap_drop: - ALL cap_add: - CHOWN - SETGID - SETUID volumes: - /data/config/:/etc/adminserver/config/:z - /data/secretkey:/etc/adminserver/key:z - /data/:/data/:z networks: - harbor dns_search: . depends_on: - log logging: driver: "syslog" options: syslog-address: "tcp://127.0.0.1:1514" tag: "adminserver" core: image: goharbor/harbor-core:v1.7.1 container_name: harbor-core env_file: - ./common/config/core/env restart: always cap_drop: - ALL cap_add: - SETGID - SETUID volumes: - ./common/config/core/app.conf:/etc/core/app.conf:z - ./common/config/core/private_key.pem:/etc/core/private_key.pem:z - ./common/config/core/certificates/:/etc/core/certificates/:z - /data/secretkey:/etc/core/key:z - /data/ca_download/:/etc/core/ca/:z - /data/psc/:/etc/core/token/:z - /data/:/data/:z networks: - harbor dns_search: . depends_on: - log - adminserver - registry logging: driver: "syslog" options: syslog-address: "tcp://127.0.0.1:1514" tag: "core" portal: image: goharbor/harbor-portal:v1.7.1 container_name: harbor-portal restart: always cap_drop: - ALL cap_add: - CHOWN - SETGID - SETUID - NET_BIND_SERVICE networks: - harbor dns_search: . depends_on: - log - core logging: driver: "syslog" options: syslog-address: "tcp://127.0.0.1:1514" tag: "portal" jobservice: image: goharbor/harbor-jobservice:v1.7.1 container_name: harbor-jobservice env_file: - ./common/config/jobservice/env restart: always cap_drop: - ALL cap_add: - CHOWN - SETGID - SETUID volumes: - /data/job_logs:/var/log/jobs:z - ./common/config/jobservice/config.yml:/etc/jobservice/config.yml:z networks: - harbor dns_search: . depends_on: - redis - core - adminserver logging: driver: "syslog" options: syslog-address: "tcp://127.0.0.1:1514" tag: "jobservice" redis: image: goharbor/redis-photon:v1.7.1 container_name: redis restart: always cap_drop: - ALL cap_add: - CHOWN - SETGID - SETUID volumes: - /data/redis:/var/lib/redis networks: - harbor dns_search: . depends_on: - log logging: driver: "syslog" options: syslog-address: "tcp://127.0.0.1:1514" tag: "redis" proxy: image: goharbor/nginx-photon:v1.7.1 container_name: nginx restart: always cap_drop: - ALL cap_add: - CHOWN - SETGID - SETUID - NET_BIND_SERVICE volumes: - ./common/config/nginx:/etc/nginx:z networks: - harbor dns_search: . ports: - 80:80 - 443:443 - 4443:4443 depends_on: - postgresql - registry - core - portal - log logging: driver: "syslog" options: syslog-address: "tcp://127.0.0.1:1514" tag: "proxy" networks: harbor: external: false
发现harbor-portal容器服务的日志是存放在/var/log/harbor目录下的,需要到该目录找对应服务的日志内容:
root@harbor:/home/service/harbor# cd /var/log/harbor/ root@harbor:/var/log/harbor# ls adminserver.log dev-198-1b9b616909af44f90ae1566674c19032ec13c8da.log portal.log proxy.log registryctl.log core.log jobservice.log postgresql.log redis.log registry.log
root@harbor:/var/log/harbor# tail -100f portal.log
查看的日志报错如下:
........ Feb 18 14:43:15 localhost portal[97624]: 2024/02/18 06:43:15 [emerg] 1#0: mkdir() "/etc//nginx/client_body_temp" failed (13: Permission denied) Feb 18 14:43:15 localhost portal[97624]: nginx: [emerg] mkdir() "/etc//nginx/client_body_temp" failed (13: Permission denied) Feb 18 14:43:41 localhost portal[97624]: 2024/02/18 06:43:41 [emerg] 1#0: mkdir() "/etc//nginx/client_body_temp" failed (13: Permission denied) Feb 18 14:43:41 localhost portal[97624]: nginx: [emerg] mkdir() "/etc//nginx/client_body_temp" failed (13: Permission denied) Feb 18 14:44:32 172.18.0.1 portal[97624]: 2024/02/18 06:44:32 [emerg] 1#0: mkdir() "/etc//nginx/client_body_temp" failed (13: Permission denied) Feb 18 14:44:32 172.18.0.1 portal[97624]: nginx: [emerg] mkdir() "/etc//nginx/client_body_temp" failed (13: Permission denied) Feb 18 14:46:15 172.18.0.1 portal[97624]: 2024/02/18 06:46:15 [emerg] 1#0: mkdir() "/etc//nginx/client_body_temp" failed (13: Permission denied) Feb 18 14:46:15 172.18.0.1 portal[97624]: nginx: [emerg] mkdir() "/etc//nginx/client_body_temp" failed (13: Permission denied) Feb 18 14:49:40 localhost portal[97624]: 2024/02/18 06:49:40 [emerg] 1#0: mkdir() "/etc//nginx/client_body_temp" failed (13: Permission denied) Feb 18 14:49:40 localhost portal[97624]: nginx: [emerg] mkdir() "/etc//nginx/client_body_temp" failed (13: Permission denied)
但是该服务运行的不是nginx服务的,所以日志中报错的内容,提供不了排查方向。
首先想到harbor服务器有重启过,是否有可能因为docker的网络有问题导致的,需要重启一下docker服务,然后再去重启一下harbor服务的。
root@harbor:/home/service/harbor# docker-compose stop Stopping nginx ... done Stopping harbor-jobservice ... done Stopping harbor-portal ... done Stopping harbor-core ... done Stopping redis ... done Stopping registryctl ... done Stopping registry ... done Stopping harbor-db ... done Stopping harbor-adminserver ... done Stopping harbor-log ... done root@harbor:/home/service/harbor# systemctl restart docker root@harbor:/home/service/harbor# docker-compose start Starting log ... done Starting postgresql ... done Starting redis ... done Starting adminserver ... done Starting registry ... done Starting core ... done Starting jobservice ... done Starting portal ... done Starting proxy ... done Starting registryctl ... done root@harbor:/home/service/harbor# docker-compose ps Name Command State Ports ----------------------------------------------------------------------------------------------------------------------------------- harbor-adminserver /harbor/start.sh Up harbor-core /harbor/start.sh Up harbor-db /entrypoint.sh postgres Up 5432/tcp harbor-jobservice /harbor/start.sh Up harbor-log /bin/sh -c /usr/local/bin/ ... Up 127.0.0.1:1514->10514/tcp harbor-portal nginx -g daemon off; Restarting nginx nginx -g daemon off; Up 0.0.0.0:443->443/tcp, 0.0.0.0:4443->4443/tcp, 0.0.0.0:80->80/tcp redis docker-entrypoint.sh redis ... Up 6379/tcp registry /entrypoint.sh /etc/regist ... Up 5000/tcp registryctl /harbor/start.sh Up
然后发现还是不行,网络查找可以尝试通过install.sh脚本的访问处理,就尝试了一下:
root@harbor:/home/service/harbor# ./install.sh [Step 0]: checking installation environment ... Note: docker version: 17.03.2 Note: docker-compose version: 1.18.0 [Step 1]: preparing environment ... Clearing the configuration file: ./common/config/registryctl/env Clearing the configuration file: ./common/config/registryctl/config.yml Clearing the configuration file: ./common/config/db/env Clearing the configuration file: ./common/config/core/env Clearing the configuration file: ./common/config/core/app.conf Clearing the configuration file: ./common/config/core/private_key.pem Clearing the configuration file: ./common/config/log/logrotate.conf Clearing the configuration file: ./common/config/adminserver/env Clearing the configuration file: ./common/config/registry/config.yml Clearing the configuration file: ./common/config/registry/root.crt Clearing the configuration file: ./common/config/nginx/nginx.conf Clearing the configuration file: ./common/config/jobservice/env Clearing the configuration file: ./common/config/jobservice/config.yml loaded secret from file: /data/secretkey Generated configuration file: ./common/config/nginx/nginx.conf Generated configuration file: ./common/config/adminserver/env Generated configuration file: ./common/config/core/env Generated configuration file: ./common/config/registry/config.yml Generated configuration file: ./common/config/db/env Generated configuration file: ./common/config/jobservice/env Generated configuration file: ./common/config/jobservice/config.yml Generated configuration file: ./common/config/log/logrotate.conf Generated configuration file: ./common/config/registryctl/env Generated configuration file: ./common/config/core/app.conf Generated certificate, key file: ./common/config/core/private_key.pem, cert file: ./common/config/registry/root.crt The configuration files are ready, please use docker-compose to start the service. [Step 2]: checking existing instance of Harbor ... Note: stopping existing Harbor instance ... Stopping nginx ... done Stopping harbor-jobservice ... done Stopping harbor-portal ... done Stopping harbor-core ... done Stopping redis ... done Stopping registryctl ... done Stopping registry ... done Stopping harbor-db ... done Stopping harbor-adminserver ... done Stopping harbor-log ... done Removing nginx ... done Removing harbor-jobservice ... done Removing harbor-portal ... done Removing harbor-core ... done Removing redis ... done Removing registryctl ... done Creating harbor-log ... done Removing harbor-db ... done Removing harbor-adminserver ... done Removing harbor-log ... done Removing network harbor_harbor Creating harbor-db ... done Creating harbor-core ... done [Step 3]: starting Harbor ... Creating harbor-portal ... done Creating nginx ... done Creating redis ... Creating harbor-adminserver ... Creating registry ... Creating harbor-db ... Creating registryctl ... Creating harbor-core ... Creating harbor-portal ... Creating harbor-jobservice ... Creating nginx ... ✔ ----Harbor has been installed and started successfully.---- Now you should be able to visit the admin portal at http://harbor.jxwrd.gov.cn. For more details, please visit https://github.com/goharbor/harbor .
再次查看harbor服务状态:
root@harbor:/home/service/harbor# docker-compose ps Name Command State Ports ------------------------------------------------------------------------------------------------------------------------------ harbor-adminserver /harbor/start.sh Up harbor-core /harbor/start.sh Up harbor-db /entrypoint.sh postgres Up 5432/tcp harbor-jobservice /harbor/start.sh Up harbor-log /bin/sh -c /usr/local/bin/ ... Up 127.0.0.1:1514->10514/tcp harbor-portal nginx -g daemon off; Up 80/tcp nginx nginx -g daemon off; Up 0.0.0.0:443->443/tcp, 0.0.0.0:4443->4443/tcp, 0.0.0.0:80->80/tcp redis docker-entrypoint.sh redis ... Up 6379/tcp registry /entrypoint.sh /etc/regist ... Up 5000/tcp registryctl /harbor/start.sh Up
发现harbor服务恢复了,通过访问地址去访问是可以正常打开,并且正常获取到镜像文件的。
到此,该harbor访问异常的问题就处理好了,希望问题分析排查的过程对大家有帮助!