Fixing Frequent GitLab Crashes on CentOS Caused by Prometheus Failing to Start

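Background

We run GitLab on an internal server, but found that after running for a while it became unreachable. Restarting it made it accessible again.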

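Root cause

Python had been upgraded on the server earlier, and as a result the Prometheus component failed to install correctly when GitLab was installed.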

Solution

Check the GitLab status:

gitlab-ctl status

The output shows that prometheus failed to start:

run: gitaly: (pid 1508) 19980339s; run: log: (pid 1500) 19980339s
run: gitlab-monitor: (pid 1505) 19980339s; run: log: (pid 1496) 19980339s
run: gitlab-workhorse: (pid 1513) 19980339s; run: log: (pid 1510) 19980339s
run: logrotate: (pid 15567) 1913s; run: log: (pid 1502) 19980339s
run: nginx: (pid 1509) 19980339s; run: log: (pid 1498) 19980339s
run: node-exporter: (pid 1504) 19980339s; run: log: (pid 1494) 19980339s
run: postgres-exporter: (pid 1506) 19980339s; run: log: (pid 1497) 19980339s
run: postgresql: (pid 1532) 19980339s; run: log: (pid 1514) 19980339s
down: prometheus: 6s, normally up; run: log: (pid 17174) 1076s
run: sidekiq: (pid 1823) 19980328s; run: log: (pid 1499) 19980339s
run: unicorn: (pid 1512) 19980339s; run: log: (pid 1501) 19980339s
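
Spotting the single "down" line by eye is easy to miss in a long listing. The following is a minimal sketch, assuming the `gitlab-ctl status` output format shown above; `find_down_services` is a hypothetical helper written for this post, not part of GitLab.

```python
import re

# Matches the leading "<state>: <service>:" part of each gitlab-ctl status line.
STATUS_LINE = re.compile(r"^(run|down): ([\w-]+):")

def find_down_services(status_output):
    """Return the names of services whose state is reported as 'down'."""
    down = []
    for line in status_output.splitlines():
        match = STATUS_LINE.match(line.strip())
        if match and match.group(1) == "down":
            down.append(match.group(2))
    return down

# Three lines copied from the status output above.
sample = """\
run: nginx: (pid 1509) 19980339s; run: log: (pid 1498) 19980339s
down: prometheus: 6s, normally up; run: log: (pid 17174) 1076s
run: sidekiq: (pid 1823) 19980328s; run: log: (pid 1499) 19980339s
"""
print(find_down_services(sample))  # prints ['prometheus']
```

On a real server the text passed in would be the captured output of `gitlab-ctl status` rather than the hard-coded sample.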

Check the Prometheus log:

tail -f -n 100 /var/log/gitlab/prometheus/current

The log reports a problem with LevelDB and says the database needs to be repaired:

2018-09-26_13:23:42.79129 time="2018-09-26T21:23:42+08:00" level=info msg="Listening on localhost:11002" source="web.go:341"
2018-09-26_13:23:42.79331 time="2018-09-26T21:23:42+08:00" level=error msg="Could not open the fingerprint-to-metric index for archived series. Please try a 3rd party tool to repair LevelDB in directory "/var/opt/gitlab/prometheus/data/archived_fingerprint_to_metric". If unsuccessful or undesired, delete the whole directory and restart Prometheus for crash recovery. You will lose all archived time series." source="persistence.go:213"
2018-09-26_13:23:42.79333 time="2018-09-26T21:23:42+08:00" level=error msg="Error opening memory series storage: leveldb: manifest corrupted (field 'comparer'): missing [file=MANIFEST-000785]" source="main.go:192"
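
To pull only the error entries out of a longer log, something like the following could be used. It is a sketch that assumes every line follows the `level=... msg="..." source="..."` layout seen above; `extract_errors` is a hypothetical helper, and other Prometheus versions may log in a different format.

```python
import re

# Assumed layout: <timestamp> time="..." level=<level> msg="<text>" source="<file:line>"
# The greedy (.*) lets msg contain embedded double quotes, as in the LevelDB error above.
LOG_LINE = re.compile(r'level=(\w+) msg="(.*)" source="([^"]*)"\s*$')

def extract_errors(log_text):
    """Return (source, message) pairs for every level=error line."""
    errors = []
    for line in log_text.splitlines():
        match = LOG_LINE.search(line)
        if match and match.group(1) == "error":
            errors.append((match.group(3), match.group(2)))
    return errors

# Two lines copied from the log above.
sample = '''\
2018-09-26_13:23:42.79129 time="2018-09-26T21:23:42+08:00" level=info msg="Listening on localhost:11002" source="web.go:341"
2018-09-26_13:23:42.79333 time="2018-09-26T21:23:42+08:00" level=error msg="Error opening memory series storage: leveldb: manifest corrupted (field 'comparer'): missing [file=MANIFEST-000785]" source="main.go:192"
'''
for source, message in extract_errors(sample):
    print(source, "->", message)
```

For the sample, only the `main.go:192` entry is reported, since the `web.go:341` line is logged at info level.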

According to material found online, the following command repairs the database:

sudo -u gitlab-prometheus python -c "import leveldb; leveldb.RepairDB('/var/opt/gitlab/prometheus/data/archived_fingerprint_to_metric')"

Running it, however, produces an error:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: No module named leveldb

The leveldb module has to be installed first:

pip install leveldb

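Then run the repair command again... At this point pip itself turned out to be failing as well.

If you see pip-related errors, they are most likely caused by the earlier Python upgrade.

In that case you need to reinstall setuptools and pip. See this guide: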

https://blog.csdn.net/uisoul/article/details/90216021

Once setuptools and pip have been reinstalled, run the install and repair commands again and the problem is resolved:

pip install leveldb
sudo -u gitlab-prometheus python -c "import leveldb; leveldb.RepairDB('/var/opt/gitlab/prometheus/data/archived_fingerprint_to_metric')"
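
Since the log warns that a failed repair may end with deleting the directory, it can be worth copying it aside before running the repair. The backup step is an extra precaution that was not part of the original fix. The sketch below wraps the same `leveldb.RepairDB` call used above; `repair_with_backup` and the `.bak-<timestamp>` naming are hypothetical, and it assumes the user running it can write next to the data directory.

```python
import shutil
import time

DB_DIR = "/var/opt/gitlab/prometheus/data/archived_fingerprint_to_metric"

def repair_with_backup(db_dir, repair=None):
    """Copy db_dir aside, then run the repair function on it; return the backup path."""
    if repair is None:
        # Imported lazily so the function can be exercised without the module installed.
        import leveldb
        repair = leveldb.RepairDB
    # Hypothetical naming scheme: original path plus ".bak-<unix timestamp>".
    backup_dir = "%s.bak-%d" % (db_dir.rstrip("/"), int(time.time()))
    shutil.copytree(db_dir, backup_dir)  # raises if backup_dir already exists
    repair(db_dir)
    return backup_dir

# Intended use, run as the gitlab-prometheus user:
#   repair_with_backup(DB_DIR)
```

Passing a different callable as `repair` makes it possible to try the backup logic on a scratch directory before touching the real data.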
