Remember to check both log files: ambari-server and ambari-agent.
For the fix itself, go straight to https://community.hortonworks.com/questions/121978/openssl-compatibility.html?childToView=138080#answer-138080
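While deploying the cluster I hit "Host checks were skipped on 4 hosts that failed to register."
Every progress bar ended in failed.
I viewed the server log with the command below and found no errors: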
cat /var/log/ambari-server/ambari-server.log
Only messages like the following appeared (excerpt taken from https://www.cnblogs.com/barneywill/p/10273687.html):
2019-01-15 12:03:45,452 INFO [ambari-client-thread-44] AmbariMetaInfo:1430 - Stack HDP-2.0 is not active, skipping VDF
2019-01-15 12:03:45,452 INFO [ambari-client-thread-44] AmbariMetaInfo:1430 - Stack HDP-2.0.6 is not active, skipping VDF
2019-01-15 12:03:45,452 INFO [ambari-client-thread-44] AmbariMetaInfo:1430 - Stack HDP-2.0.6.GlusterFS is not active, skipping VDF
2019-01-15 12:03:45,452 INFO [ambari-client-thread-44] AmbariMetaInfo:1430 - Stack HDP-2.1 is not active, skipping VDF
2019-01-15 12:03:45,452 INFO [ambari-client-thread-44] AmbariMetaInfo:1430 - Stack HDP-2.1.GlusterFS is not active, skipping VDF
2019-01-15 12:03:45,452 INFO [ambari-client-thread-44] AmbariMetaInfo:1430 - Stack HDP-2.2 is not active, skipping VDF
2019-01-15 12:03:45,453 INFO [ambari-client-thread-44] AmbariMetaInfo:1430 - Stack HDP-2.3 is not active, skipping VDF
2019-01-15 12:03:45,453 INFO [ambari-client-thread-44] AmbariMetaInfo:1430 - Stack HDP-2.3.ECS is not active, skipping VDF
2019-01-15 12:03:45,453 INFO [ambari-client-thread-44] AmbariMetaInfo:1428 - Stack HDP-2.3.GlusterFS is not valid, skipping VDF: The service 'OOZIE' in stack 'HDP:2.3.GlusterFS' extends a non-existent service: 'common-services/OOZIE/5.0.0.2.3'
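(My own log has since been overwritten.)
The log also contained the following line, which means the cluster had not been created: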
INFO [pool-18-thread-1] AmbariMetricSinkImpl:95 - No clusters configured.
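Reading the whole log with cat makes it easy to miss the few lines that matter. A minimal sketch of a filter that keeps only WARN/WARNING/ERROR lines is shown here; the function name show_log_problems and the 50-line default are my own choices for illustration, and the agent log path assumes the default install location.

```shell
#!/bin/sh
# Print only WARN/WARNING/ERROR lines from an Ambari log, with line numbers.
# Usage: show_log_problems FILE [MAX_LINES]
# (show_log_problems is a hypothetical helper, not part of Ambari.)
show_log_problems() {
    file=$1
    max=${2:-50}
    # The level appears either at the start of the line (agent log)
    # or after the timestamp (server log), followed by a space.
    grep -nE '(^| )(WARN|WARNING|ERROR) ' "$file" | tail -n "$max"
}

# Example calls (agent path assumes the default location):
# show_log_problems /var/log/ambari-server/ambari-server.log
# show_log_problems /var/log/ambari-agent/ambari-agent.log
```

Note that a message logged at INFO level is not matched even if its text mentions an error, so this is a first pass, not a replacement for reading the log.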
Following the blog above, I edited metainfo.xml, changing false to true. After restarting ambari-server, the wizard got stuck at the preparing stage.
/var/lib/ambari-server/resources/stacks/HDP/$version/metainfo.xml
cat /var/lib/ambari-server/resources/stacks/HDP/2.6/metainfo.xml
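<metainfo>
  <versions>
    <active>false</active>
  </versions>
  <extends>2.5</extends>
  <minJdk>1.7</minJdk>
  <maxJdk>1.8</maxJdk>
</metainfo>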
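To see at a glance which stack versions are switched on, something like the sketch below may help. It assumes the true/false flag in metainfo.xml is written as an <active> element on a single line without extra whitespace; the helper name list_stack_active is made up for this example.

```shell
#!/bin/sh
# list_stack_active STACK_ROOT
# For each STACK_ROOT/<version>/metainfo.xml, print "<version> active" or
# "<version> inactive". Assumes the flag appears literally as
# <active>true</active> (an assumption, adjust the pattern if yours differs).
list_stack_active() {
    root=$1
    for f in "$root"/*/metainfo.xml; do
        [ -f "$f" ] || continue
        version=$(basename "$(dirname "$f")")
        if grep -q '<active>true</active>' "$f"; then
            state=active
        else
            state=inactive
        fi
        echo "$version $state"
    done
}

# Example call with the stack directory used in this post:
# list_stack_active /var/lib/ambari-server/resources/stacks/HDP
```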
Checking the log again showed:
28 三月 2019 14:16:53,750 INFO [Thread-36] BSRunner:372 - Error executing bootstrap Cannot create /var/run/ambari-server/bootstrap/1
28 三月 2019 14:16:53,750 ERROR [Thread-36] BSRunner:441 - java.io.FileNotFoundException: /var/run/ambari-server/bootstrap/1/hdpmaster.done (没有那个文件或目录)
28 三月 2019 14:16:53,750 WARN [Thread-36] BSRunner:401 - File does not exist: /var/run/ambari-server/bootstrap/1/sshKey
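Following advice found online, I manually created the missing files and directories, but it did not help.
// by this point I had already created the first 4 directories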
28 三月 2019 14:16:53,750 INFO [Thread-36] BSRunner:372 - Error executing bootstrap Cannot create /var/run/ambari-server/bootstrap/5
28 三月 2019 14:16:53,750 ERROR [Thread-36] BSRunner:441 - java.io.FileNotFoundException: /var/run/ambari-server/bootstrap/5/hdpmaster.done (没有那个文件或目录)
28 三月 2019 14:16:53,750 WARN [Thread-36] BSRunner:401 - File does not exist: /var/run/ambari-server/bootstrap/5/sshKey
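In the end I simply reset ambari-server. Then, on reading https://www.oschina.net/question/2956577_2201031,
I realized I had never checked the ambari-agent log. Its log directory sits next to the server's, under the same parent folder; from inside it, run: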
cat ambari-agent.log
// which revealed this error
INFO 2019-03-28 15:46:41,820 NetUtil.py:70 - Connecting to https://hdpmaster:8440/ca
ERROR 2019-03-28 15:46:41,826 NetUtil.py:96 - EOF occurred in violation of protocol (_ssl.c:618)
ERROR 2019-03-28 15:46:41,826 NetUtil.py:97 - SSLError: Failed to connect. Please check openssl library versions.
Refer to: https://bugzilla.redhat.com/show_bug.cgi?id=1022468 for more details.
WARNING 2019-03-28 15:46:41,827 NetUtil.py:124 - Server at https://hdpmaster:8440 is not reachable, sleeping for 10 seconds...
According to https://community.hortonworks.com/questions/121978/openssl-compatibility.html?childToView=138080#answer-138080,
the cause is that the Python shipped with CentOS 7 is newer than 2.7.5, which triggers this SSL error.
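Before changing anything, it can be useful to see which TLS versions the server port from the agent log actually accepts. The sketch below uses openssl s_client for that; the function name probe_tls is hypothetical, and treating the exit status as "handshake ok / failed" is a rough signal only, since it cannot tell a refused connection from a rejected protocol.

```shell
#!/bin/sh
# probe_tls HOST PORT PROTO
# PROTO is an openssl s_client protocol flag without the dash,
# for example tls1, tls1_1 or tls1_2.
# (probe_tls is an illustrative helper, not an Ambari tool.)
probe_tls() {
    host=$1
    port=$2
    proto=$3
    # stdin from /dev/null makes s_client exit right after the handshake.
    if openssl s_client -connect "$host:$port" "-$proto" </dev/null >/dev/null 2>&1; then
        echo "$proto: handshake ok"
    else
        echo "$proto: handshake failed"
        return 1
    fi
}

# Example, using the host and port from the agent log:
# for p in tls1 tls1_1 tls1_2; do probe_tls hdpmaster 8440 "$p"; done
```

If only tls1_2 succeeds, that would be consistent with an agent that fails until it is forced to TLSv1.2.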
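Solutions: 1. Downgrade the system Python so that it is no newer than 2.7.5, or
2. On every node, use the commands below to add the following under [security] in ambari-agent.ini and under [https] in cert-verification.cfg: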
vi /etc/ambari-agent/conf/ambari-agent.ini
[security]
force_https_protocol=PROTOCOL_TLSv1_2
vi /etc/python/cert-verification.cfg
[https]
verify=disable
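Editing two files by hand with vi on every node gets tedious. A minimal sketch of doing the same edit non-interactively follows; ensure_ini_key is a helper written for this post (not an Ambari command), it assumes simple key=value INI files, and the key name is used as a plain regular expression, so it suits simple names like the two here.

```shell
#!/bin/sh
# ensure_ini_key FILE SECTION KEY VALUE
# Set KEY=VALUE inside [SECTION]; add the key or the whole section if missing.
# Running it twice leaves the file unchanged the second time.
ensure_ini_key() {
    file=$1; section=$2; key=$3; value=$4
    tmp=$(mktemp) || return 1
    [ -f "$file" ] || : > "$file"
    awk -v sec="[$section]" -v key="$key" -v val="$value" '
        # Section header: finish the previous section, then check the new one.
        /^[ \t]*\[/ {
            if (in_sec && !done) { print key "=" val; done = 1 }
            line = $0; gsub(/^[ \t]+|[ \t\r]+$/, "", line)
            in_sec = (line == sec)
            print; next
        }
        # Existing key inside the target section: replace it.
        in_sec && $0 ~ ("^[ \t]*" key "[ \t]*=") {
            if (!done) { print key "=" val; done = 1 }
            next
        }
        { print }
        END {
            if (!done) {
                if (!in_sec) print sec
                print key "=" val
            }
        }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Run as root on every node, then restart the agent:
# ensure_ini_key /etc/ambari-agent/conf/ambari-agent.ini security force_https_protocol PROTOCOL_TLSv1_2
# ensure_ini_key /etc/python/cert-verification.cfg https verify disable
# ambari-agent restart
```

Writing the result back with cat rather than mv keeps the original file's owner and permissions.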