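1. CDH services fail to start:
Checking from the command line shows that the cloudera-scm-agent service is dead.
[root@y3 log]# service cloudera-scm-agent status
cloudera-scm-agent dead but pid file exists
Solution:
1. Check the log for error messages: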
[root@y3 cloudera-scm-agent]# vim /var/log/cloudera-scm-agent/cloudera-scm-agent.log
[02/Jul/2018 02:02:24 +0000] 13960 MainThread agent INFO SCM Agent Version: 5.14.2
[02/Jul/2018 02:02:24 +0000] 13960 MainThread agent INFO Agent Protocol Version: 4
[02/Jul/2018 02:02:24 +0000] 13960 MainThread agent INFO Using Host ID: 9f85e66e-98af-4c7a-8dde-9f08e3a2d90b
[02/Jul/2018 02:04:10 +0000] 14310 MainThread _cplogging INFO [02/Jul/2018:02:04:10] ENGINE Started monitor thread '_TimeoutMonitor'.
[02/Jul/2018 02:04:10 +0000] 14310 HTTPServer Thread-2 _cplogging ERROR [02/Jul/2018:02:04:10] ENGINE Error in HTTP server: shutting down
Traceback (most recent call last):
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/CherryPy-3.2.2-py2.6.egg/cherrypy/process/servers.py", line 187, in _start_http_thread
    self.httpserver.start()
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/CherryPy-3.2.2-py2.6.egg/cherrypy/wsgiserver/wsgiserver2.py", line 1825, in start
    raise socket.error(msg)
error: No socket could be created on ('unknown.servercentral.net', 9000) -- [Errno 99] Cannot assign requested address
[02/Jul/2018 02:04:10 +0000] 14310 HTTPServer Thread-2 _cplogging INFO [02/Jul/2018:02:04:10] ENGINE Bus STOPPING
[02/Jul/2018 02:04:10 +0000] 14310 HTTPServer Thread-2 _cplogging INFO [02/Jul/2018:02:04:10] ENGINE HTTP Server cherrypy._cpwsgi_server.CPWSGIServer(('unknown.servercentral.net', 9000)) already shut down
[02/Jul/2018 02:04:10 +0000] 14310 HTTPServer Thread-2 _cplogging INFO [02/Jul/2018:02:04:10] ENGINE Stopped thread '_TimeoutMonitor'.
[02/Jul/2018 02:04:10 +0000] 14310 HTTPServer Thread-2 _cplogging INFO [02/Jul/2018:02:04:10] ENGINE Bus STOPPED
[02/Jul/2018 02:04:10 +0000] 14310 HTTPServer Thread-2 _cplogging INFO [02/Jul/2018:02:04:10] ENGINE Bus EXITING
[02/Jul/2018 02:04:10 +0000] 14310 HTTPServer Thread-2 _cplogging INFO [02/Jul/2018:02:04:10] ENGINE Bus EXITED
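Cause: according to the log, the agent's HTTP server fails right after startup because it cannot bind to ('unknown.servercentral.net', 9000): [Errno 99] Cannot assign requested address. The hostname the agent is using does not map to an address on this machine, so check the hostname configuration first.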
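To avoid paging through the whole log in vim, the failing host and port can be pulled out directly. The following is a minimal sketch: the function name find_bind_failures is made up for illustration, and it assumes the error line has the CherryPy wording "No socket could be created on ('host', port)".

```shell
# Print each distinct "host port" pair the agent failed to bind to.
# Assumption: the error line uses the wording
# "No socket could be created on ('host', port)".
find_bind_failures() {
    local log_file=$1
    grep -F 'No socket could be created on' "$log_file" \
        | sed -E "s/.*\('([^']+)', *([0-9]+)\).*/\1 \2/" \
        | sort -u
}
```

Usage: find_bind_failures /var/log/cloudera-scm-agent/cloudera-scm-agent.log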
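1. Check that the hostname in /etc/sysconfig/network is correct: vim /etc/sysconfig/network
2. Check that the hostname and IP address entries in /etc/hosts match: vim /etc/hosts
3. Kill the supervisord process: kill -9 $(pgrep -f supervisord)
4. Restart the cloudera-scm-agent service and check its status: service cloudera-scm-agent restart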
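The /etc/hosts check in step 2 can also be scripted rather than done by eye. This is only a sketch: hosts_ip_for and hostname_is_local are hypothetical helper names, and the second function assumes the iproute2 "ip" command is installed.

```shell
# Print the first IP address that a hosts-format file maps to a name.
hosts_ip_for() {
    local name=$1 hosts_file=${2:-/etc/hosts}
    awk -v name="$name" '
        { sub(/#.*/, "") }    # strip comments before matching
        NF >= 2 {
            for (i = 2; i <= NF; i++)
                if ($i == name) { print $1; exit }
        }
    ' "$hosts_file"
}

# Succeed only when that IP is assigned to a local interface.
# Assumption: the iproute2 "ip" command is available.
hostname_is_local() {
    local addr
    addr=$(hosts_ip_for "$1" "${2:-/etc/hosts}")
    [ -n "$addr" ] && ip -o addr show | grep -qwF -- "$addr"
}
```

Usage: hostname_is_local "$(hostname)" || echo "hostname does not resolve to a local address"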
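Note: if the "pid file exists" error appears again after the steps above:
1. Go to the directory where the service writes its pid file:
cd /var/run
2. Find the cloudera-scm-agent.pid file and delete it:
rm -f /var/run/cloudera-scm-agent.pid
3. Start the cloudera-scm-agent service:
service cloudera-scm-agent start
4. If the problem persists after the steps above, run the following two commands:
service cloudera-scm-agent hard_stop_confirmed
service cloudera-scm-agent hard_restart_confirmed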
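Before deleting a pid file by hand, it may be worth confirming that the process it names is really gone. A minimal sketch follows; remove_stale_pidfile is a hypothetical helper, and it is meant to be run as root, as in the transcripts here, because kill -0 cannot probe other users' processes otherwise.

```shell
# Remove a pid file only when the process it names is no longer alive.
# Prints what it did; returns 1 if the process is still running.
# Assumption: run as root, otherwise kill -0 fails for other users' processes.
remove_stale_pidfile() {
    local pid_file=$1 pid
    [ -f "$pid_file" ] || { echo "no pid file"; return 0; }
    pid=$(tr -d '[:space:]' < "$pid_file")
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        echo "process $pid is still running"
        return 1
    fi
    rm -f "$pid_file"
    echo "removed stale pid file"
}
```

Usage: remove_stale_pidfile /var/run/cloudera-scm-agent.pid && service cloudera-scm-agent start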
First check whether cloudera-scm-server-db is running: service cloudera-scm-server-db status
If it is not running, start it: service cloudera-scm-server-db start
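The "check status, start if needed" pattern can be wrapped in a small function. This is a sketch under two assumptions: the init script's status action exits non-zero when the service is not running (the usual LSB convention), and SERVICE_CMD is a hypothetical override hook added here only so the logic can be dry-run without touching real services.

```shell
# SERVICE_CMD is a hypothetical hook for dry runs; it defaults to "service".
SERVICE_CMD=${SERVICE_CMD:-service}

# Start a service only if its status check reports it as not running.
# Assumption: "status" exits non-zero when the service is stopped or dead.
ensure_running() {
    local name=$1
    if "$SERVICE_CMD" "$name" status >/dev/null 2>&1; then
        echo "$name is already running"
    else
        echo "$name is not running, starting it"
        "$SERVICE_CMD" "$name" start
    fi
}
```

Usage: ensure_running cloudera-scm-server-db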
2. [root@y3 run]# service cloudera-scm-server status
cloudera-scm-server dead but pid file exists
[root@y3 run]#
Solution:
1. Check the log file:
[root@y3 cloudera-scm-server]# vim /var/log/cloudera-scm-server/cloudera-scm-server.out
log4j:ERROR setFile(null,true) call failed.
java.io.FileNotFoundException: /var/log/cloudera-scm-server/cloudera-scm-server.log (Permission denied)
    at java.io.FileOutputStream.open(Native Method)
    at java.io.FileOutputStream.
    at java.io.FileOutputStream.
    at org.apache.log4j.FileAppender.setFile(FileAppender.java:294)
    at org.apache.log4j.RollingFileAppender.setFile(RollingFileAppender.java:207)
    at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:165)
    at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:307)
    at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:172)
    at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:104)
    at org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:809)
    at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:735)
    at org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:615)
    at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:502)
    at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:547)
    at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:483)
    at org.apache.log4j.LogManager.
    at org.slf4j.impl.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:73)
    at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:242)
    at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:254)
    at com.cloudera.server.cmf.Main.
log4j:ERROR setFile(null,true) call failed.
java.io.FileNotFoundException: /var/log/cloudera-scm-server/cmf-server-perf.log (Permission denied)
    at java.io.FileOutputStream.open(Native Method)
    at java.io.FileOutputStream.
    at java.io.FileOutputStream.
    at org.apache.log4j.FileAppender.setFile(FileAppender.java:294)
    at org.apache.log4j.RollingFileAppender.setFile(RollingFileAppender.java:207)
    at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:165)
    at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:307)
    at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:172)
    at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:104)
    at org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:809)
    at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:735)
    at org.apache.log4j.PropertyConfigurator.parseCatsAndRenderers(PropertyConfigurator.java:639)
    at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:504)
    at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:547)
    at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:483)
    at org.apache.log4j.LogManager.
    at org.slf4j.impl.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:73)
    at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:242)
    at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:254)
    at com.cloudera.server.cmf.Main.
Analysis of the log shows that the failure is caused by insufficient permissions to create the log files when the service starts.
1. Check the permissions under the log directory /var/log:
[root@y3 log]# ll
total 72
drwxr-xr-x 2 root         root          4096 Jul  2 02:17 cloudera-scm-agent
drwxrwxr-x 3 cloudera-scm cloudera-scm  4096 Jul  2 02:17 cloudera-scm-alertpublisher
drwxrwxr-x 3 cloudera-scm cloudera-scm  4096 Jul  2 02:17 cloudera-scm-eventserver
drwxrwxr-x 3 cloudera-scm cloudera-scm  4096 Jul  2 02:17 cloudera-scm-firehose
drwxrwxr-x 3 cloudera-scm cloudera-scm  4096 Jul  2 02:17 cloudera-scm-headlamp
drwxr-xr-x 2 cloudera-scm cloudera-scm  4096 Jul  2 02:48 cloudera-scm-server
drwxrwxr-x 4 hdfs         hadoop        4096 Jul  2 02:17 hadoop-hdfs
drwxrwxr-x 3 mapred       hadoop        4096 Jul  2 02:17 hadoop-mapreduce
drwxrwxr-x 3 yarn         hadoop        4096 Jul  2 02:17 hadoop-yarn
drwxr-xr-x 4 hbase        hbase         4096 Jul  2 02:17 hbase
drwxr-xr-x 8 hive         hive          4096 Jul  2 02:18 hive
drwxr-xr-x 2 root         root          4096 Jul  1 23:31 httpd
-rw------- 1 root         root         11807 Jul  2 02:50 maillog
drwxr-xr-x 4 solr         solr          4096 Jul  2 02:17 solr
-rw------- 1 root         root           994 Jul  2 02:33 yum.log
drwxr-xr-x 3 zookeeper    zookeeper     4096 Jul  2 02:17 zookeeper
[root@y3 log]#
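2. Entries whose names start with cloudera should belong to the cloudera-scm user and group (cloudera-scm-agent, owned by root in the listing above, is the exception). If cloudera-scm-server does not, change its owner and group to cloudera-scm manually:
chown cloudera-scm:cloudera-scm cloudera-scm-server
chown cloudera-scm:cloudera-scm cloudera-scm-server/*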
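Instead of reading the ll output line by line, the ownership check can be automated. The sketch below uses a hypothetical helper named list_wrong_owner and assumes GNU stat (the -c option), as shipped on CentOS/RHEL.

```shell
# List entries directly under a directory that match a name pattern but are
# not owned by the expected user.
# Assumption: GNU stat, which supports "-c '%U'" to print the owner name.
list_wrong_owner() {
    local dir=$1 pattern=$2 expected=$3 entry owner
    for entry in "$dir"/$pattern; do
        [ -e "$entry" ] || continue    # pattern matched nothing
        owner=$(stat -c '%U' "$entry")
        [ "$owner" = "$expected" ] || echo "$entry (owner: $owner)"
    done
}
```

Usage: list_wrong_owner /var/log 'cloudera-scm-*' cloudera-scm. With the listing shown earlier, this would report cloudera-scm-agent, which is owned by root there.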
3. Create the log file.
Command reference: https://blog.csdn.net/u013262689/article/details/69481121
su -s /bin/bash cloudera-scm -c "touch /var/log/cloudera-scm-server/db.log; /usr/share/cmf/bin/initialize_embedded_db.sh /var/lib/cloudera-scm-server-db/data /var/log/cloudera-scm-server/db.log"
su -s /bin/bash cloudera-scm -c "pg_ctl start -w -D /var/lib/cloudera-scm-server-db/data -l /var/log/cloudera-scm-server/db.log"
4. Start the Cloudera services:
[root@y3 log]#
[root@y3 log]# service cloudera-scm-agent status
cloudera-scm-agent (pid 14528) is running...
[root@y3 log]# service cloudera-scm-server-db start
Database is already running. Please stop it first., giving up
[root@y3 log]#
[root@y3 log]# service cloudera-scm-server-db status
pg_ctl: server is running (PID: 26780)
/usr/bin/postgres "-D" "/var/lib/cloudera-scm-server-db/data"
[root@y3 log]#
[root@y3 log]# service cloudera-scm-server start
Starting cloudera-scm-server: [  OK  ]
[root@y3 log]#
[root@y3 log]# service cloudera-scm-server status
cloudera-scm-server (pid 27564) is running...
[root@y3 log]#
3. PostgreSQL must be initialized before its first start:
[root@y3 cloudera-scm-server]# service postgresql restart
Stopping postgresql service: [  OK  ]
/var/lib/pgsql/data is missing. Use "service postgresql initdb" to initialize the cluster first.
[FAILED]
[root@y3 cloudera-scm-server]# service postgresql initdb
Initializing database: [  OK  ]
[root@y3 cloudera-scm-server]# service postgresql restart
Stopping postgresql service: [  OK  ]
Starting postgresql service: [  OK  ]
[root@y3 cloudera-scm-server]#
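Whether initdb is still needed can be checked up front instead of waiting for the restart to fail. A minimal sketch: pg_needs_initdb is a hypothetical helper, and it relies on the fact that an initialized PostgreSQL data directory contains a PG_VERSION file. The default path is the one from the error message above.

```shell
# Succeed (exit 0) when the directory does not look like an initialized
# PostgreSQL cluster, i.e. it is missing or has no PG_VERSION file.
pg_needs_initdb() {
    local data_dir=${1:-/var/lib/pgsql/data}
    [ ! -f "$data_dir/PG_VERSION" ]
}
```

Usage: if pg_needs_initdb; then service postgresql initdb; fi; service postgresql restart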
4. Hadoop exception:
.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
According to the references consulted, the following configuration needs to be added to Hadoop's core-site.xml: