CDH 5.14.2 Cluster Startup Failure Notes

1. CDH services fail to start:

Checking from the command line shows that the cloudera-scm-agent service is down.

[root@y3 log]# service cloudera-scm-agent status
cloudera-scm-agent dead but pid file exists

Solution:

1. Check the log for error messages:

[root@y3 cloudera-scm-agent]# vim /var/log/cloudera-scm-agent/cloudera-scm-agent.log

[02/Jul/2018 02:02:24 +0000] 13960 MainThread agent        INFO     SCM Agent Version: 5.14.2
[02/Jul/2018 02:02:24 +0000] 13960 MainThread agent        INFO     Agent Protocol Version: 4
[02/Jul/2018 02:02:24 +0000] 13960 MainThread agent        INFO     Using Host ID: 9f85e66e-98af-4c7a-8dde-9f08e3a2d90b
[...]
[02/Jul/2018 02:04:10 +0000] 14310 MainThread _cplogging   INFO     [02/Jul/2018:02:04:10] ENGINE Started monitor thread '_TimeoutMonitor'.
[02/Jul/2018 02:04:10 +0000] 14310 HTTPServer Thread-2 _cplogging  ERROR    [02/Jul/2018:02:04:10] ENGINE Error in HTTP server: shutting down
Traceback (most recent call last):
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/CherryPy-3.2.2-py2.6.egg/cherrypy/process/servers.py", line 187, in _start_http_thread
    self.httpserver.start()
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/CherryPy-3.2.2-py2.6.egg/cherrypy/wsgiserver/wsgiserver2.py", line 1825, in start
    raise socket.error(msg)
error: No socket could be created on ('unknown.servercentral.net', 9000) -- [Errno 99] Cannot assign requested address

[02/Jul/2018 02:04:10 +0000] 14310 HTTPServer Thread-2 _cplogging  INFO     [02/Jul/2018:02:04:10] ENGINE Bus STOPPING
[02/Jul/2018 02:04:10 +0000] 14310 HTTPServer Thread-2 _cplogging  INFO     [02/Jul/2018:02:04:10] ENGINE HTTP Server cherrypy._cpwsgi_server.CPWSGIServer(('unknown.servercentral.net', 9000)) already shut down
[02/Jul/2018 02:04:10 +0000] 14310 HTTPServer Thread-2 _cplogging  INFO     [02/Jul/2018:02:04:10] ENGINE Stopped thread '_TimeoutMonitor'.
[02/Jul/2018 02:04:10 +0000] 14310 HTTPServer Thread-2 _cplogging  INFO     [02/Jul/2018:02:04:10] ENGINE Bus STOPPED
[02/Jul/2018 02:04:10 +0000] 14310 HTTPServer Thread-2 _cplogging  INFO     [02/Jul/2018:02:04:10] ENGINE Bus EXITING
[02/Jul/2018 02:04:10 +0000] 14310 HTTPServer Thread-2 _cplogging  INFO     [02/Jul/2018:02:04:10] ENGINE Bus EXITED

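Cause: according to the log, the agent's HTTP server could not open its listening socket at startup. The hostname it tried to bind to, unknown.servercentral.net, does not resolve to an address on this machine ([Errno 99] Cannot assign requested address), which points to a hostname or /etc/hosts misconfiguration.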

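1. Check that the hostname in /etc/sysconfig/network is correct (vim /etc/sysconfig/network).

2. Check that the hostname and IP address in /etc/hosts match (vim /etc/hosts).

3. Kill the supervisord process: kill -9 $(pgrep -f supervisord)

4. Restart the cloudera-scm-agent service and check its status: service cloudera-scm-agent restart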
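Step 2 can also be checked from a script rather than by eye. The following is only a rough sketch: hosts_lookup is a hypothetical helper name (not part of CDH), and it assumes the usual hosts-file layout of one IP address followed by one or more names per line.

```shell
#!/usr/bin/env bash
# hosts_lookup NAME [FILE]
# Print the IP address that a hosts-format file maps to NAME.
# Returns non-zero when NAME has no entry. FILE defaults to /etc/hosts.
# (Hypothetical helper for illustration; not shipped with CDH.)
hosts_lookup() {
    awk -v name="$1" '
        { sub(/#.*/, "") }              # drop comments before matching
        {
            # field 1 is the IP; fields 2..NF are the hostname and aliases
            for (i = 2; i <= NF; i++)
                if ($i == name) { print $1; found = 1; exit }
        }
        END { exit found ? 0 : 1 }
    ' "${2:-/etc/hosts}"
}
```

For example, hosts_lookup "$(hostname -f)" should print an address that also appears in the output of ip addr on the same host. If it prints nothing, or an address that belongs to another machine, the agent will probably fail to bind as in the log above.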

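Note: if the "pid file exists" error is reported again after the steps above:

1. Go to the directory where the service writes its pid file:

cd /var/run

2. Find the cloudera-scm-agent.pid file and delete it:

rm -f /var/run/cloudera-scm-agent.pid

3. Start the cloudera-scm-agent service:

service cloudera-scm-agent start

4. If the problem persists, run the following two commands:

service cloudera-scm-agent hard_stop_confirmed

service cloudera-scm-agent hard_restart_confirmed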
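Deleting a pid file blindly is only safe when the agent really is dead. As a cautious variant of steps 1 and 2, the sketch below removes the file only if the process it names is gone. remove_stale_pidfile is a hypothetical name, and the function is assumed to run as root, since kill -0 also fails for live processes owned by another user.

```shell
#!/usr/bin/env bash
# remove_stale_pidfile FILE
# Delete FILE only when the PID stored in it no longer refers to a live process.
# Returns 1 (and keeps the file) if the process is still running.
# (Hypothetical helper for illustration; run as root.)
remove_stale_pidfile() {
    pidfile="$1"
    [ -f "$pidfile" ] || return 0          # no pid file, nothing to do
    pid=$(cat "$pidfile")
    # kill -0 sends no signal; it only tests whether the process exists
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        echo "process $pid is still running; keeping $pidfile" >&2
        return 1
    fi
    rm -f "$pidfile"
}
```

A possible use is remove_stale_pidfile /var/run/cloudera-scm-agent.pid && service cloudera-scm-agent start, which starts the agent only after the stale file has been cleared.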


First check whether the embedded database is running with service cloudera-scm-server-db status.

If it is not running, start it with service cloudera-scm-server-db start.




2. The cloudera-scm-server service fails to start:

[root@y3 run]# service cloudera-scm-server status
cloudera-scm-server dead but pid file exists
[root@y3 run]#

Solution:

1. Check the log file:

[root@y3 cloudera-scm-server]# vim /var/log/cloudera-scm-server/cloudera-scm-server.out

log4j:ERROR setFile(null,true) call failed.
java.io.FileNotFoundException: /var/log/cloudera-scm-server/cloudera-scm-server.log (Permission denied)
       at java.io.FileOutputStream.open(Native Method)
       at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
       at java.io.FileOutputStream.<init>(FileOutputStream.java:142)
       at org.apache.log4j.FileAppender.setFile(FileAppender.java:294)
       at org.apache.log4j.RollingFileAppender.setFile(RollingFileAppender.java:207)
       at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:165)
       at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:307)
       at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:172)
       at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:104)
       at org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:809)
       at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:735)
       at org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:615)
       at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:502)
       at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:547)
       at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:483)
       at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)
       at org.slf4j.impl.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:73)
       at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:242)
       at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:254)
       at com.cloudera.server.cmf.Main.<clinit>(Main.java:158)
log4j:ERROR setFile(null,true) call failed.
java.io.FileNotFoundException: /var/log/cloudera-scm-server/cmf-server-perf.log (Permission denied)
       at java.io.FileOutputStream.open(Native Method)
       at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
       at java.io.FileOutputStream.<init>(FileOutputStream.java:142)
       at org.apache.log4j.FileAppender.setFile(FileAppender.java:294)
       at org.apache.log4j.RollingFileAppender.setFile(RollingFileAppender.java:207)
       at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:165)
       at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:307)
       at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:172)
       at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:104)
       at org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:809)
       at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:735)
       at org.apache.log4j.PropertyConfigurator.parseCatsAndRenderers(PropertyConfigurator.java:639)
       at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:504)
       at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:547)
       at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:483)
       at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)
       at org.slf4j.impl.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:73)
       at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:242)
       at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:254)
       at com.cloudera.server.cmf.Main.<clinit>(Main.java:158)

The log shows that the server fails at startup because it lacks permission to create its log files.

1. Check the permissions under the log directory /var/log:

[root@y3 log]# ll
total 72
drwxr-xr-x 2 root         root          4096 Jul  2 02:17 cloudera-scm-agent
drwxrwxr-x 3 cloudera-scm cloudera-scm  4096 Jul  2 02:17 cloudera-scm-alertpublisher
drwxrwxr-x 3 cloudera-scm cloudera-scm  4096 Jul  2 02:17 cloudera-scm-eventserver
drwxrwxr-x 3 cloudera-scm cloudera-scm  4096 Jul  2 02:17 cloudera-scm-firehose
drwxrwxr-x 3 cloudera-scm cloudera-scm  4096 Jul  2 02:17 cloudera-scm-headlamp
drwxr-xr-x 2 cloudera-scm cloudera-scm  4096 Jul  2 02:48 cloudera-scm-server
drwxrwxr-x 4 hdfs         hadoop        4096 Jul  2 02:17 hadoop-hdfs
drwxrwxr-x 3 mapred       hadoop        4096 Jul  2 02:17 hadoop-mapreduce
drwxrwxr-x 3 yarn         hadoop        4096 Jul  2 02:17 hadoop-yarn
drwxr-xr-x 4 hbase        hbase         4096 Jul  2 02:17 hbase
drwxr-xr-x 8 hive         hive          4096 Jul  2 02:18 hive
drwxr-xr-x 2 root         root          4096 Jul  1 23:31 httpd
-rw------- 1 root         root         11807 Jul  2 02:50 maillog
drwxr-xr-x 4 solr         solr          4096 Jul  2 02:17 solr
-rw------- 1 root         root           994 Jul  2 02:33 yum.log
drwxr-xr-x 3 zookeeper    zookeeper     4096 Jul  2 02:17 zookeeper
[root@y3 log]#

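2. The directories whose names start with cloudera should belong to the cloudera-scm user and group (in the listing above, only cloudera-scm-agent is owned by root). If the cloudera-scm-server directory or the files inside it do not, change the ownership manually:

chown cloudera-scm:cloudera-scm cloudera-scm-server

chown cloudera-scm:cloudera-scm cloudera-scm-server/*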
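Before and after running chown, it can help to list exactly which entries have the wrong owner, since the directory itself may look fine while individual log files inside it do not. A minimal sketch, where list_wrong_owner is a hypothetical helper built on the standard find ! -user test:

```shell
#!/usr/bin/env bash
# list_wrong_owner DIR USER
# Print every file and directory under DIR that is NOT owned by USER
# (USER may be a user name or a numeric uid).
# Empty output means ownership is already consistent.
# (Hypothetical helper for illustration.)
list_wrong_owner() {
    find "$1" ! -user "$2" -print
}
```

Running list_wrong_owner /var/log/cloudera-scm-server cloudera-scm as root should print nothing once the ownership has been fixed.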

3. Create the log file.

Command reference: https://blog.csdn.net/u013262689/article/details/69481121

  su -s /bin/bash cloudera-scm -c "touch /var/log/cloudera-scm-server/db.log; /usr/share/cmf/bin/initialize_embedded_db.sh /var/lib/cloudera-scm-server-db/data /var/log/cloudera-scm-server/db.log"

  su -s /bin/bash cloudera-scm -c "pg_ctl start -w -D /var/lib/cloudera-scm-server-db/data -l /var/log/cloudera-scm-server/db.log"

 

4. Start the Cloudera services:

[root@y3 log]# service cloudera-scm-agent status
cloudera-scm-agent (pid  14528) is running...
[root@y3 log]# service cloudera-scm-server-db start
Database is already running. Please stop it first., giving up
[root@y3 log]#
[root@y3 log]# service cloudera-scm-server-db status
pg_ctl: server is running (PID: 26780)
/usr/bin/postgres "-D" "/var/lib/cloudera-scm-server-db/data"
[root@y3 log]#
[root@y3 log]# service cloudera-scm-server start
Starting cloudera-scm-server:                              [  OK  ]
[root@y3 log]#
[root@y3 log]# service cloudera-scm-server status
cloudera-scm-server (pid  27564) is running...
[root@y3 log]#



3. PostgreSQL must be initialized before its first start:

[root@y3 cloudera-scm-server]# service postgresql restart
Stopping postgresql service:                               [  OK  ]

/var/lib/pgsql/data is missing. Use "service postgresql initdb" to initialize the cluster first.
                                                           [FAILED]
[root@y3 cloudera-scm-server]# service postgresql initdb
Initializing database:                                     [  OK  ]
[root@y3 cloudera-scm-server]# service postgresql restart
Stopping postgresql service:                               [  OK  ]
Starting postgresql service:                               [  OK  ]
[root@y3 cloudera-scm-server]#
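The "is missing" message in the transcript above means the data directory has never been initialized. A rough way to test for that in a script is sketched below. It relies on the fact that initdb writes a PG_VERSION file at the top of the data directory. The name pg_data_initialized is hypothetical.

```shell
#!/usr/bin/env bash
# pg_data_initialized DIR
# Succeeds when DIR looks like an initialized PostgreSQL data directory,
# i.e. it contains the PG_VERSION marker file written by initdb.
# (Hypothetical helper for illustration.)
pg_data_initialized() {
    [ -f "$1/PG_VERSION" ]
}
```

It could be used as pg_data_initialized /var/lib/pgsql/data || service postgresql initdb, so that initdb runs only when the directory is still empty or absent.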




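4. Hadoop exception:

.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error

According to the references I found, the following configuration needs to be added to Hadoop's core-site.xml:

<property>
    <name>fs.hdfs.impl</name>
    <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
    <description>The FileSystem for hdfs: uris.</description>
</property>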



