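Recently, restarting the App Timeline Server service on the ResourceManager host through Ambari failed: the service would not start, and the Ambari UI reported the following error: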
4 - File['/var/run/hadoop-yarn/yarn/yarn-yarn-timelineserver.pid'] {'action': ['delete'], 'not_if': 'ls /var/run/hadoop-yarn/yarn/yarn-yarn-timelineserver.pid >/dev/null 2>&1 && ps `cat /var/run/hadoop-yarn/yarn/yarn-yarn-timelineserver.pid` >/dev/null 2>&1'}
2016-04-19 19:02:42,747 - Execute['ulimit -c unlimited; export HADOOP_LIBEXEC_DIR=/usr/lib/hadoop/libexec && /usr/lib/hadoop-yarn/sbin/yarn-daemon.sh --config /etc/hadoop/conf start timelineserver'] {'not_if': 'ls /var/run/hadoop-yarn/yarn/yarn-yarn-timelineserver.pid >/dev/null 2>&1 && ps `cat /var/run/hadoop-yarn/yarn/yarn-yarn-timelineserver.pid` >/dev/null 2>&1', 'user': 'yarn'}
2016-04-19 19:02:43,833 - Execute['ls /var/run/hadoop-yarn/yarn/yarn-yarn-timelineserver.pid >/dev/null 2>&1 && ps `cat /var/run/hadoop-yarn/yarn/yarn-yarn-timelineserver.pid` >/dev/null 2>&1'] {'initial_wait': 5, 'not_if': 'ls /var/run/hadoop-yarn/yarn/yarn-yarn-timelineserver.pid >/dev/null 2>&1 && ps `cat /var/run/hadoop-yarn/yarn/yarn-yarn-timelineserver.pid` >/dev/null 2>&1', 'user': 'yarn'}
2016-04-19 19:02:48,936 - Error while executing command 'start':
Traceback (most recent call last):
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 123, in execute
method(env)
File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/YARN/package/scripts/application_timeline_server.py", line 42, in start
service('timelineserver', action='start')
File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/YARN/package/scripts/service.py", line 59, in service
initial_wait=5
File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 148, in __init__
self.env.run()
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 149, in run
self.run_action(resource, action)
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 115, in run_action
provider_action()
File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 241, in action_run
raise ex
Fail: Execution of 'ls /var/run/hadoop-yarn/yarn/yarn-yarn-timelineserver.pid >/dev/null 2>&1 && ps `cat /var/run/hadoop-yarn/yarn/yarn-yarn-timelineserver.pid` >/dev/null 2>&1' returned 1.
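The command that fails at the end of the traceback is only a liveness check: it succeeds when the pid file exists and the pid in it belongs to a running process. A return code of 1 therefore suggests the daemon exited shortly after being started, rather than a problem with the pid file itself. A minimal Python sketch of the same check follows; the function name `pid_file_alive` is made up for illustration and is not part of Ambari.

```python
import os


def pid_file_alive(pid_file):
    """Rough equivalent of: ls PIDFILE && ps `cat PIDFILE` (hypothetical helper)."""
    try:
        with open(pid_file) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        # Missing, unreadable, or malformed pid file: treat as "not running".
        return False
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 sends nothing, it only checks the pid
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True  # process exists but belongs to another user
    return True
```

Calling `pid_file_alive('/var/run/hadoop-yarn/yarn/yarn-yarn-timelineserver.pid')` a few seconds after a start attempt gives roughly the same answer Ambari's check does.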
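I tried deleting the /var/run/hadoop-yarn/yarn/yarn-yarn-timelineserver.pid file and starting the service again, with no luck.
Checking the log on the server turned up the following:
vi /var/log/hadoop-yarn/yarn/yarn-yarn-timelineserver.log (adjust to your own file name)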
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
at org.apache.hadoop.fs.FileSystem.getLocal(FileSystem.java:339)
at org.apache.hadoop.yarn.server.timeline.LeveldbTimelineStore.serviceInit(LeveldbTimelineStore.java:183)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.serviceInit(ApplicationHistoryServer.java:88)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.launchAppHistoryServer(ApplicationHistoryServer.java:145)
at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.main(ApplicationHistoryServer.java:155)
Caused by: java.io.FileNotFoundException: /etc/hadoop/conf.empty/hdfs-site.xml (Permission denied)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:146)
at java.io.FileInputStream.<init>(FileInputStream.java:101)
at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:90)
at sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java:188)
at java.net.URL.openStream(URL.java:1037)
at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2227)
at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2298)
... 19 more
2016-04-19 19:38:41,373 INFO util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status -1
2016-04-19 19:38:41,375 INFO applicationhistoryservice.ApplicationHistoryServer (StringUtils.java:run(640)) - SHUTDOWN_MSG:
What? Permission denied on hdfs-site.xml?
Check the permissions with ll hdfs-site.xml:
-rw------- 1 root root 7368 Apr 12 16:06 hdfs-site.xml
Why are the permissions, the owner, and the group all wrong?
Normally it should look like this:
-rw-r--r-- 1 hdfs hadoop 7368 Apr 12 16:06 hdfs-site.xml
After fixing the permissions and ownership and starting the service again, it came up normally.
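The check and the fix can be sketched in Python as below. This is only an illustration: the helper names are invented, the target owner `hdfs:hadoop` and mode `0644` are taken from the expected listing above, and changing ownership has to be done as root. The timeline server runs as user yarn (see the Ambari log), which cannot read a file that is `-rw-------` and owned by root.

```python
import grp
import os
import pwd
import shutil
import stat


def describe(path):
    """Return (mode string, owner, group), similar to the first columns of ll."""
    st = os.stat(path)
    try:
        owner = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        owner = str(st.st_uid)  # uid without a passwd entry
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)  # gid without a group entry
    return stat.filemode(st.st_mode), owner, group


def world_readable(path):
    """True if users other than the owner and group may read the file."""
    return bool(os.stat(path).st_mode & stat.S_IROTH)


def fix_config_file(path, user="hdfs", group="hadoop", mode=0o644):
    """Restore the expected owner and mode (assumed values; run as root)."""
    shutil.chown(path, user=user, group=group)
    os.chmod(path, mode)
```

For example, `describe('/etc/hadoop/conf.empty/hdfs-site.xml')` would show the broken state before the fix, and `fix_config_file(...)` on the same path corresponds to running chown and chmod by hand.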
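I asked a colleague and learned that the file had been modified earlier and was copied into place as the root user. For a Hadoop cluster managed by Ambari, it is best not to edit the configuration files directly on the hosts. Make all changes through Ambari to avoid unnecessary problems.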