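Operating system: Linux 14.04
Hadoop: Hadoop 2.2.0
Problem description: I recently ran a data-mining job on MRv2 (YARN). The job completed successfully, but the client then reported that the JobHistory server refused the connection, as shown below: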
2015-01-04 18:32:17INFO:[main] - map 100% reduce 82%
2015-01-04 18:32:21INFO:[main] - map 100% reduce 100%
2015-01-04 18:32:25INFO:[main] - Job job_1420333234620_0006 completed successfully
2015-01-04 18:32:37INFO:[main] - Counters: 48
File System Counters
FILE: Number of bytes read=151965690
FILE: Number of bytes written=328940451
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=27573775513
HDFS: Number of bytes written=297551633
HDFS: Number of read operations=648
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Killed map tasks=6
Launched map tasks=221
Launched reduce tasks=1
Data-local map tasks=112
Rack-local map tasks=109
Total time spent by all maps in occupied slots (ms)=49141351
Total time spent by all reduces in occupied slots (ms)=3734678
Map-Reduce Framework
Map input records=972799356
Map output records=5386844
Map output bytes=297551633
Map output materialized bytes=159468566
Input split bytes=24510
Combine input records=5386844
Combine output records=5386844
Reduce input groups=5379074
Reduce shuffle bytes=159468566
Reduce input records=5386844
Reduce output records=5386844
Spilled Records=10773688
Shuffled Maps =215
Failed Shuffles=0
Merged Map outputs=215
GC time elapsed (ms)=3006431
CPU time spent (ms)=6609050
Physical memory (bytes) snapshot=35531751424
Virtual memory (bytes) snapshot=83810611200
Total committed heap usage (bytes)=33706455040
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
com.PickPoint$Statics
EVENT_IS_ZAIKE=5386844
EVENT_STATUS_IS_ZAIKE=5386844
STATUS_IS_ZAIKE=309698250
File Input Format Counters
Bytes Read=27573751003
File Output Format Counters
Bytes Written=297551633
2015-01-04 18:32:39INFO:[main] - Retrying connect to server: slave1/192.168.1.101:36290. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1 SECONDS)
2015-01-04 18:32:40INFO:[main] - Retrying connect to server: slave1/192.168.1.101:36290. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1 SECONDS)
2015-01-04 18:32:41INFO:[main] - Retrying connect to server: slave1/192.168.1.101:36290. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1 SECONDS)
2015-01-04 18:32:41INFO:[main] - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2015-01-04 18:32:42INFO:[main] - Retrying connect to server: master/192.168.1.100:10020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-01-04 18:32:43INFO:[main] - Retrying connect to server: master/192.168.1.100:10020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-01-04 18:32:44INFO:[main] - Retrying connect to server: master/192.168.1.100:10020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-01-04 18:32:45INFO:[main] - Retrying connect to server: master/192.168.1.100:10020. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-01-04 18:32:46INFO:[main] - Retrying connect to server: master/192.168.1.100:10020. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-01-04 18:32:47INFO:[main] - Retrying connect to server: master/192.168.1.100:10020. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-01-04 18:32:48INFO:[main] - Retrying connect to server: master/192.168.1.100:10020. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-01-04 18:32:49INFO:[main] - Retrying connect to server: master/192.168.1.100:10020. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-01-04 18:32:50INFO:[main] - Retrying connect to server: master/192.168.1.100:10020. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-01-04 18:32:51INFO:[main] - Retrying connect to server: master/192.168.1.100:10020. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-01-04 18:32:51INFO:[main] - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2015-01-04 18:32:52INFO:[main] - Retrying connect to server: master/192.168.1.100:10020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-01-04 18:32:53INFO:[main] - Retrying connect to server: master/192.168.1.100:10020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-01-04 18:32:54INFO:[main] - Retrying connect to server: master/192.168.1.100:10020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-01-04 18:32:55INFO:[main] - Retrying connect to server: master/192.168.1.100:10020. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-01-04 18:32:56INFO:[main] - Retrying connect to server: master/192.168.1.100:10020. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-01-04 18:32:57INFO:[main] - Retrying connect to server: master/192.168.1.100:10020. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-01-04 18:32:58INFO:[main] - Retrying connect to server: master/192.168.1.100:10020. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-01-04 18:32:59INFO:[main] - Retrying connect to server: master/192.168.1.100:10020. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-01-04 18:33:00INFO:[main] - Retrying connect to server: master/192.168.1.100:10020. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-01-04 18:33:01INFO:[main] - Retrying connect to server: master/192.168.1.100:10020. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-01-04 18:33:01INFO:[main] - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2015-01-04 18:33:02INFO:[main] - Retrying connect to server: master/192.168.1.100:10020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-01-04 18:33:03INFO:[main] - Retrying connect to server: master/192.168.1.100:10020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-01-04 18:33:04INFO:[main] - Retrying connect to server: master/192.168.1.100:10020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-01-04 18:33:05INFO:[main] - Retrying connect to server: master/192.168.1.100:10020. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-01-04 18:33:06INFO:[main] - Retrying connect to server: master/192.168.1.100:10020. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-01-04 18:33:07INFO:[main] - Retrying connect to server: master/192.168.1.100:10020. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-01-04 18:33:08INFO:[main] - Retrying connect to server: master/192.168.1.100:10020. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-01-04 18:33:09INFO:[main] - Retrying connect to server: master/192.168.1.100:10020. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-01-04 18:33:10INFO:[main] - Retrying connect to server: master/192.168.1.100:10020. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-01-04 18:33:11INFO:[main] - Retrying connect to server: master/192.168.1.100:10020. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2015-01-04 18:33:12ERROR:[main] - PriviledgedActionException as:hadoop (auth:SIMPLE) cause:java.io.IOException: java.net.ConnectException: Call From master/192.168.1.100 to master:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
Exception in thread "main" java.io.IOException: java.net.ConnectException: Call From master/192.168.1.100 to master:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:331)
at org.apache.hadoop.mapred.ClientServiceDelegate.getJobStatus(ClientServiceDelegate.java:416)
at org.apache.hadoop.mapred.YARNRunner.getJobStatus(YARNRunner.java:522)
at org.apache.hadoop.mapreduce.Job$1.run(Job.java:314)
at org.apache.hadoop.mapreduce.Job$1.run(Job.java:311)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:311)
at org.apache.hadoop.mapreduce.Job.isSuccessful(Job.java:611)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1301)
at com.PickPoint.main(PickPoint.java:111)
Caused by: java.net.ConnectException: Call From master/192.168.1.100 to master:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730)
at org.apache.hadoop.ipc.Client.call(Client.java:1351)
at org.apache.hadoop.ipc.Client.call(Client.java:1300)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy10.getJobReport(Unknown Source)
at org.apache.hadoop.mapreduce.v2.api.impl.pb.client.MRClientProtocolPBClientImpl.getJobReport(MRClientProtocolPBClientImpl.java:133)
at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:317)
... 11 more
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:547)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:642)
at org.apache.hadoop.ipc.Client$Connection.access$2600(Client.java:314)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1399)
at org.apache.hadoop.ipc.Client.call(Client.java:1318)
... 19 more
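To see at a glance which endpoints the client kept retrying, the log can be summarized with a short shell function. This is only a minimal sketch: the function name refused_endpoints and the file name client.log are hypothetical, and it assumes the log lines have the "Retrying connect to server: host/ip:port. Already tried ..." form shown above.

```shell
#!/usr/bin/env bash
# Summarize "Retrying connect to server" lines from a client log.
# Reads the log on stdin; prints "<count> <host/ip:port>" per endpoint.
# refused_endpoints is a hypothetical helper name, not part of Hadoop.
refused_endpoints() {
  sed -n 's/.*Retrying connect to server: \([^ ]*\)\. Already tried.*/\1/p' \
    | sort | uniq -c | awk '{print $1, $2}'
}

# Example (client.log is an assumed file holding the output above):
#   refused_endpoints < client.log
```

For the output above, this reports 3 attempts against slave1/192.168.1.101:36290 and 30 against master/192.168.1.100:10020. The first is most likely the job's ApplicationMaster, which had already exited; the second is the JobHistory server address that the client was redirected to.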
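Hadoop ships with a history server that lets you review completed MapReduce jobs: how many map and reduce tasks each job used, and when it was submitted, started, and finished. The history server is not started by default, which is why the client's connection to port 10020 was refused. Start it with the following command: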
$ sbin/mr-jobhistory-daemon.sh start historyserver
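The addresses the history server listens on are set by the following properties in mapred-site.xml (name, then value):
mapreduce.jobhistory.address = master:10020
mapreduce.jobhistory.webapp.address = master:19888
Here, master is the hostname of my NameNode.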
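After starting the history server, it is worth confirming that it is actually listening before resubmitting the job. The following is a rough sketch, assuming bash and the coreutils timeout command are available; port_open and check_jobhistory are hypothetical helper names, and master and 10020 are the host and port from the configuration above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for checking the JobHistory server; not part of Hadoop.

# port_open HOST PORT: succeeds if a TCP connection can be opened within 3 s.
# Uses bash's /dev/tcp redirection, run under coreutils `timeout`.
port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# check_jobhistory HOST PORT: prints a one-line status and returns 0 or 1.
check_jobhistory() {
  if port_open "$1" "$2"; then
    echo "$1:$2 is accepting connections"
  else
    echo "$1:$2 refused or unreachable; is the history server running?"
    return 1
  fi
}

# Example, run from the client machine:
#   check_jobhistory master 10020
```

Running jps on master should also list a JobHistoryServer process once the daemon is up.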