转这个目的,是因为该贴子中调优思路不错,值得学习
搜索推荐有一个job,1000多个map,200个reduce,运行到最后只剩一个reduce(10.39.6.130上)的时候,出现以下异常,导致job失败:
- 2014-12-04 15:49:04,297 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 12 segments left of total size: 11503294914 bytes
- 2014-12-04 15:49:04,314 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
- 2014-12-04 15:49:04,394 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor [.lzo_deflate]
- 2014-12-04 16:02:26,889 WARN [ResponseProcessor for block BP-1386326728-10.39.2.131-1382089338395:blk_1394149732_320469086] org.apache.hadoop.hdfs.DFSClient: DFSOutputStream ResponseProcessor exception for block BP-1386326728-10.39.2.131-1382089338395:blk_1394149732_320469086
- java.io.IOException: Bad response ERROR for block BP-1386326728-10.39.2.131-1382089338395:blk_1394149732_320469086 from datanode 10.39.5.193:50010
- at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:819)
- 2014-12-04 16:02:26,889 WARN [ResponseProcessor for block BP-1386326728-10.39.2.131-1382089338395:blk_1394153869_320473223] org.apache.hadoop.hdfs.DFSClient: DFSOutputStream ResponseProcessor exception for block BP-1386326728-10.39.2.131-1382089338395:blk_1394153869_320473223
- java.io.IOException: Bad response ERROR for block BP-1386326728-10.39.2.131-1382089338395:blk_1394153869_320473223 from datanode 10.39.1.90:50010
- at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:819)
- 2014-12-04 16:02:26,891 WARN [DataStreamer for file /dw_ext/recmd/mds6/mds_filter_relation_10/20141203/_temporary/1/_temporary/attempt_1415948652989_195149_r_000158_3/user-r-00158 block BP-1386326728-10.39.2.131-1382089338395:blk_1394149732_320469086] org.apache.hadoop.hdfs.DFSClient: Error Recovery for block BP-1386326728-10.39.2.131-1382089338395:blk_1394149732_320469086 in pipeline 10.39.6.130:50010, 10.39.5.185:50010, 10.39.5.193:50010: bad datanode 10.39.5.193:50010
- 2014-12-04 16:02:26,891 WARN [DataStreamer for file /dw_ext/recmd/mds6/mds_filter_relation_10/20141203/_temporary/1/_temporary/attempt_1415948652989_195149_r_000158_3/exposure-r-00158 block BP-1386326728-10.39.2.131-1382089338395:blk_1394153869_320473223] org.apache.hadoop.hdfs.DFSClient: Error Recovery for block BP-1386326728-10.39.2.131-1382089338395:blk_1394153869_320473223 in pipeline 10.39.6.130:50010, 10.39.1.89:50010, 10.39.1.90:50010: bad datanode 10.39.1.90:50010
- java.io.EOFException: Premature EOF: no length prefix available
- at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:1987)
- at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:176)
- at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:796)
- 2014-12-04 16:05:23,743 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.OutOfMemoryError: Java heap space
- at java.util.Arrays.copyOf(Arrays.java:2734)
- at java.util.Vector.ensureCapacityHelper(Vector.java:226)
- at java.util.Vector.add(Vector.java:728)
- at rec.CommonUtil.pack_Treeset(CommonUtil.java:395)
- at rec.ConvertExposure10$MyReducer.collect_exposure(ConvertExposure10.java:259)
- at rec.ConvertExposure10$MyReducer.reduce(ConvertExposure10.java:329)
- at rec.ConvertExposure10$MyReducer.reduce(ConvertExposure10.java:234)
- at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)
- at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
- at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
- at java.security.AccessController.doPrivileged(Native Method)
- at javax.security.auth.Subject.doAs(Subject.java:396)
- at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1550)
- at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
从异常上看,首先是reduce在往hdfs写数据时,发现建pipeline时,没有收到pipeline上最后一个节点的回应:
- 2014-12-04 16:02:26,889 WARN [ResponseProcessor for block BP-1386326728-10.39.2.131-1382089338395:blk_1394149732_320469086] org.apache.hadoop.hdfs.DFSClient: DFSOutputStream ResponseProcessor exception for block BP-1386326728-10.39.2.131-1382089338395:blk_1394149732_320469086
- java.io.IOException: Bad response ERROR for block BP-1386326728-10.39.2.131-1382089338395:blk_1394149732_320469086 from datanode 10.39.5.193:50010
- at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:819)
- 2014-12-04 16:02:26,889 WARN [ResponseProcessor for block BP-1386326728-10.39.2.131-1382089338395:blk_1394153869_320473223] org.apache.hadoop.hdfs.DFSClient: DFSOutputStream ResponseProcessor exception for block BP-1386326728-10.39.2.131-1382089338395:blk_1394153869_320473223
- java.io.IOException: Bad response ERROR for block BP-1386326728-10.39.2.131-1382089338395:blk_1394153869_320473223 from datanode 10.39.1.90:50010
- at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:819)
- 2014-12-04 16:02:26,891 WARN [DataStreamer for file /dw_ext/recmd/mds6/mds_filter_relation_10/20141203/_temporary/1/_temporary/attempt_1415948652989_195149_r_000158_3/user-r-00158 block BP-1386326728-10.39.2.131-1382089338395:blk_1394149732_320469086] org.apache.hadoop.hdfs.DFSClient: Error Recovery for block BP-1386326728-10.39.2.131-1382089338395:blk_1394149732_320469086 in pipeline 10.39.6.130:50010, 10.39.5.185:50010, 10.39.5.193:50010: bad datanode 10.39.5.193:50010
- 2014-12-04 16:02:26,891 WARN [DataStreamer for file /dw_ext/recmd/mds6/mds_filter_relation_10/20141203/_temporary/1/_temporary/attempt_1415948652989_195149_r_000158_3/exposure-r-00158 block BP-1386326728-10.39.2.131-1382089338395:blk_1394153869_320473223] org.apache.hadoop.hdfs.DFSClient: Error Recovery for block BP-1386326728-10.39.2.131-1382089338395:blk_1394153869_320473223 in pipeline 10.39.6.130:50010, 10.39.1.89:50010, 10.39.1.90:50010: bad datanode 10.39.1.90:50010
- java.io.EOFException: Premature EOF: no length prefix available
- at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:1987)
- at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:176)
- at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:796)
这里以写block:blk_1394149732_320469086为例,pipeline[10.39.6.130:50010, 10.39.5.185:50010, 10.39.5.193:50010]上面的最后一个DN是10.39.5.193,到10.39.5.193查看该block的日志信息:
2014-12-04 16:00:57,424 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception for BP-1386326728-10.39.2.131-1382089338395:blk_1394149732_320469086
- java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.39.5.193:50010 remote=/10.39.5.185:58225]
- at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
- at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
- at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
- at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
- at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
- at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
- at java.io.DataInputStream.read(DataInputStream.java:132)
- at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:192)
- at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
- at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
- at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
- at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:446)
- at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:702)
- at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:739)
- at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:124)
- at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
- at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:232)
- at java.lang.Thread.run(Thread.java:662)
- 2014-12-04 16:00:57,424 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1386326728-10.39.2.131-1382089338395:blk_1394149732_320469086, type=LAST_IN_PIPELINE, downstreams=0:[]: Thread is interrupted.
10.39.5.193上面日志显示,在读取pipeline上一个节点10.39.5.185的Packet时,一直读取不到,直到10分钟超时:
- java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.39.5.193:50010 remote=/10.39.5.185:58225]
那我们来看以下pipeline上第二个节点10.39.5.185,dn日志如下:
2014-12-04 16:00:57,988 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1386326728-10.39.2.131-1382089338395:blk_1394149732_320469086, type=HAS_DOWNSTREAM_IN_PIPELINE
- java.io.EOFException: Premature EOF: no length prefix available
- at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:1987)
- at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:176)
- at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:1083)
- at java.lang.Thread.run(Thread.java:662)
- 2014-12-04 16:00:58,008 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception for BP-1386326728-10.39.2.131-1382089338395:blk_1394149732_320469086
- java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.39.5.185:50010 remote=/10.39.6.130:59083]
- at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
- at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
- at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
- at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
- at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
- at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
- at java.io.DataInputStream.read(DataInputStream.java:132)
- at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:192)
- at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
- at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
- at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
- at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:446)
- at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:702)
- at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:739)
- at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:124)
- at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
- at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:232)
- at java.lang.Thread.run(Thread.java:662)
- 2014-12-04 16:00:58,009 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1386326728-10.39.2.131-1382089338395:blk_1394149732_320469086, type=HAS_DOWNSTREAM_IN_PIPELINE: Thread is interrupted.
和10.39.5.193日志类似,也是在等待读取pipeline的第一个节点10.39.6.130的Packet时超时:
- java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.39.5.185:50010 remote=/10.39.6.130:59083]
这样说来,问题出在10.39.6.130上,也即reduce任务运行的节点上,该节点DN日志如下:
- 2014-12-04 16:00:59,987 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception for BP-1386326728-10.39.2.131-1382089338395:blk_1394149732_320469086
- java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.39.6.130:50010 remote=/10.39.6.130:45259]
- at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
- at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
- at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
- at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
- at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
- at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
- at java.io.DataInputStream.read(DataInputStream.java:132)
- at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:192)
- at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
- at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
- at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
- at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:446)
- at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:702)
- at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:739)
- at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:124)
- at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
- at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:232)
- at java.lang.Thread.run(Thread.java:662)
但是根据日志信息,10.39.6.130的DN也是在等待Packet,但是一直等到超时也没等到:
java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.39.6.130:50010 remote=/10.39.6.130:45259]
看来不是10.39.6.130节点DN的问题,如果pipeline上面三个DN都没有问题,那问题肯定就出在dfsclient端了,也就是reduce任务在往hdfs写数据的时候根本就没有写得出去,在dfsclient上面就给堵住了,接下来查看dfsclient,也就是reduce任务进程的执行情况:
在10.39.6.130上,根据任务id:attempt_1415948652989_195149_r_000158_3 找到进程id 31050,查看内存使用情况:
- jstat -gcutil 31050 1000:
- S0 S1 E O P YGC YGCT FGC FGCT GCT
- 68.95 0.00 0.00 92.98 66.32 111 16.825 10 25.419 42.244
- 68.95 0.00 0.00 92.98 66.32 111 16.825 10 25.419 42.244
- 68.95 0.00 0.00 92.98 66.32 111 16.825 10 25.419 42.244
- 68.95 0.00 0.00 92.98 66.32 111 16.825 10 25.419 42.244
- 68.95 0.00 0.00 92.98 66.32 111 16.825 10 25.419 42.244
- 68.95 0.00 0.00 92.98 66.32 111 16.825 10 25.419 42.244
- 68.95 0.00 26.75 100.00 66.32 111 16.825 10 25.419 42.244
- 0.00 0.00 31.85 100.00 68.16 111 16.825 10 44.767 61.591
- 0.00 0.00 35.37 100.00 68.16 111 16.825 10 44.767 61.591
- 0.00 0.00 40.64 100.00 68.16 111 16.825 10 44.767 61.591
- 0.00 0.00 45.35 100.00 68.16 111 16.825 10 44.767 61.591
- 0.00 0.00 48.87 100.00 68.16 111 16.825 10 44.767 61.591
- 0.00 0.00 54.14 100.00 68.16 111 16.825 10 44.767 61.591
- 0.00 0.00 58.85 100.00 68.16 111 16.825 10 44.767 61.591
果然,JVM老年代被占满,不断执行FGC,直接stop the world,导致JVM没法对外提供服务,导致dfsclient挂起,一直不能往pipeline上面的节点写Packet,直到socket超时。
根据reduce任务最后的日志,也从侧面验证了这种观点:
Error running child : java.lang.OutOfMemoryError: Java heap space
既然是OOM导致的job失败,那是什么对象导致的内存泄露呢:
执行:
- jmap -histo:live 31050 > jmap.log
- cat jmap.log :
- num #instances #bytes class name
- ----------------------------------------------
- 1: 71824177 2872967080 java.util.TreeMap$Entry
- 2: 71822939 1723750536 java.lang.Long
- 3: 10684 24777776 [B
- 4: 47174 6425152 <methodKlass>
- 5: 47174 6408120 <constMethodKlass>
- 6: 3712 4429776 <constantPoolKlass>
- 7: 66100 3979224 <symbolKlass>
- 8: 3712 2938192 <instanceKlassKlass>
- 9: 3125 2562728 <constantPoolCacheKlass>
- 10: 3477 1267752 [I
- 11: 12923 1180224 [C
- 12: 1794 772488 <methodDataKlass>
- 13: 13379 428128 java.lang.String
- 14: 4034 419536 java.lang.Class
- 15: 6234 410312 [S
- 16: 6409 352576 [[I
- 17: 7567 242144 java.util.HashMap$Entry
- 18: 293 171112 <objArrayKlassKlass>
- 19: 4655 148960 java.util.Hashtable$Entry
- 20: 1535 135080 java.lang.reflect.Method
- 21: 842 121696 [Ljava.util.HashMap$Entry;
果然啊,reduce代码中使用了TreeMap,往里面放置了大量对象,导致出现OOM,TreeMap的Entry就站用了2.8G内存,而我们reduce设置的内存也就1.5G。
总结:对该job出现的异常,一般在以下几种情况下发生:
1、写数据块的DN出现问题,不能写入,就像之前出现的DN由于本地读问题导致xceivers(每个DN用于并发数据传输处理最大线程数)达到4096,耗尽了所有的线程,没法对新发起的输出写入做出相应。
2、网络出现异常,DN节点进或出的带宽被耗尽,导致数据写不出去或者写不进来,这种情况可以观察ganglia看节点带宽使用情况,这种情况一般比较少。当该job出现问题的时候,也怀疑过是带宽被耗尽的问题,查看了一下相关节点ganglia带宽使用情况,最大in/out 85M/s左右,最后排除是带宽问题。
3、dfsclient出现问题,长时间没反应,导致已经发起的socket超时。由于dfsclient情况比较复杂,出现问题的情况比较多,比如本问就是因为reduce出现内存溢出,jvm不断进行FGC,导致dfsclient挂起,最终socket出现超时。