Hadoop集群环境搭建中一个错误的解决方案

本文转自我的ChinaUnix博客: http://blog.chinaunix.net/u3/107162/showart_2204785.html

2010-03-29 

环境已经搭好并可以启动了,如下:
  1. maohong@maohong-desktop:~/Software/Development/Hadoop/hadoop-0.20.2$ bin/start-all.sh
  2. starting namenode, logging to /home/maohong/Software/Development/Hadoop/hadoop-0.20.2/bin/../logs/hadoop-maohong-namenode-maohong-desktop.out
  3. slave1: starting datanode, logging to /home/maohong/Software/Development/Hadoop/hadoop-0.20.2/bin/../logs/hadoop-maohong-datanode-debian.out
  4. slave2: starting datanode, logging to /home/maohong/Software/Development/Hadoop/hadoop-0.20.2/bin/../logs/hadoop-maohong-datanode-node2.out
  5. master: starting datanode, logging to /home/maohong/Software/Development/Hadoop/hadoop-0.20.2/bin/../logs/hadoop-maohong-datanode-maohong-desktop.out
  6. master: starting secondarynamenode, logging to /home/maohong/Software/Development/Hadoop/hadoop-0.20.2/bin/../logs/hadoop-maohong-secondarynamenode-maohong-desktop.out
  7. starting jobtracker, logging to /home/maohong/Software/Development/Hadoop/hadoop-0.20.2/bin/../logs/hadoop-maohong-jobtracker-maohong-desktop.out
  8. slave1: starting tasktracker, logging to /home/maohong/Software/Development/Hadoop/hadoop-0.20.2/bin/../logs/hadoop-maohong-tasktracker-debian.out
  9. slave2: starting tasktracker, logging to /home/maohong/Software/Development/Hadoop/hadoop-0.20.2/bin/../logs/hadoop-maohong-tasktracker-node2.out
  10. master: starting tasktracker, logging to /home/maohong/Software/Development/Hadoop/hadoop-0.20.2/bin/../logs/hadoop-maohong-tasktracker-maohong-desktop.out
  11. maohong@maohong-desktop:~/Software/Development/Hadoop/hadoop-0.20.2$ jps
  12. 22565 SecondaryNameNode
  13. 22646 JobTracker
  14. 22342 DataNode
  15. 22907 Jps
  16. 22115 NameNode
  17. 22861 TaskTracker


但是执行wordcount程序到时候出现Error: java.lang.NullPointerException 错误如下:

  1. maohong@maohong-desktop:~/Software/Development/Hadoop/hadoop-0.20.2$ bin/hadoop jar hadoop-0.20.2-examples.jar wordcount test-in test-out
  2. 10/03/25 19:40:05 INFO input.FileInputFormat: Total input paths to process : 4
  3. 10/03/25 19:40:05 INFO mapred.JobClient: Running job: job_201003251936_0001
  4. 10/03/25 19:40:06 INFO mapred.JobClient:  map 0% reduce 0%
  5. 10/03/25 19:40:13 INFO mapred.JobClient:  map 50% reduce 0%
  6. 10/03/25 19:40:14 INFO mapred.JobClient:  map 100% reduce 0%
  7. 10/03/25 19:40:21 INFO mapred.JobClient: Task Id : attempt_201003251936_0001_r_000000_0, Status : FAILED
  8. Error: java.lang.NullPointerException
  9.         at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
  10.         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2683)
  11.         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2605)

  12. 10/03/25 19:40:21 WARN mapred.JobClient: Error reading task outputhttp://localhost:50060/tasklog?plaintext=true&taskid=attempt_201003251936_0001_r_000000_0&filter=stdout
  13. 10/03/25 19:40:21 WARN mapred.JobClient: Error reading task outputhttp://localhost:50060/tasklog?plaintext=true&taskid=attempt_201003251936_0001_r_000000_0&filter=stderr
  14. 10/03/25 19:40:27 INFO mapred.JobClient: Task Id : attempt_201003251936_0001_r_000000_1, Status : FAILED
  15. Error: java.lang.NullPointerException
  16.         at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
  17.         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2683)
  18.         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2605)

  19. 10/03/25 19:40:27 WARN mapred.JobClient: Error reading task outputnode2.1036dhcp
  20. 10/03/25 19:40:27 WARN mapred.JobClient: Error reading task outputnode2.1036dhcp
  21. 10/03/25 19:40:36 INFO mapred.JobClient: Task Id : attempt_201003251936_0001_r_000000_2, Status : FAILED
  22. Error: java.lang.NullPointerException
  23.         at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
  24.         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2683)
  25.         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2605)

  26. 10/03/25 19:40:45 INFO mapred.JobClient: Job complete: job_201003251936_0001
  27. 10/03/25 19:40:45 INFO mapred.JobClient: Counters: 12
  28. 10/03/25 19:40:45 INFO mapred.JobClient:   Job Counters
  29. 10/03/25 19:40:45 INFO mapred.JobClient:     Launched reduce tasks=4
  30. 10/03/25 19:40:45 INFO mapred.JobClient:     Launched map tasks=4
  31. 10/03/25 19:40:45 INFO mapred.JobClient:     Data-local map tasks=4
  32. 10/03/25 19:40:45 INFO mapred.JobClient:     Failed reduce tasks=1
  33. 10/03/25 19:40:45 INFO mapred.JobClient:   FileSystemCounters
  34. 10/03/25 19:40:45 INFO mapred.JobClient:     HDFS_BYTES_READ=8637
  35. 10/03/25 19:40:45 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=11495
  36. 10/03/25 19:40:45 INFO mapred.JobClient:   Map-Reduce Framework
  37. 10/03/25 19:40:45 INFO mapred.JobClient:     Combine output records=900
  38. 10/03/25 19:40:45 INFO mapred.JobClient:     Map input records=83
  39. 10/03/25 19:40:45 INFO mapred.JobClient:     Spilled Records=900
  40. 10/03/25 19:40:45 INFO mapred.JobClient:     Map output bytes=14697
  41. 10/03/25 19:40:45 INFO mapred.JobClient:     Combine input records=1525
  42. 10/03/25 19:40:45 INFO mapred.JobClient:     Map output records=1525
  43. maohong@maohong-desktop:~/Software/Development/Hadoop/hadoop-0.20.2$


jobtracker的log文件如下:
  1. 2010-03-25 19:40:09,447 INFO org.apache.hadoop.mapred.JobInProgress: Choosing data-local task task_201003251936_0001_m_000003
  2. 2010-03-25 19:40:12,268 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_201003251936_0001_m_000000_0' has completed task_201003251936_0001_m_000000 successfully.
  3. 2010-03-25 19:40:12,268 INFO org.apache.hadoop.mapred.ResourceEstimator: completedMapsUpdates:1  completedMapsInputSize:4275  completedMapsOutputSize:5190
  4. 2010-03-25 19:40:12,271 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_201003251936_0001_m_000001_0' has completed task_201003251936_0001_m_000001 successfully.
  5. 2010-03-25 19:40:12,271 INFO org.apache.hadoop.mapred.ResourceEstimator: completedMapsUpdates:2  completedMapsInputSize:5745  completedMapsOutputSize:7302
  6. 2010-03-25 19:40:12,288 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_201003251936_0001_r_000000_0' to tip task_201003251936_0001_r_000000, for tracker 'tracker_localhost:localhost/127.0.0.1:38831'
  7. 2010-03-25 19:40:12,522 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_201003251936_0001_m_000002_0' has completed task_201003251936_0001_m_000002 successfully.
  8. 2010-03-25 19:40:12,522 INFO org.apache.hadoop.mapred.ResourceEstimator: completedMapsUpdates:3  completedMapsInputSize:7215  completedMapsOutputSize:9414
  9. 2010-03-25 19:40:12,524 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_201003251936_0001_m_000003_0' has completed task_201003251936_0001_m_000003 successfully.
  10. 2010-03-25 19:40:12,524 INFO org.apache.hadoop.mapred.ResourceEstimator: completedMapsUpdates:4  completedMapsInputSize:8641  completedMapsOutputSize:11367
  11. 2010-03-25 19:40:18,300 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201003251936_0001_r_000000_0: Error: java.lang.NullPointerException
  12.         at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
  13.         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2683)
  14.         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2605)

  15. 2010-03-25 19:40:18,301 INFO org.apache.hadoop.mapred.JobTracker: Adding task (cleanup)'attempt_201003251936_0001_r_000000_0' to tip task_201003251936_0001_r_000000, for tracker 'tracker_localhost:localhost/127.0.0.1:38831'
  16. 2010-03-25 19:40:21,307 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_201003251936_0001_r_000000_0' from 'tracker_localhost:localhost/127.0.0.1:38831'
  17. 2010-03-25 19:40:21,559 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_201003251936_0001_r_000000_1' to tip task_201003251936_0001_r_000000, for tracker 'tracker_node2.1036dhcp:localhost/127.0.0.1:59187'
  18. 2010-03-25 19:40:24,599 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201003251936_0001_r_000000_1: Error: java.lang.NullPointerException
  19.         at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
  20.         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2683)
  21.         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2605)

  22. 2010-03-25 19:40:24,600 INFO org.apache.hadoop.mapred.JobTracker: Adding task (cleanup)'attempt_201003251936_0001_r_000000_1' to tip task_201003251936_0001_r_000000, for tracker 'tracker_node2.1036dhcp:localhost/127.0.0.1:59187'
  23. 2010-03-25 19:40:27,607 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_201003251936_0001_r_000000_1' from 'tracker_node2.1036dhcp:localhost/127.0.0.1:59187'
  24. 2010-03-25 19:40:30,201 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_201003251936_0001_r_000000_2' to tip task_201003251936_0001_r_000000, for tracker 'tracker_maohong-desktop:localhost/127.0.0.1:60931'
  25. 2010-03-25 19:40:33,260 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201003251936_0001_r_000000_2: Error: java.lang.NullPointerException
  26.         at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
  27.         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2683)
  28.         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2605)

  29. 2010-03-25 19:40:33,261 INFO org.apache.hadoop.mapred.JobTracker: Adding task (cleanup)'attempt_201003251936_0001_r_000000_2' to tip task_201003251936_0001_r_000000, for tracker 'tracker_maohong-desktop:localhost/127.0.0.1:60931'
  30. 2010-03-25 19:40:36,266 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_201003251936_0001_r_000000_3' to tip task_201003251936_0001_r_000000, for tracker 'tracker_maohong-desktop:localhost/127.0.0.1:60931'
  31. 2010-03-25 19:40:36,266 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_201003251936_0001_r_000000_2' from 'tracker_maohong-desktop:localhost/127.0.0.1:60931'
  32. 2010-03-25 19:40:39,270 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201003251936_0001_r_000000_3: Error: java.lang.NullPointerException
  33.         at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
  34.         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2683)
  35.         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2605)

  36. 2010-03-25 19:40:39,271 INFO org.apache.hadoop.mapred.JobTracker: Adding task (cleanup)'attempt_201003251936_0001_r_000000_3' to tip task_201003251936_0001_r_000000, for tracker 'tracker_maohong-desktop:localhost/127.0.0.1:60931'
  37. 2010-03-25 19:40:42,278 INFO org.apache.hadoop.mapred.TaskInProgress: TaskInProgress task_201003251936_0001_r_000000 has failed 4 times.
  38. 2010-03-25 19:40:42,278 INFO org.apache.hadoop.mapred.JobInProgress: Aborting job job_201003251936_0001
  39. 2010-03-25 19:40:42,279 INFO org.apache.hadoop.mapred.JobInProgress: Killing job 'job_201003251936_0001'
  40. 2010-03-25 19:40:42,279 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_201003251936_0001_m_000004_0' to tip task_201003251936_0001_m_000004, for tracker 'tracker_maohong-desktop:localhost/127.0.0.1:60931'
  41. 2010-03-25 19:40:42,279 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_201003251936_0001_r_000000_3' from 'tracker_maohong-desktop:localhost/127.0.0.1:60931'
  42. 2010-03-25 19:40:45,288 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_201003251936_0001_m_000004_0' has completed task_201003251936_0001_m_000004 successfully.
  43. 2010-03-25 19:40:45,333 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_201003251936_0001_m_000004_0' from 'tracker_maohong-desktop:localhost/127.0.0.1:60931'
  44. 2010-03-25 19:40:45,333 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_201003251936_0001_r_000000_2' from 'tracker_maohong-desktop:localhost/127.0.0.1:60931'
  45. 2010-03-25 19:40:45,334 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_201003251936_0001_r_000000_3' from 'tracker_maohong-desktop:localhost/127.0.0.1:60931'
  46. 2010-03-25 19:40:45,334 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_201003251936_0001_m_000000_0' from 'tracker_localhost:localhost/127.0.0.1:38831'
  47. 2010-03-25 19:40:45,334 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_201003251936_0001_m_000001_0' from 'tracker_localhost:localhost/127.0.0.1:38831'
  48. 2010-03-25 19:40:45,334 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_201003251936_0001_m_000005_0' from 'tracker_localhost:localhost/127.0.0.1:38831'
  49. 2010-03-25 19:40:45,335 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_201003251936_0001_r_000000_0' from 'tracker_localhost:localhost/127.0.0.1:38831'
  50. 2010-03-25 19:40:45,694 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_201003251936_0001_m_000002_0' from 'tracker_node2.1036dhcp:localhost/127.0.0.1:59187'
  51. 2010-03-25 19:40:45,694 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_201003251936_0001_m_000003_0' from 'tracker_node2.1036dhcp:localhost/127.0.0.1:59187'
  52. 2010-03-25 19:40:45,694 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_201003251936_0001_r_000000_1' from 'tracker_node2.1036dhcp:localhost/127.0.0.1:59187'


其中一个tasktracker的log如下,另外两个tasktracker的log也有与此同样的错误:
  1. 2010-03-25 19:40:30,249 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_201003251936_0001_r_000000_2
  2. 2010-03-25 19:40:30,249 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 2 and trying to launch attempt_201003251936_0001_r_000000_2
  3. 2010-03-25 19:40:30,587 INFO org.apache.hadoop.mapred.JvmManager: In JvmRunner constructed JVM ID: jvm_201003251936_0001_r_1711860611
  4. 2010-03-25 19:40:30,588 INFO org.apache.hadoop.mapred.JvmManager: JVM Runner jvm_201003251936_0001_r_1711860611 spawned.
  5. 2010-03-25 19:40:31,057 INFO org.apache.hadoop.mapred.TaskTracker: JVM with ID: jvm_201003251936_0001_r_1711860611 given task: attempt_201003251936_0001_r_000000_2
  6. 2010-03-25 19:40:31,437 FATAL org.apache.hadoop.mapred.TaskTracker: Task: attempt_201003251936_0001_r_000000_2 - Killed : java.lang.NullPointerException
  7.         at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
  8.         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2683)
  9.         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2605)

问题的原因在于master和slave节点的/etc/hosts文件~~~~~~
/etc/hosts文件中的主机名一定要是机器名,而不是master、slave1、slave2,否则不能正确解析。这就是结症所在!
 
改了之后就解决了,呵呵
 

你可能感兴趣的:(java,apache,jvm,hadoop,Debian)