[Problems] A summary of issues encountered with Hadoop

A summary of issues encountered with Hadoop - 蜗牛123 - 博客园 (cnblogs)
http://www.cnblogs.com/itgg168/archive/2012/11/24/2786088.html

Problem 1

Running hadoop fs -ls fails with the following error:

hadoop fs -ls

11/08/31 22:51:39 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 0 time(s).
Bad connection to FS. command aborted.
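Before reformatting anything, it is worth checking whether a namenode is listening at all on the port the client is trying (8020 here). A quick check, assuming a Linux box with net-tools installed:

netstat -nlp | grep 8020

If nothing is listed, the namenode process is either down or bound to a different address/port.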

Solution:

  1. Format the namenode:

hadoop namenode -format

  2. Restart Hadoop:

sh stop-all.sh

sh start-all.sh

  3. Check the background processes:

jps

13508 NameNode
11008 SecondaryNameNode
14393 Jps
11096 JobTracker
The NameNode is now running. (Note that no DataNode shows up in the jps output; that comes back as Problem 2 below.)

  4. Run the listing again:

hadoop fs -ls

12/01/31 14:04:39 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000
12/01/31 14:04:39 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
Found 1 items
drwxr-xr-x - root supergroup 0 2012-01-31 13:57 /user/root/test
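If the connection error keeps coming back after a restart, it is also worth checking that fs.default.name in conf/core-site.xml points at the real namenode address; the localhost/127.0.0.1:8020 seen in the log above is what clients fall back to when the property is missing. A minimal sketch, assuming the namenode runs at m106:9000 (the address that shows up in the logs of Problem 2):

<?xml version="1.0"?>
<configuration>
  <!-- address that HDFS clients and datanodes connect to;
       m106:9000 is taken from the logs later in this post -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://m106:9000</value>
  </property>
</configuration>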

Problem 2

Running hadoop fs -put ../conf input fails with the following error:
12/01/31 16:01:25 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000
12/01/31 16:01:25 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
12/01/31 16:01:26 WARN hdfs.DFSClient: DataStreamer Exception: java.io.IOException: File /user/root/input/ssl-server.xml.example could only be replicated to 0 nodes, instead of 1
put: File /user/root/input/ssl-server.xml.example could only be replicated to 0 nodes, instead of 1
12/01/31 16:01:26 ERROR hdfs.DFSClient: Exception closing file /user/root/input/ssl-server.xml.example : java.io.IOException: File /user/root/input/ssl-server.xml.example could only be replicated to 0 nodes, instead of 1
Solution:

The cause is that no datanode has joined the cluster; the daemons have to come up in order: first the namenode, then the datanode, then the jobtracker and tasktracker. The workaround used here is to start the daemons one by one with hadoop-daemon.sh start namenode and hadoop-daemon.sh start datanode, as in the steps below.
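Before walking through the steps, a quick way to confirm whether any datanode has actually registered with the namenode (dfsadmin is a standard HDFS admin command in this Hadoop generation):

hadoop dfsadmin -report

If the report shows 0 available datanodes, any put will keep failing with the "could only be replicated to 0 nodes" error above.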

  1. Restart the namenode:

hadoop-daemon.sh stop namenode

stopping namenode

hadoop-daemon.sh start namenode

starting namenode, logging to /usr/hadoop-0.21.0/bin/../logs/hadoop-root-namenode-www.keli.com.out
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

  2. Restart the datanode:

hadoop-daemon.sh stop datanode

stopping datanode

hadoop-daemon.sh start datanode

starting datanode, logging to /usr/hadoop-0.21.0/bin/../logs/hadoop-root-datanode-www.keli.com.out
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

  3. Change to Hadoop's bin directory:

cd /usr/hadoop-0.21.0/bin/

  4. Browse the HDFS directory:
    [root@www bin]# hadoop fs -ls
    12/01/31 16:09:45 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000
    12/01/31 16:09:45 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
    Found 4 items
    drwxr-xr-x - root supergroup 0 2012-01-31 16:01 /user/root/input
    drwxr-xr-x - root supergroup 0 2012-01-31 15:24 /user/root/test
    -rw-r--r-- 1 root supergroup 0 2012-01-31 14:37 /user/root/test-in
    drwxr-xr-x - root supergroup 0 2012-01-31 14:32 /user/root/test1

  5. Delete the input directory in HDFS:
    [root@www bin]# hadoop fs -rmr input
    12/01/31 16:10:09 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000
    12/01/31 16:10:09 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
    Deleted hdfs://m106:9000/user/root/input

  6. Upload the data to the input directory in HDFS:
    [root@www bin]# hadoop fs -put ../conf input
    12/01/31 16:10:14 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000
    12/01/31 16:10:14 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id

  7. Browse the input directory to verify the uploaded data:
    [root@www bin]# hadoop fs -ls input
    12/01/31 16:10:21 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000
    12/01/31 16:10:21 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
    Found 16 items
    -rw-r--r-- 1 root supergroup 3426 2012-01-31 16:10 /user/root/input/capacity-scheduler.xml
    -rw-r--r-- 1 root supergroup 1335 2012-01-31 16:10 /user/root/input/configuration.xsl
    -rw-r--r-- 1 root supergroup 757 2012-01-31 16:10 /user/root/input/core-site.xml
    -rw-r--r-- 1 root supergroup 321 2012-01-31 16:10 /user/root/input/fair-scheduler.xml
    -rw-r--r-- 1 root supergroup 2237 2012-01-31 16:10 /user/root/input/hadoop-env.sh
    -rw-r--r-- 1 root supergroup 1650 2012-01-31 16:10 /user/root/input/hadoop-metrics.properties
    -rw-r--r-- 1 root supergroup 4644 2012-01-31 16:10 /user/root/input/hadoop-policy.xml
    -rw-r--r-- 1 root supergroup 252 2012-01-31 16:10 /user/root/input/hdfs-site.xml
    -rw-r--r-- 1 root supergroup 4141 2012-01-31 16:10 /user/root/input/log4j.properties
    -rw-r--r-- 1 root supergroup 2997 2012-01-31 16:10 /user/root/input/mapred-queues.xml
    -rw-r--r-- 1 root supergroup 430 2012-01-31 16:10 /user/root/input/mapred-site.xml
    -rw-r--r-- 1 root supergroup 25 2012-01-31 16:10 /user/root/input/masters
    -rw-r--r-- 1 root supergroup 26 2012-01-31 16:10 /user/root/input/slaves
    -rw-r--r-- 1 root supergroup 1243 2012-01-31 16:10 /user/root/input/ssl-client.xml.example
    -rw-r--r-- 1 root supergroup 1195 2012-01-31 16:10 /user/root/input/ssl-server.xml.example
    -rw-r--r-- 1 root supergroup 250 2012-01-31 16:10 /user/root/input/taskcontroller.cfg
    [root@www bin]#
Problem 3

Starting the datanode fails with Unrecognized option: -jvm and Could not create the Java virtual machine:

[root@www bin]# hadoop-daemon.sh start datanode
starting datanode, logging to /usr/hadoop-0.20.203.0/bin/../logs/hadoop-root-datanode-www.keli.com.out
Unrecognized option: -jvm
Could not create the Java virtual machine.

Solution:
The bin/hadoop script in the Hadoop installation directory contains the following shell snippet:
CLASS='org.apache.hadoop.hdfs.server.datanode.DataNode'
# when running as root (EUID 0) the script appends -jvm server;
# otherwise it appends the ordinary -server JVM flag
if [[ $EUID -eq 0 ]]; then
  HADOOP_OPTS="$HADOOP_OPTS -jvm server $HADOOP_DATANODE_OPTS"
else
  HADOOP_OPTS="$HADOOP_OPTS -server $HADOOP_DATANODE_OPTS"
fi

The key part is this:

if [[ $EUID -eq 0 ]]; then
HADOOP_OPTS="$HADOOP_OPTS -jvm server $HADOOP_DATANODE_OPTS"

What does $EUID being 0 mean? The effective user ID (EUID) is the identity used to assign ownership to newly created processes, to check file access permissions, and to check permission to send signals to a process via the kill system call. Running echo $EUID as root prints 0. So when running as root the -jvm option gets appended, and that is exactly where Unrecognized option: -jvm comes from: the plain java launcher does not understand -jvm (in this Hadoop version the flag appears to be meant for the jsvc-based secure datanode launcher, not for java itself).

That leaves two ideas. One is to edit the shell code and strip out the -jvm option; the other is, since the branch only fires when $EUID is 0, simply not to run the datanode as root. Trying the second idea first: switch to an ordinary user and follow the documentation, and it works. Out of curiosity, the first idea works too; the script is harmless enough to touch, so drop the if/else structure above and replace it with just

HADOOP_OPTS="$HADOOP_OPTS -server $HADOOP_DATANODE_OPTS"

and the datanode starts successfully as well.
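A minimal sketch of the non-root approach, assuming an unprivileged user named hadoop exists and owns the Hadoop installation and data directories (the user name is an assumption here):

# start the datanode as the unprivileged "hadoop" user instead of root
su - hadoop -c "/usr/hadoop-0.20.203.0/bin/hadoop-daemon.sh start datanode"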
Problem 4

[root@www bin]# jps
3283 NameNode
2791 SecondaryNameNode
2856 JobTracker
3348 Jps
Hadoop did not start a DataNode: no DataNode process appears in the jps output above.

Solution:
After the namenode is reformatted it gets a new namespace ID, while the datanode still holds the ID recorded before the format; because that stale ID was never removed, the namenode rejects the datanode's connection and registration. The fix used here is to delete the old HDFS data directory on the datanode:
[root@freepp ~]# rm -rf /hadoopdata/
Then restart Hadoop:
[root@www bin]# jps
4132 Jps
3907 NameNode
4056 DataNode
2791 SecondaryNameNode
2856 JobTracker
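Note that rm -rf /hadoopdata/ wipes the block data as well. A less destructive alternative, sketched here assuming dfs.data.dir is /hadoopdata/dfs/data and dfs.name.dir is /hadoopname/dfs/name (both paths are assumptions; check hdfs-site.xml for the real ones), is to copy the freshly formatted namenode's namespaceID into the datanode's VERSION file:

# the datanode records its cluster identity in current/VERSION
cat /hadoopdata/dfs/data/current/VERSION
# edit the namespaceID= line to match the value in the namenode's
# /hadoopname/dfs/name/current/VERSION, then restart the datanode
hadoop-daemon.sh start datanode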
