Contents
1. Preface
2. Feature overview
3. Deployment
3.1. Machine list
3.2. Hostnames
3.2.1. Changing the hostname temporarily
3.2.2. Changing the hostname permanently
3.3. Scope of passwordless login
3.4. Raising the maximum number of open files
3.5. OOM-related: vm.overcommit_memory
4. Conventions
4.1. Installation directories
4.2. Service ports
4.3. RPC and HTTP ports by module
5. Work checklist
6. Installing the JDK
6.1. Downloading the package
6.2. Installation steps
7. Passwordless ssh2 login
8. Installing and configuring Hadoop
8.1. Downloading the package
8.2. Installation and environment variables
8.3. Editing hadoop-env.sh
8.4. Editing /etc/hosts
8.5. Editing slaves
8.6. Preparing the configuration files
8.7. Editing hdfs-site.xml
8.8. Editing core-site.xml
8.8.1. dfs.namenode.rpc-address
8.9. Editing mapred-site.xml
8.10. Editing yarn-site.xml
9. Startup order
10. Starting HDFS
10.1. Starting ZooKeeper
10.2. Creating the failover namespace
10.3. Starting all JournalNodes
10.4. Formatting the NameNode
10.5. Initializing the JournalNodes
10.6. Starting the active NameNode
10.7. Starting the standby NameNode
10.8. Starting the failover controller processes
10.9. Starting all DataNodes
10.10. Checking that startup succeeded
10.10.1. DataNode
10.10.2. NameNode
10.11. Running HDFS commands
10.11.1. Checking that the DataNodes started
10.11.2. Checking the active/standby state of the NameNodes
10.11.3. hdfs dfs -ls
10.11.4. hdfs dfs -put
10.11.5. hdfs dfs -rm
10.11.6. Manually switching the active and standby NameNodes
10.11.7. HDFS allows only one active and one standby NameNode
10.11.8. Storage balancing with start-balancer.sh
10.11.9. Finding which nodes hold a file
10.11.10. Leaving safe mode
10.11.11. Deleting missing blocks
11. Scaling out and decommissioning
11.1. Adding JournalNodes
11.2. How does a new NameNode join?
11.3. Adding DataNodes
11.4. Decommissioning DataNodes
11.5. Forcing a DataNode to report its blocks
12. Starting YARN
12.1. Starting YARN
12.2. Running YARN commands
12.2.1. yarn node -list
12.2.2. yarn node -status
12.2.3. yarn rmadmin -getServiceState rm1
12.2.4. yarn rmadmin -transitionToStandby rm1
13. Running a MapReduce program
14. HDFS permission configuration
14.1. hdfs-site.xml
14.2. core-site.xml
15. C++ client programming
15.1. Sample code
15.2. Running the sample
16. fsImage
17. Common errors
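Current Hadoop releases have eliminated the single points of failure in HDFS, YARN, and HBase, and support automatic active/standby failover.
This article aims to give detailed installation instructions for Hadoop 2.8.0, the latest release at the time of writing, to reduce the difficulties met during installation and to explain the causes of some common errors. HDFS is configured for HA based on QJM (Quorum Journal Manager). The installation covers only hadoop-common, hadoop-hdfs, hadoop-mapreduce, and hadoop-yarn; it does not include HBase, Hive, Pig, and so on.
The NameNode records which blocks make up each file, but it does not persist which DataNodes hold those blocks; the DataNodes report the blocks they hold. If the NameNode web UI shows "missing" blocks, no DataNode has reported those blocks, so they are treated as lost.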
Version | Release date | New features |
3.0.0 | | Support for multiple NameNodes |
2.8.0 | 2017/3/22 | |
2.7.1 | 2015/7/6 | |
2.7.0 | 2015/4/21 | 1) JDK 6 no longer supported, JDK 7+ required 2) File truncate 3) Quotas per storage type 4) Variable-length blocks (previously all blocks of a file had a fixed size) 5) Windows Azure Storage support 6) Pluggable YARN authentication 7) Automatic sharing and global caching of YARN localized resources (in testing) 8) Limit on the number of Map/Reduce tasks a job may run 9) Faster FileOutputCommitter for large jobs with many output files |
2.6.4 | 2016/2/11 | |
2.6.3 | 2015/12/17 | |
2.6.2 | 2015/10/28 | |
2.6.1 | 2015/9/23 | |
2.6.0 | 2014/11/18 | 1) YARN supports long-running services 2) YARN supports rolling upgrades with rollback 3) YARN supports running applications in Docker containers |
2.5.2 | 2014/11/19 | |
2.5.1 | 2014/9/12 | |
2.5.0 | 2014/8/11 | |
2.4.1 | 2014/6/30 | |
2.4.0 | 2014/4/7 | 1) HDFS upgrade rollback 2) Full https support in HDFS 3) Automatic failover for the YARN ResourceManager |
2.2.0 | 2013/10/15 | 1) HDFS Federation 2) HDFS Snapshots |
2.1.0-beta | 2013/8/25 | 1) HDFS snapshots 2) Windows support |
2.0.3-alpha | 2013/2/14 | 1) QJM-based NameNode HA |
2.0.0-alpha | 2012/5/23 | 1) NameNode HA with manual failover 2) HDFS Federation |
1.0.0 | 2011/12/27 | |
0.23.11 | 2014/6/27 | |
0.23.10 | 2013/12/11 | |
0.22.0 | 2011/12/10 | |
0.23.0 | 2011/11/17 | |
0.20.205.0 | 2011/10/17 | |
0.20.204.0 | 2011/9/5 | |
0.20.203.0 | 2011/5/11 | |
0.21.0 | 2010/8/23 | |
0.20.2 | 2010/2/26 | |
0.20.1 | 2009/9/14 | |
0.19.2 | 2009/7/23 | |
0.20.0 | 2009/4/22 | |
0.19.1 | 2009/2/24 | |
0.18.3 | 2009/1/29 | |
0.19.0 | 2008/11/21 | |
0.18.2 | 2008/11/3 | |
0.18.1 | 2008/9/17 | |
0.18.0 | 2008/8/22 | |
0.17.2 | 2008/8/19 | |
0.17.1 | 2008/6/23 | |
0.17.0 | 2008/5/20 | |
0.16.4 | 2008/5/5 | |
0.16.3 | 2008/4/16 | |
0.16.2 | 2008/4/2 | |
0.16.1 | 2008/3/13 | |
0.16.0 | 2008/2/7 | |
0.15.3 | 2008/1/18 | |
0.15.2 | 2008/1/2 | |
0.15.1 | 2007/11/27 | |
0.14.4 | 2007/11/26 | |
0.15.0 | 2007/10/29 | |
0.14.3 | 2007/10/19 | |
0.14.1 | 2007/9/4 | |
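For the complete list see http://hadoop.apache.org/releases.html.
The batch tools mooon_ssh, mooon_upload, and mooon_download are recommended for installation and deployment, since they speed up repetitive operations (https://github.com/eyjian/mooon/tree/master/mooon/tools). They are built with CMake and depend on two libraries, OpenSSL (https://www.openssl.org/) and libssh2 (http://www.libssh2.org); libssh2 itself also depends on OpenSSL.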
Five machines are used in total (ZooKeeper runs on all five), laid out as shown in the table below:
NameNode | JournalNode | DataNode | ZooKeeper |
10.148.137.143 10.148.137.204 | 10.148.137.143 10.148.137.204 10.148.138.11 | 10.148.138.11 10.148.140.14 10.148.140.15 | 10.148.137.143 10.148.137.204 10.148.138.11 10.148.140.14 10.148.140.15 |
Machine IP | Hostname |
10.148.137.143 | hadoop-137-143 |
10.148.137.204 | hadoop-137-204 |
10.148.138.11 | hadoop-138-11 |
10.148.140.14 | hadoop-140-14 |
10.148.140.15 | hadoop-140-15 |
Note that hostnames must not contain underscores. Otherwise the SecondaryNameNode fails at startup with the error below (taken from the file hadoop-hadoop-secondarynamenode-VM_39_166_sles10_64.out):
Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /data/hadoop/hadoop-2.8.0/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
Exception in thread "main" java.lang.IllegalArgumentException: The value of property bind.address must not be null
    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
    at org.apache.hadoop.conf.Configuration.set(Configuration.java:971)
    at org.apache.hadoop.conf.Configuration.set(Configuration.java:953)
    at org.apache.hadoop.http.HttpServer2.initializeWebServer(HttpServer2.java:391)
    at org.apache.hadoop.http.HttpServer2.<init>(HttpServer2.java:344)
    at org.apache.hadoop.http.HttpServer2.<init>(HttpServer2.java:104)
    at org.apache.hadoop.http.HttpServer2$Builder.build(HttpServer2.java:292)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize(SecondaryNameNode.java:264)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.<init>(SecondaryNameNode.java:192)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.main(SecondaryNameNode.java:651) |
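The hostname command not only displays the hostname but can also change it, in the form: hostname new-hostname.
Before the change, 172.25.40.171 had the hostname VM_40_171_sles10_64 and 172.25.39.166 had the hostname VM_39_166_sles10_64. Both contain underscores and therefore need to be changed. To keep things simple, each underscore is just replaced with a hyphen: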
hostname VM-40-171-sles10-64
hostname VM-39-166-sles10-64
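The renaming rule used here (replace every underscore with a hyphen) can be sketched as a small shell helper. The function name `sanitize_hostname` is an illustrative assumption, not part of any tool mentioned in this article; it only prints the corrected name and leaves applying it with `hostname` to the operator.

```shell
#!/bin/sh
# Hypothetical helper: print a hostname with every underscore replaced by a hyphen.
sanitize_hostname() {
    printf '%s\n' "$1" | tr '_' '-'
}

# Show the name that would then be passed to `hostname` (which must be run as root).
sanitize_hostname "VM_39_166_sles10_64"
```

The printed value can be reviewed first and then applied with `hostname "$(sanitize_hostname "$(hostname)")"`.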
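The change above is not enough on its own. As with environment variables, a permanent change also requires editing a system configuration file.
The file differs between Linux distributions. On SuSE 10.1 it is /etc/HOSTNAME:
# cat /etc/HOSTNAME
VM_39_166_sles10_64 |
Change "VM_39_166_sles10_64" in that file to "VM-39-166-sles10-64". On some distributions the file is /etc/hostname, on others /etc/sysconfig/network.
The syntax may differ as well as the file. Some files use name=value pairs; /etc/sysconfig/network, for example, uses the form HOSTNAME=hostname.
After editing, restart the network service so that the change takes effect by running /etc/rc.d/boot.localnet start (this is the SuSE method; the command differs between systems). Running hostname again then shows the new name.
Rebooting the system also makes the change take effect.
After changing the hostname, verify passwordless ssh login again with: ssh username@new-hostname.
The machine name can be inspected in several places:
1) The hostname command (it can also change the hostname, but the change is lost at the next reboot)
2) cat /proc/sys/kernel/hostname
3) cat /etc/hostname or cat /etc/sysconfig/network (permanent change, requires a restart)
4) sysctl kernel.hostname (it can also change the hostname, but only until the next reboot)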
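Passwordless login must work both by IP and by hostname, for the following cases:
1) Each NameNode can log in to all DataNodes without a password
2) Each NameNode can log in to itself without a password
3) The NameNodes can log in to each other without a password
4) Each DataNode can log in to itself without a password
5) DataNodes do not need passwordless login to the NameNodes or to other DataNodes.
Note: passwordless login is not mandatory if you do not use scripts that rely on ssh and scp, such as hadoop-daemons.sh.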
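Edit /etc/security/limits.conf and add the following two lines:
* soft nofile 102400
* hard nofile 102400
# End of file |
Here 102400 is the maximum number of files a single process may open. Choose a suitable value for processes that hold many open connections, for example many connections to a Redis server.
The change takes effect only after logging in again. For jobs started by crontab, restart the cron service, e.g. service crond restart (on some platforms service cron restart).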
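If the value of "/proc/sys/vm/overcommit_memory" is 0, the kernel uses heuristic overcommit and may refuse or kill large allocations, which this article refers to as OOM handling being on. Setting it to 1 makes the kernel always allow overcommit. It is set like any other kernel parameter such as net.core.somaxconn: with sysctl for the running system, and in /etc/sysctl.conf to make it persistent.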
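As a minimal sketch of checking this setting, the helper below maps a vm.overcommit_memory value to the meaning given in the Linux kernel documentation. The function name `describe_overcommit` is a hypothetical name chosen for illustration; the commands that change the value are shown only as comments because they need root.

```shell
#!/bin/sh
# Hypothetical helper: describe a vm.overcommit_memory value.
describe_overcommit() {
    case "$1" in
        0) echo "heuristic overcommit (kernel default)" ;;
        1) echo "always overcommit" ;;
        2) echo "never overcommit beyond the commit limit" ;;
        *) echo "unknown value: $1" ;;
    esac
}

# Report the current setting when the file is readable (Linux only).
if [ -r /proc/sys/vm/overcommit_memory ]; then
    describe_overcommit "$(cat /proc/sys/vm/overcommit_memory)"
fi

# To change it (as root):
#   sysctl -w vm.overcommit_memory=1                     # lasts until the next reboot
#   echo "vm.overcommit_memory = 1" >> /etc/sysctl.conf  # persistent; load with: sysctl -p
```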
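For ease of explanation, this article assumes the following installation directories for Hadoop and the JDK:
Component | Install directory | Version | Notes |
JDK | /data/jdk | 1.7.0 | ln -s /data/jdk1.7.0_55 /data/jdk |
Hadoop | /data/hadoop/hadoop | 2.8.0 | ln -s /data/hadoop/hadoop-2.8.0 /data/hadoop/hadoop |
In a real deployment, adjust these to your environment.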
Port | Purpose |
9000 | fs.defaultFS, e.g. hdfs://172.25.40.171:9000 |
9001 | dfs.namenode.rpc-address; DataNodes connect to this port |
50070 | dfs.namenode.http-address |
50470 | dfs.namenode.https-address |
50100 | dfs.namenode.backup.address |
50105 | dfs.namenode.backup.http-address |
50090 | dfs.namenode.secondary.http-address, e.g. 172.25.39.166:50090 |
50091 | dfs.namenode.secondary.https-address, e.g. 172.25.39.166:50091 |
50020 | dfs.datanode.ipc.address |
50075 | dfs.datanode.http.address |
50475 | dfs.datanode.https.address |
50010 | dfs.datanode.address; the DataNode data transfer port |
8480 | dfs.journalnode.http-address; the active and standby NameNodes fetch edit files from this port over HTTP |
8481 | dfs.journalnode.https-address |
8032 | yarn.resourcemanager.address |
8088 | yarn.resourcemanager.webapp.address; the YARN HTTP port |
8090 | yarn.resourcemanager.webapp.https.address |
8030 | yarn.resourcemanager.scheduler.address |
8031 | yarn.resourcemanager.resource-tracker.address |
8033 | yarn.resourcemanager.admin.address |
8042 | yarn.nodemanager.webapp.address |
8040 | yarn.nodemanager.localizer.address |
8188 | yarn.timeline-service.webapp.address |
10020 | mapreduce.jobhistory.address |
19888 | mapreduce.jobhistory.webapp.address |
2888 | ZooKeeper; on the Leader, listens for Follower connections |
3888 | ZooKeeper; used for Leader election |
2181 | ZooKeeper; listens for client connections |
16010 | hbase.master.info.port; the HMaster HTTP port |
16000 | hbase.master.port; the HMaster RPC port |
60030 | hbase.regionserver.info.port; the HRegionServer HTTP port |
60020 | hbase.regionserver.port; the HRegionServer RPC port |
8080 | hbase.rest.port; the HBase REST server port |
10000 | hive.server2.thrift.port |
9083 | hive.metastore.uris |
Module | RPC port | HTTP port | HTTPS port |
HDFS JournalNode | 8485 | 8480 | 8481 |
HDFS NameNode | 8020 | 50070 | |
HDFS DataNode | 50020 | 50075 | |
HDFS SecondaryNameNode | | 50090 | 50091 |
Yarn Resource Manager | 8032 | 8088 | 8090 |
Yarn Node Manager | 8040 | 8042 | |
Yarn SharedCache | | 8788 | |
HMaster | | 16010 | |
HRegionServer | | 16030 | |
HBase thrift | 9090 | 9095 | |
HBase rest | | 8085 | |
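Note: DataNodes transfer data through port 50010.
The work required to run Hadoop (HDFS, YARN, and MapReduce) is listed below:
Task | Description |
JDK installation | Hadoop is written in Java, so a JDK is required. |
Passwordless login | The NameNode controls the SecondaryNameNode and DataNodes through ssh and scp, which must run without a password prompt. |
Hadoop installation and configuration | This means HDFS, YARN, and MapReduce; it does not include installing HBase, Hive, and so on. |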
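This article installs JDK 1.7.0.
Download page for the latest JDK binary packages:
http://www.oracle.com/technetwork/java/javase/downloads
Download page for the JDK 1.7 binary packages:
http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html
This article uses the 64-bit Linux build of JDK 1.7: jdk-7u55-linux-x64.gz. Do not install JDK 1.8 for this setup: it was found not to match Hadoop 2.8.0, and compiling the Hadoop 2.8.0 source with it reports many errors.
Installing the JDK is simple: upload jdk-7u55-linux-x64.gz to the Linux machine, unpack it, and set the environment variables (in this article the archive is uploaded to /data):
1) Change to the /data directory
2) Unpack the archive: tar xzf jdk-7u55-linux-x64.gz, which creates the directory /data/jdk1.7.0_55
3) Create a symbolic link: ln -s /data/jdk1.7.0_55 /data/jdk
4) Edit /etc/profile, the profile in the user's home directory, or an equivalent file, and set the following environment variables:
export JAVA_HOME=/data/jdk
export CLASSPATH=$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$PATH |
Afterwards, log in again or source the profile file so that the variables take effect; you can also run the export commands by hand for immediate effect. To double-check, run java or javac and confirm that the command executes. If they were already executable before the JDK was installed, no JDK installation is needed.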
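The following applies to ssh2, not to ssh, and does not cover OpenSSH. The configuration has two parts: the login machine (the client) and the machine being logged in to (the server). The goal is passwordless login from client to server. The commands below can be pasted directly into a Linux terminal; all were verified on SuSE 10.1.
Step 1: on every server, edit the sshd configuration file /etc/ssh2/sshd2_config:
1) (Skip this item if Hadoop is not run as root.) Set PermitRootLogin to yes, i.e. remove the leading comment character #
2) Set AllowedAuthentications to publickey,password, i.e. remove the leading comment character #
3) Restart the sshd service: service ssh2 restart
Step 2: on every client, do the following:
1) Change to the .ssh2 directory: cd ~/.ssh2
2) ssh-keygen2 -t dsa -P''
-P specifies the passphrase and -P'' means an empty passphrase. The -P option can be omitted, but then Enter must be pressed three times instead of once.
On success, the private key file id_dsa_2048_a and the public key file id_dsa_2048_a.pub are generated in the .ssh2 directory under the user's home directory.
3) Create the identification file: echo "IdKey id_dsa_2048_a" >> identification. Note the space after IdKey, and make sure the file content is as follows:
# cat identification
IdKey id_dsa_2048_a |
4) Upload id_dsa_2048_a.pub to the ~/.ssh2 directory of every server: scp id_dsa_2048_a.pub root@192.168.0.1:/root/.ssh2, assuming 192.168.0.1 is the IP of one of the servers. Before running scp, make sure the directory /root/.ssh2 exists on 192.168.0.1. Replace /root/ with the actual HOME directory of the login user; the environment variable $HOME normally holds the home directory, ~ also denotes it, and cd without arguments switches to it.
Step 3: on every server, do the following:
1) Change to the .ssh2 directory: cd ~/.ssh2
2) Create the authorization file: echo "Key id_dsa_2048_a.pub" >> authorization. Note the space after Key, and make sure the file content is as follows:
# cat authorization
Key id_dsa_2048_a.pub |
Once this is done, ssh login from the client to the server no longer asks for a password. If passwordless login is not set up correctly, startup fails with an error like the following: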
Starting namenodes on [172.25.40.171] 172.25.40.171: Host key not found from database. 172.25.40.171: Key fingerprint: 172.25.40.171: xofiz-zilip-tokar-rupyb-tufer-tahyc-sibah-kyvuf-palik-hazyt-duxux 172.25.40.171: You can get a public key's fingerprint by running 172.25.40.171: % ssh-keygen -F publickey.pub 172.25.40.171: on the keyfile. 172.25.40.171: warning: tcgetattr failed in ssh_rl_set_tty_modes_for_fd: fd 1: Invalid argument |
Or with an error like this:
Starting namenodes on [172.25.40.171] 172.25.40.171: hadoop's password: |
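It is advisable to include the machine's own IP in the names of the generated private and public key files; otherwise things get confusing.
Configure all passwordless logins as described in section 3.3 (scope of passwordless login). For more on passwordless login, see these blog posts:
1) http://blog.chinaunix.net/uid-20682147-id-4212099.html (passwordless login between two SSH2 hosts)
2) http://blog.chinaunix.net/uid-20682147-id-4212097.html (passwordless login from SSH2 to OpenSSH)
3) http://blog.chinaunix.net/uid-20682147-id-4212094.html (passwordless login from OpenSSH to SSH2)
4) http://blog.chinaunix.net/uid-20682147-id-5520240.html (passwordless login between two OpenSSH hosts)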
This part covers only the installation of HDFS, MapReduce, and Yarn; it does not cover HBase, Hive, and so on.
Hadoop binary packages can be downloaded from http://hadoop.apache.org/releases.html#Download (or directly from http://mirror.bit.edu.cn/apache/hadoop/common/). This article uses hadoop-2.8.0 (binary package: http://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-2.8.0/hadoop-2.8.0.tar.gz, source package: http://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-2.8.0/hadoop-2.8.0-src.tar.gz).
The official installation guide is Cluster Setup:
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/ClusterSetup.html.
1) Upload the Hadoop package hadoop-2.8.0.tar.gz to the /data/hadoop directory
2) Change to the /data/hadoop directory
3) In /data/hadoop, unpack the package: tar xzf hadoop-2.8.0.tar.gz
4) Create a symbolic link: ln -s /data/hadoop/hadoop-2.8.0 /data/hadoop/hadoop
5) Edit .profile in the user's home directory (or /etc/profile, or any equivalent file) and set the Hadoop environment variables:
export JAVA_HOME=/data/jdk
export HADOOP_HOME=/data/hadoop/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export PATH=$HADOOP_HOME/bin:$PATH |
Log in again for the settings to take effect, or run the export commands in the current terminal (e.g. export HADOOP_HOME=/data/hadoop/hadoop) for immediate effect.
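On every node, edit $HADOOP_HOME/etc/hadoop/hadoop-env.sh and add near the top of the file: export JAVA_HOME=/data/jdk
Note in particular that even though JAVA_HOME has already been added to /etc/profile, hadoop-env.sh must still be edited on every node; otherwise startup reports errors like the following: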
10.12.154.79: Error: JAVA_HOME is not set and could not be found. 10.12.154.77: Error: JAVA_HOME is not set and could not be found. 10.12.154.78: Error: JAVA_HOME is not set and could not be found. 10.12.154.78: Error: JAVA_HOME is not set and could not be found. 10.12.154.77: Error: JAVA_HOME is not set and could not be found. 10.12.154.79: Error: JAVA_HOME is not set and could not be found. |
Besides JAVA_HOME, also add:
export HADOOP_HOME=/data/hadoop/hadoop
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
It is also recommended to add the following to /etc/profile or ~/.profile:
export JAVA_HOME=/data/jdk
export HADOOP_HOME=/data/hadoop/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
To avoid unnecessary trouble, configure /etc/hosts on every node as follows:
10.148.137.143 hadoop-137-143 # NameNode
10.148.137.204 hadoop-137-204 # NameNode
10.148.138.11 hadoop-138-11 # DataNode
10.148.140.14 hadoop-140-14 # DataNode
10.148.140.15 hadoop-140-15 # DataNode |
Do not map one IP to several different hostnames; otherwise the HTTP pages may not work properly.
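Because the hostnames in this article follow a fixed pattern (hadoop-, then the third and fourth octets of the IP), the /etc/hosts lines can be generated rather than typed by hand. The sketch below assumes that naming pattern; `ip_to_hostname` is a hypothetical helper name, and the output is only printed so it can be reviewed before being appended to /etc/hosts.

```shell
#!/bin/sh
# Hypothetical helper: derive this article's hostname from an IPv4 address,
# e.g. 10.148.137.143 -> hadoop-137-143 (third and fourth octets).
ip_to_hostname() {
    echo "$1" | awk -F. '{ printf "hadoop-%s-%s\n", $3, $4 }'
}

# Print one /etc/hosts line per machine in the cluster.
for ip in 10.148.137.143 10.148.137.204 10.148.138.11 10.148.140.14 10.148.140.15; do
    printf '%s %s\n' "$ip" "$(ip_to_hostname "$ip")"
done
```

Generating the lines from a single IP list keeps every node's /etc/hosts identical, which also avoids accidentally giving one IP two names.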
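The hostname, e.g. VM-39-166-sles10-64, can be obtained with the hostname command. Because hostnames are configured, make sure you have connected once by ssh using each hostname before starting HDFS or anything else; otherwise startup fails with an error like the following: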
VM-39-166-sles10-64: Host key not found from database. VM-39-166-sles10-64: Key fingerprint: VM-39-166-sles10-64: xofiz-zilip-tokar-rupyb-tufer-tahyc-sibah-kyvuf-palik-hazyt-duxux VM-39-166-sles10-64: You can get a public key's fingerprint by running VM-39-166-sles10-64: % ssh-keygen -F publickey.pub VM-39-166-sles10-64: on the keyfile. VM-39-166-sles10-64: warning: tcgetattr failed in ssh_rl_set_tty_modes_for_fd: fd 1: Invalid argument |
The error above means that no ssh connection to VM-39-166-sles10-64 has yet been made by hostname. Fix it as follows:
ssh hadoop@VM-39-166-sles10-64 Host key not found from database. Key fingerprint: xofiz-zilip-tokar-rupyb-tufer-tahyc-sibah-kyvuf-palik-hazyt-duxux You can get a public key's fingerprint by running % ssh-keygen -F publickey.pub on the keyfile. Are you sure you want to continue connecting (yes/no)? yes Host key saved to /data/hadoop/.ssh2/hostkeys/key_36000_137vm_13739_137166_137sles10_13764.pub host key for VM-39-166-sles10-64, accepted by hadoop Thu Apr 17 2014 12:44:32 +0800 Authentication successful. Last login: Thu Apr 17 2014 09:24:54 +0800 from 10.32.73.69 Welcome to SuSE Linux 10 SP2 64Bit Nov 10,2010 by DIS Version v2.6.20101110 No mail. |
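The following scripts use the slaves file:
hadoop-daemons.sh
slaves.sh
start-dfs.sh
stop-dfs.sh
yarn-daemons.sh |
These scripts all depend on passwordless SSH. If you do not use them, you can ignore the slaves file.
The slaves file lists the HDFS DataNode nodes. When HDFS is started with start-dfs.sh, the script reads this file and logs in to each slave without a password to start its DataNode.
Edit $HADOOP_HOME/etc/hadoop/slaves on the active and standby NameNodes and add the IP (or the corresponding hostname) of each slave, one per line:
> cat slaves
10.148.138.11
10.148.140.14
10.148.140.15 |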
The configuration files are in $HADOOP_HOME/etc/hadoop. In Hadoop 2.3.0 and Hadoop 2.8.0, the core-site.xml, yarn-site.xml, hdfs-site.xml, and mapred-site.xml in that directory are empty. Starting without configuring them, for example by running start-dfs.sh, produces various errors.
You can copy the default files from under $HADOOP_HOME/share/doc/hadoop into $HADOOP_HOME/etc/hadoop and edit them from there (the commands below can be copied and run as-is; in 2.3.0 the paths of the default.xml files differ from those in 2.8.0):
# Change to $HADOOP_HOME
cd $HADOOP_HOME
cp ./share/doc/hadoop/hadoop-project-dist/hadoop-common/core-default.xml ./etc/hadoop/core-site.xml
cp ./share/doc/hadoop/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml ./etc/hadoop/hdfs-site.xml
cp ./share/doc/hadoop/hadoop-yarn/hadoop-yarn-common/yarn-default.xml ./etc/hadoop/yarn-site.xml
cp ./share/doc/hadoop/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml ./etc/hadoop/mapred-site.xml |
Next, the default core-site.xml, yarn-site.xml, hdfs-site.xml, and mapred-site.xml need appropriate changes; otherwise startup still fails.
The QJM configuration follows the official documentation:
http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html.
Edits to hdfs-site.xml involve the properties in the table below:
Property | Value | Description |
dfs.nameservices | mycluster | |
dfs.ha.namenodes.mycluster | nn1,nn2 | Only one or two NameNodes can be configured per nameservice, so nn3 is not allowed |
dfs.namenode.rpc-address.mycluster.nn1 | hadoop-137-143:8020 | |
dfs.namenode.rpc-address.mycluster.nn2 | hadoop-137-204:8020 | |
dfs.namenode.http-address.mycluster.nn1 | hadoop-137-143:50070 | |
dfs.namenode.http-address.mycluster.nn2 | hadoop-137-204:50070 | |
dfs.namenode.shared.edits.dir | qjournal://hadoop-137-143:8485;hadoop-137-204:8485;hadoop-138-11:8485/mycluster | Configure at least three Quorum Journal nodes |
dfs.client.failover.proxy.provider.mycluster | org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider | Clients use it to locate the active NameNode |
dfs.ha.fencing.methods | sshfence | Ensures that only one NameNode is active at any moment, preventing split-brain. With sshfence, when the active NameNode misbehaves, the fencer logs in to it over ssh and kills it with fuser, so fuser must be available on every NameNode. A user name and port may be given, in the form sshfence([[username][:port]]); the port must be specified when sshd does not listen on the default port 22. The value may also be a shell script, in the form shell(/path/to/my/script.sh arg1 arg2 ...), e.g. shell(/bin/true). |
dfs.ha.fencing.ssh.private-key-files | /data/hadoop/.ssh2/id_dsa_2048_a | The private key; with OpenSSH the value would be /data/hadoop/.ssh/id_rsa |
dfs.ha.fencing.ssh.connect-timeout | 30000 | Optional |
dfs.journalnode.edits.dir | /data/hadoop/hadoop/journal | Absolute path on each JournalNode machine where the JournalNode stores edits and its other local state. Do not add the "file://" prefix; with the prefix, startup reports "Journal dir should be an absolute path" |
dfs.datanode.data.dir | file:///data/hadoop/hadoop/data | Include the "file://" prefix. Do not configure every directory as SSD storage type, or writes fail with "Failed to place enough replicas" |
dfs.namenode.name.dir | | Include the "file://" prefix. Directory for NameNode metadata; the default is file://${hadoop.tmp.dir}/dfs/name, i.e. under the temporary directory, so consider moving it to the data directory |
dfs.namenode.checkpoint.dir | | Default is file://${hadoop.tmp.dir}/dfs/namesecondary; not needed if the SecondaryNameNode is not used |
dfs.ha.automatic-failover.enabled | true | Automatic active/standby failover |
dfs.datanode.max.xcievers | 4096 | Optional. Similar to the Linux limit on open files; the default is 256 and a larger value is recommended. Also make sure the system open-file limit is sufficient (check with ulimit). Too small a value can make HBase report "notservingregionexception". In Hadoop 2.x this name is a deprecated alias of dfs.datanode.max.transfer.threads. |
dfs.journalnode.rpc-address | 0.0.0.0:8485 | The JournalNode RPC address; the default is 0.0.0.0:8485 and usually need not be changed |
dfs.hosts | | Optional but recommended, to keep other DataNodes from connecting by accident. It configures the DataNode whitelist: only DataNodes on the list may connect to the NameNode. The value is the absolute path of a local file, e.g. /data/hadoop/etc/hadoop/hosts.include |
dfs.hosts.exclude | | Normally left empty; used when decommissioning DataNodes. The value is the absolute path of a local file that lists, one per line, the hostname or IP of each DataNode to decommission, e.g. /data/hadoop/etc/hadoop/hosts.exclude |
dfs.namenode.num.checkpoints.retained | 2 | Default 2; the number of fsImage files the NameNode keeps |
dfs.namenode.num.extra.edits.retained | 1000000 | Number of extra edit-log transactions to retain |
dfs.namenode.max.extra.edits.segments.retained | 10000 | |
dfs.datanode.scan.period.hours | | Default 504 hours |
dfs.blockreport.intervalMsec | | Interval at which a DataNode sends block reports to the NameNode; default 21600000 ms |
dfs.datanode.directoryscan.interval | | Interval at which a DataNode compares the blocks in memory with those on disk and reconciles any differences; default 21600 s |
dfs.heartbeat.interval | 3 | Interval between heartbeats to the NameNode, in seconds |
For the full list of settings see:
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml。
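To make the shape of the resulting file concrete, here is a minimal sketch that prints the core HA properties as a Hadoop `<configuration>` fragment. The helper names `emit_property` and `emit_hdfs_ha_fragment` are hypothetical; the property names and values are the ones used in this article. The output is an illustration to merge into hdfs-site.xml by hand, not a complete configuration.

```shell
#!/bin/sh
# Hypothetical helper: print one Hadoop <property> element.
emit_property() {
    printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
}

# Print a minimal HA fragment for hdfs-site.xml using this article's values.
emit_hdfs_ha_fragment() {
    echo '<configuration>'
    emit_property dfs.nameservices mycluster
    emit_property dfs.ha.namenodes.mycluster nn1,nn2
    emit_property dfs.namenode.rpc-address.mycluster.nn1 hadoop-137-143:8020
    emit_property dfs.namenode.rpc-address.mycluster.nn2 hadoop-137-204:8020
    emit_property dfs.namenode.shared.edits.dir \
        'qjournal://hadoop-137-143:8485;hadoop-137-204:8485;hadoop-138-11:8485/mycluster'
    emit_property dfs.client.failover.proxy.provider.mycluster \
        org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
    emit_property dfs.ha.automatic-failover.enabled true
    echo '</configuration>'
}

emit_hdfs_ha_fragment
```

The qjournal value is single-quoted because it contains semicolons, which the shell would otherwise treat as command separators.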
Edits to core-site.xml involve the properties in the table below:
Property | Value | Description |
fs.defaultFS | hdfs://mycluster | |
fs.default.name | hdfs://mycluster | In principle this should be unnecessary because fs.defaultFS supersedes it, but without it startup reported: fs.defaultFS is file:/// |
hadoop.tmp.dir | /data/hadoop/hadoop/tmp | |
ha.zookeeper.quorum | hadoop-137-143:2181,hadoop-138-11:2181,hadoop-140-14:2181 | |
ha.zookeeper.parent-znode | /mycluster/hadoop-ha | |
io.seqfile.local.dir | | Default is ${hadoop.tmp.dir}/io/local |
fs.s3.buffer.dir | | Default is ${hadoop.tmp.dir}/s3 |
fs.s3a.buffer.dir | | Default is ${hadoop.tmp.dir}/s3a |
Before starting, create the configured directories, for example /data/hadoop/hadoop/tmp. For details see:
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/core-default.xm。
If dfs.namenode.rpc-address is not configured, startup reports the following error:
Incorrect configuration: namenode address dfs.namenode.servicerpc-address or dfs.namenode.rpc-address is not configured. |
Both the host and the port must be specified. If only the IP is given, e.g. <value>10.148.137.143</value>, startup prints:
Starting namenodes on [] |
After changing it to "<value>hadoop-137-143:8020</value>", startup prints:
Starting namenodes on [10.148.137.143] |
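Since a missing port is easy to overlook, a value can be checked for the host:port shape before it goes into the configuration. This is a minimal sketch; `is_host_port` is a hypothetical helper, and it checks only the syntax (non-empty host, numeric port between 1 and 65535), not whether the host resolves.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the argument looks like host:port.
is_host_port() {
    case "$1" in
        *:*) ;;
        *) return 1 ;;
    esac
    host=${1%:*}
    port=${1##*:}
    [ -n "$host" ] || return 1
    case "$port" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$port" -ge 1 ] && [ "$port" -le 65535 ]
}

# Check the two values discussed above.
for v in hadoop-137-143:8020 10.148.137.143; do
    if is_host_port "$v"; then
        echo "$v: ok"
    else
        echo "$v: missing or invalid port"
    fi
done
```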
Edits to mapred-site.xml involve the properties in the table below:
Property | Value | Scope |
mapreduce.framework.name | yarn | All MapReduce nodes |
For the full list of settings see:
http://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml。
Edits to yarn-site.xml involve the properties in the table below:
Property | Value | Scope and notes |
yarn.resourcemanager.hostname | 0.0.0.0 | ResourceManager, NodeManager. Not required in HA mode, but other properties may reference it, so keeping the value 0.0.0.0 is recommended; if nothing references it, it can be omitted. |
yarn.nodemanager.hostname | 0.0.0.0 | |
yarn.nodemanager.aux-services | mapreduce_shuffle | |
HA-related settings, including automatic failover (these may be configured on the ResourceManager nodes only) |
yarn.resourcemanager.ha.enabled | true | Enables HA |
yarn.resourcemanager.cluster-id | yarn-cluster | May differ from the HDFS nameservice name |
yarn.resourcemanager.ha.rm-ids | rm1,rm2 | NodeManagers must use the same setting as the ResourceManagers |
yarn.resourcemanager.hostname.rm1 | hadoop-137-143 | |
yarn.resourcemanager.hostname.rm2 | hadoop-137-204 | |
yarn.resourcemanager.webapp.address.rm1 | hadoop-137-143:8088 | |
yarn.resourcemanager.webapp.address.rm2 | hadoop-137-204:8088 | |
yarn.resourcemanager.zk-address | hadoop-137-143:2181,hadoop-137-204:2181,hadoop-138-11:2181 | |
yarn.resourcemanager.ha.automatic-failover.enabled | true | Optional, because it defaults to true when yarn.resourcemanager.ha.enabled is true |
NodeManager settings |
yarn.nodemanager.vmem-pmem-ratio | | Maximum virtual memory allowed per 1 MB of physical memory used; default 2.1. If spark-sql fails with "Yarn application has already exited with state FINISHED", check the NodeManager log to see whether this value is too small |
yarn.nodemanager.resource.cpu-vcores | | Total number of virtual CPUs available to the NodeManager; default 8 |
yarn.nodemanager.resource.memory-mb | | Total physical memory YARN may use on this node; default 8192 (MB) |
yarn.nodemanager.pmem-check-enabled | | Whether to run a thread that checks the physical memory used by each task and kills any task exceeding its allocation; default true |
yarn.nodemanager.vmem-check-enabled | | Whether to run a thread that checks the virtual memory used by each task and kills any task exceeding its allocation; default true |
ResourceManager settings |
yarn.scheduler.minimum-allocation-mb | | Minimum memory a single container can request |
yarn.scheduler.maximum-allocation-mb | | Maximum memory a single container can request |
If yarn.nodemanager.hostname is set to a specific IP such as 10.12.154.79, every NodeManager needs a different configuration file. For the full list of settings see:
http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-common/yarn-default.xml。
For the Yarn HA configuration see:
https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/ResourceManagerHA.html。
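ZooKeeper -> JournalNode -> format NameNode -> initialize JournalNode -> create the failover namespace (zkfc) -> NameNode -> failover controller processes -> DataNode -> ResourceManager -> NodeManager |
Note that the NameNode must be formatted before its first start, and that the standby NameNode has its own startup procedure. The failover controller processes can be started at any point after "create the failover namespace (zkfc)".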
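The ordering can be captured in a small script. The sketch below is a dry run by default: the hypothetical `run` wrapper only prints each command unless DRY_RUN=0 is set. It assumes ZooKeeper and all JournalNodes are already running, that HADOOP_HOME is /data/hadoop/hadoop as in this article, and that the two functions are run on the active and standby NameNode respectively; the function names are illustrative only.

```shell
#!/bin/sh
HADOOP_HOME=${HADOOP_HOME:-/data/hadoop/hadoop}

# Print the command in dry-run mode (the default); execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# First start of a new cluster, on the NameNode that will become active.
first_namenode_steps() {
    run "$HADOOP_HOME/bin/hdfs" namenode -format
    run "$HADOOP_HOME/bin/hdfs" zkfc -formatZK
    run "$HADOOP_HOME/sbin/hadoop-daemon.sh" start namenode
    run "$HADOOP_HOME/sbin/hadoop-daemon.sh" start zkfc
}

# On the standby NameNode, after the first NameNode is up.
standby_namenode_steps() {
    run "$HADOOP_HOME/bin/hdfs" namenode -bootstrapStandby
    run "$HADOOP_HOME/sbin/hadoop-daemon.sh" start namenode
    run "$HADOOP_HOME/sbin/hadoop-daemon.sh" start zkfc
}

first_namenode_steps
standby_namenode_steps
```

Keeping the dry run as the default makes it safe to review the sequence on any machine before executing it for real.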
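Before HDFS can be started, the NameNode must be formatted.
./zkServer.sh start
Start ZooKeeper before starting anything else.
This step has no ordering dependency on formatting the NameNode or initializing the JournalNodes. Run on one of the NameNodes: ./hdfs zkfc -formatZK
On success, the path given by ha.zookeeper.parent-znode in core-site.xml is created in ZooKeeper. If the value of dfs.ha.namenodes.mycluster in hdfs-site.xml is changed later, formatZK must be run again; otherwise automatic NameNode failover stops working, and the zkfc log shows a message like the following (assuming nm1 was renamed to nn1):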
Unable to determine service address for namenode 'nm1' |
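Note that if dfs.ha.namenodes.mycluster is changed, upper-layer services that depend on HDFS, such as HBase, must also be restarted.
The NameNode writes its metadata edit log to the JournalNodes, and the active and standby NameNodes synchronize metadata through the log stored there.
Run on every JournalNode (note that the command takes two arguments, and that this step comes before "hdfs namenode -format"):
./hadoop-daemon.sh start journalnode
The JournalNodes must be running before "hdfs namenode -format" is executed, and the format must in turn happen before the NameNode is started.
Formatting is needed only for a new cluster, and only on the active NameNode.
1) Change to the $HADOOP_HOME/bin directory
2) Format: ./hdfs namenode -format
If the output contains "INFO util.ExitUtil: Exiting with status 0", formatting succeeded.
If the hostname-to-IP mapping, e.g. "172.25.40.171 VM-40-171-sles10-64", has not been added to /etc/hosts, formatting reports the following error: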
14/04/17 03:44:09 WARN net.DNS: Unable to determine local hostname -falling back to "localhost" java.net.UnknownHostException: VM-40-171-sles10-64: VM-40-171-sles10-64: unknown error at java.net.InetAddress.getLocalHost(InetAddress.java:1484) at org.apache.hadoop.net.DNS.resolveLocalHostname(DNS.java:264) at org.apache.hadoop.net.DNS.<clinit>(DNS.java:57) at org.apache.hadoop.hdfs.server.namenode.NNStorage.newBlockPoolID(NNStorage.java:945) at org.apache.hadoop.hdfs.server.namenode.NNStorage.newNamespaceInfo(NNStorage.java:573) at org.apache.hadoop.hdfs.server.namenode.FSImage.format(FSImage.java:144) at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:845) at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1256) at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1370) Caused by: java.net.UnknownHostException: VM-40-171-sles10-64: unknown error at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method) at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:907) at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1302) at java.net.InetAddress.getLocalHost(InetAddress.java:1479) ... 8 more |
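This step must be done after formatting the NameNode!
It is needed only when converting a non-HA cluster to HA. Run on one of the NameNodes:
./hdfs namenode -initializeSharedEdits
The command is interactive by default; add the -force option to make it non-interactive.
Create the following directory on every JournalNode:
mkdir -p /data/hadoop/hadoop/journal/mycluster/current
If this step is run before the NameNode is formatted, it fails with "NameNode is not formatted".
1) Change to the $HADOOP_HOME/sbin directory
2) Start the active NameNode:
./hadoop-daemon.sh start namenode
If the error shown below appears at startup, the NameNode cannot log in to itself without a password. If passwordless login to itself by IP already worked, the usual cause is that it has never logged in to itself by hostname. The fix is to ssh once using the hostname, e.g. ssh hadoop@VM_40_171_sles10_64, and then start again.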
Starting namenodes on [VM_40_171_sles10_64] VM_40_171_sles10_64: Host key not found from database. VM_40_171_sles10_64: Key fingerprint: VM_40_171_sles10_64: xofiz-zilip-tokar-rupyb-tufer-tahyc-sibah-kyvuf-palik-hazyt-duxux VM_40_171_sles10_64: You can get a public key's fingerprint by running VM_40_171_sles10_64: % ssh-keygen -F publickey.pub VM_40_171_sles10_64: on the keyfile. VM_40_171_sles10_64: warning: tcgetattr failed in ssh_rl_set_tty_modes_for_fd: fd 1: Invalid argument |
1) ./hdfs namenode -bootstrapStandby
2) ./hadoop-daemon.sh start namenode
If step 1 is skipped and the NameNode is started directly, the following error occurs:
No valid image files found
Or the NameNode log shows the following error:
2016-04-08 14:08:39,745 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception loading fsimage
java.io.IOException: NameNode is not formatted.
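Start the failover controller process on every NameNode:
./hadoop-daemon.sh start zkfc
HDFS can switch between active and standby automatically only when the DFSZKFailoverController process is running.
Note: zkfc stands for ZooKeeper failover controller.
Run on each DataNode:
./hadoop-daemon.sh start datanode
If the DataNode process fails to come up, try deleting the DataNode logs in the logs directory and starting it again.
1) Use the jps command shipped with the JDK to check whether the expected processes are running
2) Check the log and out files under $HADOOP_HOME/logs for exceptions.
If nn1 and nn2 are both in standby state after startup, switch nn1 to active:
./hdfs haadmin -transitionToActive nn1
Run jps (note: jps is part of the JDK, not the JRE) to see the DataNode process:
$ jps
18669 DataNode
24542 Jps |
Run jps to see the NameNode process:
$ jps
18669 NameNode
24542 Jps |
Run some HDFS commands to further verify that installation and configuration succeeded. For usage, run hdfs or hdfs dfs without arguments to print the help text.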
hdfs dfsadmin -report
Note that if fs.default.name in core-site.xml has the value file:///, the command reports:
report: FileSystem file:/// is not an HDFS file system
Usage: hdfs dfsadmin [-report] [-live] [-dead] [-decommissioning]
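To fix this, set fs.default.name to the same value as fs.defaultFS.
For example, to see whether NameNode1 and NameNode2 are active or standby:
$ hdfs haadmin -getServiceState nn1
standby
$ hdfs haadmin -getServiceState nn2
active |
"hdfs dfs -ls" takes one argument. If the argument starts with "hdfs://URI", it accesses that HDFS; otherwise the path is resolved against the default file system, which behaves like a local ls when fs.defaultFS is file:///. URI is the IP or hostname of the NameNode and may include a port number, i.e. the value of "dfs.namenode.rpc-address" in hdfs-site.xml.
"hdfs dfs -ls" assumes the default port 8020. If the NameNode is configured with another port such as 9000, the port must be given explicitly; otherwise it can be omitted, much like a browser opening a URL. Example:
> hdfs dfs -ls hdfs://172.25.40.171:9001/ |
The slash after 9001 is required; without it the argument is treated as a file. If the port 9001 is omitted, the default 8020 is used. "172.25.40.171:9001" is the value of "dfs.namenode.rpc-address" in hdfs-site.xml.
As this shows, "hdfs dfs -ls" can operate on different HDFS clusters simply by being given different URIs.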
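The URI rules above (explicit port unless it is the default 8020, and a path that always starts with a slash) can be sketched as a helper that builds the argument for "hdfs dfs". The function name `hdfs_uri` is a hypothetical name for illustration; it only builds a string and does not contact any NameNode.

```shell
#!/bin/sh
# Hypothetical helper: build an hdfs:// URI from host, port, and path.
# The port is omitted when it is empty or equal to the default 8020.
hdfs_uri() {
    host=$1
    port=$2
    path=${3:-/}
    case "$path" in
        /*) ;;
        *) path="/$path" ;;
    esac
    if [ -z "$port" ] || [ "$port" = "8020" ]; then
        printf 'hdfs://%s%s\n' "$host" "$path"
    else
        printf 'hdfs://%s:%s%s\n' "$host" "$port" "$path"
    fi
}

hdfs_uri 172.25.40.171 9001 /
hdfs_uri hadoop-137-143 8020 /tmp
```

A typical use would be `hdfs dfs -ls "$(hdfs_uri 172.25.40.171 9001 /)"`.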
After upload, a file is stored under the DataNode's data directory (set by the property "dfs.datanode.data.dir" in the DataNode's hdfs-site.xml), for example:
$HADOOP_HOME/data/current/BP-139798373-172.25.40.171-1397735615751/current/finalized/blk_1073741825
The "blk" in the file name stands for block. By default blk_1073741825 is one complete block of the file; Hadoop applies no further processing to it.
Upload command, example:
> hdfs dfs -put /etc/SuSE-release hdfs://172.25.40.171:9001/ |
Delete command, example:
> hdfs dfs -rm hdfs://172.25.40.171:9001/SuSE-release
Deleted hdfs://172.25.40.171:9001/SuSE-release |
hdfs haadmin -failover --forcefence --forceactive nn1 nn2 # make nn2 the active NameNode |
Note: hadoop-3.0 will support multiple standby NameNodes, similar to HBase.
If you try to configure three NameNodes, for example:
<property>
  <name>dfs.ha.namenodes.test</name>
  <value>nn1,nn2,nn3</value>
  <description>
    The prefix for a given nameservice, contains a comma-separated list of namenodes for a given nameservice (eg EXAMPLENAMESERVICE).
  </description>
</property> |
then running "hdfs namenode -bootstrapStandby" reports the following error, meaning that a single nameservice cannot have more than two NameNodes:
16/04/11 09:51:57 ERROR namenode.NameNode: Failed to start namenode. java.io.IOException: java.lang.IllegalArgumentException: Expected exactly 2 NameNodes in namespace 'test'. Instead, got only 3 (NN ids were 'nn1','nn2','nn3' at org.apache.hadoop.hdfs.server.namenode.ha.BootstrapStandby.run(BootstrapStandby.java:425) at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1454) at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1554) Caused by: java.lang.IllegalArgumentException: Expected exactly 2 NameNodes in namespace 'test'. Instead, got only 3 (NN ids were 'nn1','nn2','nn3' at com.google.common.base.Preconditions.checkArgument(Preconditions.java:115) |
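Example: start-balancer.sh -threshold 10
The threshold of 10 means the cluster is considered balanced when every DataNode's disk usage is within 10 percentage points of the cluster-wide average; otherwise blocks are moved to rebalance. start-balancer.sh runs "hdfs balancer" as a daemon, and stop-balancer.sh stops it.
Balancing is very slow, but HDFS remains fully usable while it runs, including uploading files.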
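The threshold rule can be illustrated with a small calculation. The sketch below classifies one DataNode given its disk usage, the cluster average, and the threshold, all as whole percentages. The function name `classify_node` is a hypothetical name, and the labels mirror the "over-utilized" and "underutilized" wording in the balancer log; the real balancer works with fractional values and has more categories.

```shell
#!/bin/sh
# Hypothetical helper: classify a node by usage (%), cluster average (%), threshold (%).
classify_node() {
    node=$1
    avg=$2
    threshold=${3:-10}
    if [ "$node" -gt $((avg + threshold)) ]; then
        echo "over-utilized"
    elif [ "$node" -lt $((avg - threshold)) ]; then
        echo "under-utilized"
    else
        echo "within threshold"
    fi
}

# With a cluster average of 60% and a threshold of 10:
classify_node 75 60 10
classify_node 40 60 10
classify_node 55 60 10
```

Blocks move from over-utilized nodes to under-utilized ones until every node falls within the threshold.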
[VM2016@hadoop-030 /data4/hadoop/sbin]$ hdfs balancer    # start-balancer.sh can be called instead
16/04/08 14:26:55 INFO balancer.Balancer: namenodes = [hdfs://test]    # "test" is the HDFS cluster name
16/04/08 14:26:55 INFO balancer.Balancer: parameters = Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
Time Stamp               Iteration#  Bytes Already Moved  Bytes Left To Move  Bytes Being Moved
16/04/08 14:26:56 INFO net.NetworkTopology: Adding a new node: /default-rack/192.168.1.231:50010
16/04/08 14:26:56 INFO net.NetworkTopology: Adding a new node: /default-rack/192.168.1.229:50010
16/04/08 14:26:56 INFO net.NetworkTopology: Adding a new node: /default-rack/192.168.1.213:50010
16/04/08 14:26:56 INFO net.NetworkTopology: Adding a new node: /default-rack/192.168.1.208:50010
16/04/08 14:26:56 INFO net.NetworkTopology: Adding a new node: /default-rack/192.168.1.232:50010
16/04/08 14:26:56 INFO net.NetworkTopology: Adding a new node: /default-rack/192.168.1.207:50010
16/04/08 14:26:56 INFO balancer.Balancer: 5 over-utilized: [192.168.1.231:50010:DISK, 192.168.1.229:50010:DISK, 192.168.1.213:50010:DISK, 192.168.1.208:50010:DISK, 192.168.1.232:50010:DISK]
16/04/08 14:26:56 INFO balancer.Balancer: 1 underutilized: [192.168.1.207:50010:DISK]    # data will be moved to this node
16/04/08 14:26:56 INFO balancer.Balancer: Need to move 816.01 GB to make the cluster balanced.    # 816.01 GB must be moved to reach balance
16/04/08 14:26:56 INFO balancer.Balancer: Decided to move 10 GB bytes from 192.168.1.231:50010:DISK to 192.168.1.207:50010:DISK    # move 10 GB from 192.168.1.231 to 192.168.1.207
16/04/08 14:26:56 INFO balancer.Balancer: Will move 10 GB in this iteration
16/04/08 14:32:58 INFO balancer.Dispatcher: Successfully moved blk_1073749366_8542 with size=77829046 from 192.168.1.231:50010:DISK to 192.168.1.207:50010:DISK through 192.168.1.213:50010
16/04/08 14:32:59 INFO balancer.Dispatcher: Successfully moved blk_1073749386_8562 with size=77829046 from 192.168.1.231:50010:DISK to 192.168.1.207:50010:DISK through 192.168.1.231:50010
16/04/08 14:33:34 INFO balancer.Dispatcher: Successfully moved blk_1073749378_8554 with size=77829046 from 192.168.1.231:50010:DISK to 192.168.1.207:50010:DISK through 192.168.1.231:50010
16/04/08 14:34:38 INFO balancer.Dispatcher: Successfully moved blk_1073749371_8547 with size=134217728 from 192.168.1.231:50010:DISK to 192.168.1.207:50010:DISK through 192.168.1.213:50010
16/04/08 14:34:54 INFO balancer.Dispatcher: Successfully moved blk_1073749395_8571 with size=134217728 from 192.168.1.231:50010:DISK to 192.168.1.207:50010:DISK through 192.168.1.231:50010
Apr 8, 2016 2:35:01 PM   0           478.67 MB            816.01 GB           10 GB
16/04/08 14:35:10 INFO net.NetworkTopology: Adding a new node: /default-rack/192.168.1.213:50010
16/04/08 14:35:10 INFO net.NetworkTopology: Adding a new node: /default-rack/192.168.1.229:50010
16/04/08 14:35:10 INFO net.NetworkTopology: Adding a new node: /default-rack/192.168.1.232:50010
16/04/08 14:35:10 INFO net.NetworkTopology: Adding a new node: /default-rack/192.168.1.231:50010
16/04/08 14:35:10 INFO net.NetworkTopology: Adding a new node: /default-rack/192.168.1.208:50010
16/04/08 14:35:10 INFO net.NetworkTopology: Adding a new node: /default-rack/192.168.1.207:50010
16/04/08 14:35:10 INFO balancer.Balancer: 5 over-utilized: [192.168.1.213:50010:DISK, 192.168.1.229:50010:DISK, 192.168.1.232:50010:DISK, 192.168.1.231:50010:DISK, 192.168.1.208:50010:DISK]
16/04/08 14:35:10 INFO balancer.Balancer: 1 underutilized: [192.168.1.207:50010:DISK]
16/04/08 14:35:10 INFO balancer.Balancer: Need to move 815.45 GB to make the cluster balanced.
16/04/08 14:35:10 INFO balancer.Balancer: Decided to move 10 GB bytes from 192.168.1.213:50010:DISK to 192.168.1.207:50010:DISK
16/04/08 14:35:10 INFO balancer.Balancer: Will move 10 GB in this iteration
16/04/08 14:41:18 INFO balancer.Dispatcher: Successfully moved blk_1073760371_19547 with size=77829046 from 192.168.1.213:50010:DISK to 192.168.1.207:50010:DISK through 192.168.1.213:50010
16/04/08 14:41:19 INFO balancer.Dispatcher: Successfully moved blk_1073760385_19561 with size=77829046 from 192.168.1.213:50010:DISK to 192.168.1.207:50010:DISK through 192.168.1.213:50010
16/04/08 14:41:22 INFO balancer.Dispatcher: Successfully moved blk_1073760393_19569 with size=77829046 from 192.168.1.213:50010:DISK to 192.168.1.207:50010:DISK through 192.168.1.213:50010
16/04/08 14:41:23 INFO balancer.Dispatcher: Successfully moved blk_1073760363_19539 with size=77829046 from 192.168.1.213:50010:DISK to 192.168.1.207:50010:DISK through 192.168.1.213:50010
hdfs fsck hdfs:///tmp/slaves -files -locations -blocks
hdfs dfsadmin -safemode leave
hdfs fsck -delete
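When expanding, pack the current directory of an existing JournalNode and unpack it on the new machine at the same location, as specified by "dfs.journalnode.edits.dir".
To make sure that adding or removing JournalNodes succeeds, first stop all NameNodes and JournalNodes, then modify the configuration. Start the NameNodes only after the JournalNodes have started successfully (their log stops at "IPC Server listener on 8485: starting"); otherwise an error like the following may occur:
org.apache.hadoop.hdfs.qjournal.protocol.JournalOutOfSyncException: Can't write, no segment open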
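Pick an existing JournalNode and modify its hdfs-site.xml to include the new JournalNodes. For example, starting from
<value>qjournal://hadoop-030:8485;hadoop-031:8485;hadoop-032:8485/test</value>
add the two JournalNodes hadoop-033 and hadoop-034:
<value>qjournal://hadoop-030:8485;hadoop-031:8485;hadoop-032:8485;hadoop-033:8485;hadoop-034:8485/test</value>
Then copy both the installation directory and the data directory (the directory specified by dfs.journalnode.edits.dir in hdfs-site.xml) to the new nodes.
If the JournalNode data directory is not copied, the JournalNode on the new node may report the error "Journal Storage Directory /data/journal/test not formatted"; future versions may implement automatic synchronization. By contrast, expanding ZooKeeper does not require copying the data and datalog of existing nodes, and it must not be done that way.
Next, start the JournalNode on each new node (no initialization is needed) and restart the NameNodes. Watch the JournalNode log to check whether startup succeeded; an INFO-level entry like the following indicates a successful start: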
2016-04-26 10:31:11,160 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /data/journal/test/current/edits_inprogress_0000000000000194269 -> /data/journal/test/current/edits_0000000000000194269-0000000000000194270
However, the JournalNode is working properly only once log entries like the following appear:
2017-05-18 15:22:42,901 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 8485: starting
2017-05-18 15:23:27,028 INFO org.apache.hadoop.hdfs.qjournal.server.JournalNode: Initializing journal in directory /data/journal/data/test
2017-05-18 15:23:27,042 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /data/journal/data/test/in_use.lock acquired by nodename 15259@hadoop-40
2017-05-18 15:23:27,057 INFO org.apache.hadoop.hdfs.qjournal.server.Journal: Scanning storage FileJournalManager(root=/data/journal/data/test)
2017-05-18 15:23:27,152 INFO org.apache.hadoop.hdfs.qjournal.server.Journal: Latest log is EditLogFile(file=/data/journal/data/test/current/edits_inprogress_0000000000027248811,first=0000000000027248811,last=0000000000027248811,inProgress=true,hasCorruptHeader=false)
Remember that after replacing a NameNode, "hdfs zkfc -formatZK" must be run again; otherwise automatic failover will not work.
When a NameNode machine is damaged, a new NameNode has to replace it. Change the configuration to point to the new NameNode, then start the new NameNode as a standby so that it joins the cluster:
1) ./hdfs namenode -bootstrapStandby
2) ./hadoop-daemon.sh start namenode
Remember to start the failover process DFSZKFailoverController as well; otherwise automatic failover will not work!
During bootstrapStandby the new NameNode pulls the fsImage from the active NameNode (hadoop-091:50070 is the active NameNode):
17/04/24 14:25:32 INFO namenode.TransferFsImage: Opening connection to http://hadoop-091:50070/imagetransfer?getimage=1&txid=2768127&storageInfo=-63:2009831148:1492719902489:CID-5b2992bb-4dcb-4211-8070-6934f4d232a8&bootstrapstandby=true
17/04/24 14:25:32 INFO namenode.TransferFsImage: Image Transfer timeout configured to 60000 milliseconds
17/04/24 14:25:32 INFO namenode.TransferFsImage: Transfer took 0.01s at 28461.54 KB/s
17/04/24 14:25:32 INFO namenode.TransferFsImage: Downloaded file fsimage.ckpt_0000000000002768127 size 379293 bytes.
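If not enough DataNodes are connected to the NameNode, the NameNode also enters safe mode. The message below shows that 0 DataNodes are connected to the NameNode.
A possible cause is that the value of dfs.ha.namenodes.mycluster was changed (for example, nm1 was renamed to nn1), so the DataNodes no longer recognize the NameNodes. In that case formatZK must also be run again; otherwise automatic failover stops working.
If the configuration on the DataNodes was updated accordingly but they were not restarted afterwards, restart the DataNodes:
Safe mode is ON. The reported blocks 0 needs additional 12891 blocks to reach the threshold 0.9990 of total blocks 12904. The number of live datanodes 0 has reached the minimum number 0. Safe mode will be turned off automatically once the thresholds have been reached.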
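The numbers in the safe mode message above can be reproduced with a little arithmetic. The sketch below assumes that the required block count is the total block count multiplied by the threshold and truncated to an integer; that assumption matches the message shown here (12904 × 0.9990 truncates to 12891), but it is an illustration rather than the NameNode's actual code. The function name safemode_blocks_needed is hypothetical.

```shell
# safemode_blocks_needed TOTAL_BLOCKS REPORTED_BLOCKS [THRESHOLD]
# Prints how many more blocks must be reported before safe mode can end.
# Assumption: required = int(TOTAL_BLOCKS * THRESHOLD); default threshold 0.999.
safemode_blocks_needed() {
    awk -v total="$1" -v reported="$2" -v pct="${3:-0.999}" 'BEGIN {
        need = int(total * pct) - reported
        if (need < 0) need = 0      # threshold already reached
        print need
    }'
}

safemode_blocks_needed 12904 0 0.9990
```

With 0 live DataNodes no blocks are reported at all, so the full 12891 blocks are still missing, which is why the NameNode stays in safe mode until the DataNodes reconnect.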
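Compatible versions can be mixed when expanding, for example adding Hadoop-2.8.0 nodes to a Hadoop-2.7.2 cluster. To expand, install and configure the DataNode on the new machine, start it successfully, and then run bin/hdfs dfsadmin -refreshNodes on the active NameNode. That completes the expansion.
To balance data onto the newly added machines, run sbin/start-balancer.sh. It accepts the parameter -threshold, whose default value is 10 and whose range is 0 to 100, for example: sbin/start-balancer.sh -threshold 5.
The balancer command can be run on a NameNode or a DataNode, but preferably on a newly added or idle machine.
The -threshold value is a percentage that bounds the difference between each node's storage utilization and the storage utilization of the whole cluster. The cluster is considered balanced when no node's utilization differs from the cluster utilization by more than the threshold; otherwise blocks are moved from over-utilized nodes to under-utilized ones.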
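The effect of -threshold can be sketched as a simple comparison. The function below is a simplified illustration, not the balancer's implementation: it assumes a node counts as over-utilized when its utilization exceeds the cluster utilization by more than the threshold, and as under-utilized when it falls short by more than the threshold. The name classify_node is hypothetical, and all values are percentages.

```shell
# classify_node NODE_USED_PCT CLUSTER_USED_PCT [THRESHOLD_PCT]
# Prints over-utilized, under-utilized, or balanced (default threshold: 10).
classify_node() {
    awk -v n="$1" -v c="$2" -v t="${3:-10}" 'BEGIN {
        if (n > c + t)      print "over-utilized"
        else if (n < c - t) print "under-utilized"
        else                print "balanced"
    }'
}

classify_node 5 50 10     # a nearly empty new node in a half-full cluster
classify_node 55 50 10    # within 10 points of the cluster average
```

A smaller threshold such as 5 makes the "balanced" band narrower, so the balancer has more data to move before it stops.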
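Constraint: this operation must be performed on the active NameNode! If the standby NameNode is also running, it is recommended either to make the same change in the standby's hdfs-site.xml, in case a failover happens during decommissioning, or simply to stop the standby NameNode.
After decommissioning is complete, remember to revert hdfs-site.xml (that is, reset dfs.hosts.exclude to an empty value) and run "/data/hadoop/bin/hdfs dfsadmin -refreshNodes" once more; otherwise the decommissioned DataNode stays in the Decommissioned state.
After decommissioning, as long as dfs.hosts is configured, a decommissioned DataNode cannot connect again even if its process has not been stopped. This is the recommended approach, since it also prevents outside DataNodes from connecting by accident.
Before resetting dfs.hosts.exclude to an empty value, however, stop the processes of all decommissioned DataNodes, and preferably also set dfs.hosts in hdfs-site.xml to restrict which DataNodes may connect to the NameNode; otherwise a decommissioned DataNode may accidentally reconnect. If the slaves file is used, update it as well.
Modify hdfs-site.xml on the active NameNode and set dfs.hosts.exclude to the full path of a file, for example /home/hadoop/etc/hadoop/hosts.exclude. The file lists the host names or IPs of the DataNodes to be decommissioned (removed), one per line. Do not remove these DataNodes from slaves yet.
After modifying hdfs-site.xml, run bin/hdfs dfsadmin -refreshNodes on the active NameNode to refresh the DataNode list. Once decommissioning is complete, a balance can be run just as after an expansion.
Use bin/hdfs dfsadmin -report or the web UI to watch the decommissioning state of the DataNodes. When it is complete, remove the decommissioned DataNodes from slaves.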
State before decommissioning:
$ hdfs dfsadmin -report
Name: 192.168.31.33:50010 (hadoop-33)
Hostname: hadoop-33
Decommission Status : Normal
Configured Capacity: 3247462653952 (2.95 TB)
DFS Used: 297339283 (283.56 MB)
Non DFS Used: 165960652397 (154.56 GB)
DFS Remaining: 3081204662272 (2.80 TB)
DFS Used%: 0.01%
DFS Remaining%: 94.88%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed Apr 19 18:03:33 CST 2017
State while decommissioning is in progress:
$ hdfs dfsadmin -report
Name: 192.168.31.33:50010 (hadoop-33)
Hostname: hadoop-33
Decommission Status : Decommission in progress
Configured Capacity: 3247462653952 (2.95 TB)
DFS Used: 297339283 (283.56 MB)
Non DFS Used: 165960652397 (154.56 GB)
DFS Remaining: 3081204662272 (2.80 TB)
DFS Used%: 0.01%
DFS Remaining%: 94.88%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 16
Last contact: Thu Apr 20 09:00:48 CST 2017
State after decommissioning is complete:
$ hdfs dfsadmin -report
Name: 192.168.31.33:50010 (hadoop-33)
Hostname: hadoop-33
Decommission Status : Decommissioned
Configured Capacity: 1935079350272 (1.76 TB)
DFS Used: 257292167968 (239.62 GB)
Non DFS Used: 99063741175 (92.26 GB)
DFS Remaining: 1578723441129 (1.44 TB)
DFS Used%: 13.30%
DFS Remaining%: 81.58%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 13
Last contact: Thu Apr 20 09:29:00 CST 2017
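If a node stays in the "Decommission In Progress" state for a long time without changing to Decommissioned, check the file system with "hdfs fsck".
After a node has been decommissioned successfully, remove it from slaves and from the file referenced by dfs.hosts.exclude, then run bin/hdfs dfsadmin -refreshNodes once more.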
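Watching the decommission state by hand gets tedious when several nodes are involved. The sketch below filters the text of "hdfs dfsadmin -report" for one host. It assumes the report prints each field on its own line, with the "Hostname:" line before the "Decommission Status" line, as in the samples above; the function name decommission_status is hypothetical.

```shell
# decommission_status HOSTNAME
# Reads "hdfs dfsadmin -report" text on stdin and prints the
# Decommission Status value of the given host (nothing if not found).
decommission_status() {
    awk -v host="$1" '
        /^Hostname:/ { current = $2 }               # remember the node being described
        /^Decommission Status/ && current == host {
            sub(/^Decommission Status *: */, "")    # keep only the value
            print
        }'
}

# Typical use on a cluster (not executed here):
#   hdfs dfsadmin -report | decommission_status hadoop-33
```

Polling this value until it reads "Decommissioned" is one way to decide when it is safe to stop the DataNode process.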
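During expansion, a DataNode may fail to report its block information to the NameNode at startup. Normally the NameNode tells the DataNode to report its blocks through the heartbeat response, but this mechanism may break, for example when the NameNode and DataNode versions differ. In that case, searching the DataNode log file finds no "sent block report" entry.
If the NameNode is then restarted, a large number of "missing block" entries appear. Fortunately, HDFS provides a tool that forces a DataNode to report its blocks:
hdfs dfsadmin -triggerBlockReport 192.168.31.26:50020
Here 192.168.31.26 is the DataNode's IP address and 50020 is the DataNode's RPC port. In the long run the DataNode and NameNode versions should be kept the same; otherwise this operation has to be repeated every time, and other problems may exist as well.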
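When several DataNodes are affected, the command above has to be repeated once per node. A minimal sketch follows: it only prints the commands, so they can be reviewed before being run. The function name print_block_report_cmds and the second IP address in the example are hypothetical.

```shell
# print_block_report_cmds RPC_PORT HOST...
# Prints one "hdfs dfsadmin -triggerBlockReport" command per DataNode.
print_block_report_cmds() {
    port="$1"
    shift
    for dn in "$@"; do
        echo "hdfs dfsadmin -triggerBlockReport ${dn}:${port}"
    done
}

print_block_report_cmds 50020 192.168.31.26 192.168.31.27
```

Piping the output to sh would execute the commands one after another on a machine where the hdfs client is configured.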
If automatic failover does not work, check whether another ResourceManager is occupying ZooKeeper.
1) Go to the $HADOOP_HOME/sbin directory.
2) Run start-yarn.sh on both the active and the standby machine to start YARN.
If startup succeeds, running jps on the master node shows ResourceManager:
> jps
24689 NameNode
30156 Jps
28861 ResourceManager
Running jps on a slave node shows NodeManager:
$ jps
14019 NodeManager
23257 DataNode
15115 Jps
To start only the ResourceManager on a particular node:
./yarn-daemon.sh start resourcemanager
For a NodeManager:
./yarn-daemon.sh start nodemanager
List all NodeManagers in the YARN cluster (mind the spaces between arguments; running yarn without arguments prints the usage help):
> yarn node -list
Total Nodes:3
         Node-Id      Node-State  Node-Http-Address  Number-of-Running-Containers
 localhost:45980         RUNNING     localhost:8042                             0
 localhost:47551         RUNNING     localhost:8042                             0
 localhost:58394         RUNNING     localhost:8042                             0
Show the state of a given NodeManager:
> yarn node -status localhost:47551
Node Report :
    Node-Id : localhost:47551
    Rack : /default-rack
    Node-State : RUNNING
    Node-Http-Address : localhost:8042
    Last-Health-Update : Fri 18/Apr/14 01:45:41:555GMT
    Health-Report :
    Containers : 0
    Memory-Used : 0MB
    Memory-Capacity : 8192MB
    CPU-Used : 0 vcores
    CPU-Capacity : 8 vcores
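yarn rmadmin -getServiceState rm1 shows the HA state of rm1, that is, whether it is active or standby.
yarn rmadmin -transitionToStandby rm1 switches rm1 from active to standby.
More yarn commands are described at:
https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YarnCommands.html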
The share/hadoop/mapreduce subdirectory of the installation directory contains ready-made example programs:
hadoop@VM-40-171-sles10-64:~/hadoop> ls share/hadoop/mapreduce
hadoop-mapreduce-client-app-2.8.0.jar         hadoop-mapreduce-client-jobclient-2.8.0-tests.jar
hadoop-mapreduce-client-common-2.8.0.jar      hadoop-mapreduce-client-shuffle-2.8.0.jar
hadoop-mapreduce-client-core-2.8.0.jar        hadoop-mapreduce-examples-2.8.0.jar
hadoop-mapreduce-client-hs-2.8.0.jar          lib
hadoop-mapreduce-client-hs-plugins-2.8.0.jar  lib-examples
hadoop-mapreduce-client-jobclient-2.8.0.jar   sources
Try running one of the examples:
hdfs dfs -put /etc/hosts hdfs:///test/in/
hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.0.jar wordcount hdfs:///test/in/ hdfs:///test/out/
While the job is running, the Java jps command shows that YARN has started processes named YarnChild.
When wordcount finishes, the results are saved in the out directory, in files with names like "part-r-00000". Two points need attention when running this example:
1) The in directory must contain text files, or in may itself be the text file to be counted. It can be a file or directory on HDFS, or a local file or directory.
2) The out directory must not exist. The program creates it automatically and reports an error if it already exists.
The package hadoop-mapreduce-examples-2.8.0.jar contains several example programs. Running it without arguments prints the usage:
> hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.0.jar wordcount
Usage: wordcount <in> <out>
> hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.0.jar
An example program must be given as the first argument.
Valid program names are:
  aggregatewordcount: An Aggregate based map/reduce program that counts the words in the input files.
  aggregatewordhist: An Aggregate based map/reduce program that computes the histogram of the words in the input files.
  bbp: A map/reduce program that uses Bailey-Borwein-Plouffe to compute exact digits of Pi.
  dbcount: An example job that count the pageview counts from a database.
  distbbp: A map/reduce program that uses a BBP-type formula to compute exact bits of Pi.
  grep: A map/reduce program that counts the matches of a regex in the input.
  join: A job that effects a join over sorted, equally partitioned datasets
  multifilewc: A job that counts words from several files.
  pentomino: A map/reduce tile laying program to find solutions to pentomino problems.
  pi: A map/reduce program that estimates Pi using a quasi-Monte Carlo method.
  randomtextwriter: A map/reduce program that writes 10GB of random textual data per node.
  randomwriter: A map/reduce program that writes 10GB of random data per node.
  secondarysort: An example defining a secondary sort to the reduce.
  sort: A map/reduce program that sorts the data written by the random writer.
  sudoku: A sudoku solver.
  teragen: Generate data for the terasort
  terasort: Run the terasort
  teravalidate: Checking results of terasort
  wordcount: A map/reduce program that counts the words in the input files.
  wordmean: A map/reduce program that counts the average length of the words in the input files.
  wordmedian: A map/reduce program that counts the median length of the words in the input files.
  wordstandarddeviation: A map/reduce program that counts the standard deviation of the length of the words in the input files.
Change the log level to DEBUG and print to the console:
export HADOOP_ROOT_LOGGER=DEBUG,console
dfs.permissions.enabled = true
dfs.permissions.superusergroup = supergroup
dfs.cluster.administrators = ACL-for-admins
dfs.namenode.acls.enabled = true
dfs.web.ugi = webuser,webgroup
fs.permissions.umask-mode = 022
hadoop.security.authentication = simple    (authentication mode; either simple or kerberos)
// Build:
// g++ -g -o x x.cpp -L$JAVA_HOME/lib/amd64/jli -ljli -L$JAVA_HOME/jre/lib/amd64/server -ljvm -I$HADOOP_HOME/include $HADOOP_HOME/lib/native/libhdfs.a -lpthread -ldl
#include "hdfs.h"
#include <fcntl.h>   // O_WRONLY, O_CREAT
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv)
{
#if 0
    // Disabled variant: write a small file.
    hdfsFS fs = hdfsConnect("default", 0); // HA mode
    if (NULL == fs)
    {
        fprintf(stderr, "Failed to connect hdfs\n");
        exit(-1);
    }
    const char* writePath = "hdfs://mycluster/tmp/testfile.txt";
    hdfsFile writeFile = hdfsOpenFile(fs, writePath, O_WRONLY | O_CREAT, 0, 0, 0);
    if (!writeFile)
    {
        fprintf(stderr, "Failed to open %s for writing!\n", writePath);
        exit(-1);
    }
    const char* buffer = "Hello, World!\n";
    // As in the original, +1 also writes the terminating NUL byte.
    tSize num_written_bytes = hdfsWrite(fs, writeFile, (void*)buffer, strlen(buffer)+1);
    if (num_written_bytes < 0)
    {
        fprintf(stderr, "Failed to write %s\n", writePath);
        exit(-1);
    }
    if (hdfsFlush(fs, writeFile))
    {
        fprintf(stderr, "Failed to 'flush' %s\n", writePath);
        exit(-1);
    }
    hdfsCloseFile(fs, writeFile);
    hdfsDisconnect(fs);
#else
    // Active variant: list a directory ("/" unless a path is given as argv[1]).
    struct hdfsBuilder* bld = hdfsNewBuilder();
    hdfsBuilderSetNameNode(bld, "default"); // HA mode
    hdfsFS fs = hdfsBuilderConnect(bld);
    if (NULL == fs)
    {
        fprintf(stderr, "Failed to connect hdfs\n");
        exit(-1);
    }
    int num_entries = 0;
    const char* path = (argc < 2) ? "/" : argv[1];
    hdfsFileInfo* entries = hdfsListDirectory(fs, path, &num_entries);
    fprintf(stdout, "num_entries: %d\n", num_entries);
    if (entries != NULL)
    {
        for (int i=0; i<num_entries; ++i)
        {
            fprintf(stdout, "%s\n", entries[i].mName);
        }
        hdfsFreeFileInfo(entries, num_entries);
    }
    hdfsDisconnect(fs);
    // hdfsBuilderConnect() frees bld itself, so hdfsFreeBuilder(bld) must not be called here.
#endif
    return 0;
}
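Before running, CLASSPATH must be set correctly. If it is not, various problems may appear: for example, operations intended for files and directories on HDFS act on local files and directories instead, or errors such as "java.net.UnknownHostException" occur.
To avoid such errors, it is strongly recommended to obtain the correct CLASSPATH value with the command "hadoop classpath --glob".
In addition, LD_LIBRARY_PATH must include the directories of the two libraries libjli.so and libjvm.so, for example:
export LD_LIBRARY_PATH=$JAVA_HOME/lib/amd64/jli:$JAVA_HOME/jre/lib/amd64/server:$LD_LIBRARY_PATH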
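The two environment settings above can be wrapped in one helper that is sourced before the example program is run. This is a sketch under assumptions: the function name setup_libhdfs_env is hypothetical, and the library directories are the same JDK layout used in the export line above, which may differ on other JDK versions.

```shell
# setup_libhdfs_env [JAVA_HOME]
# Exports LD_LIBRARY_PATH for libjli.so and libjvm.so, and CLASSPATH as
# reported by "hadoop classpath --glob" when the hadoop command is available.
setup_libhdfs_env() {
    java_home="${1:-$JAVA_HOME}"
    # Append any existing LD_LIBRARY_PATH only if it is non-empty.
    LD_LIBRARY_PATH="$java_home/lib/amd64/jli:$java_home/jre/lib/amd64/server${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export LD_LIBRARY_PATH
    if command -v hadoop >/dev/null 2>&1; then
        CLASSPATH="$(hadoop classpath --glob)"
        export CLASSPATH
    else
        echo "hadoop not found in PATH; CLASSPATH left unchanged" >&2
        return 1
    fi
}
```

After sourcing the file that defines it, calling setup_libhdfs_env with no argument uses the current JAVA_HOME; a non-zero return value signals that CLASSPATH still has to be set by other means.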
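Hadoop provides viewers for fsImage and edit files, named oiv and oev respectively. Usage examples:
hdfs oiv -i fsimage_0000000000000001953 -p XML -o x.xml
hdfs oev -i edits_0000000000000001054-0000000000000001055 -o x.xml
With the help of these tools, fsImage or edit files can be edited for data repair.
The active and standby NameNodes synchronize data through QJM. The QJM data directory is set by the parameter dfs.journalnode.edits.dir, and the NameNode data directory is set by dfs.namenode.name.dir.
QJM produces log files (edit files) through a consistent replication protocol. An example log file name:
edits_0000000000024641811-0000000000024641854
The log files are identical on all nodes, that is, they have the same MD5 value. The active and standby NameNodes fetch the log files from QJM and store them in their own data directories, so the log files on all QJM nodes and on both NameNodes are identical.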
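The two numbers in an edit file name are the first and last transaction IDs it contains. The sketch below extracts them and counts the transactions, assuming the range is inclusive on both ends (an assumption consistent with the finalized segment names shown in the logs of this document). The function name edits_txn_count is hypothetical.

```shell
# edits_txn_count FILE
# Prints the number of transactions covered by a finalized edit file
# named edits_<first txid>-<last txid>.
edits_txn_count() {
    name="${1##*/}"                 # drop any directory part
    range="${name#edits_}"          # "<first>-<last>"
    # Strip leading zeros so the shell does not parse the IDs as octal.
    first=$(printf '%s' "${range%%-*}" | sed 's/^0*//')
    last=$(printf '%s' "${range##*-}" | sed 's/^0*//')
    echo $(( ${last:-0} - ${first:-0} + 1 ))
}

edits_txn_count edits_0000000000024641811-0000000000024641854
```

Comparing such counts, or simply the MD5 sums of the files, across JournalNodes and NameNodes is a quick consistency check.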
The standby NameNode periodically merges the log files into an fsImage file and synchronizes the fsImage to the active NameNode, so under normal conditions the fsImage files on the active and standby NameNodes are identical as well. If they differ, either the data on the two NameNodes has become inconsistent, or the standby NameNode has just generated a new fsImage that has not yet been synchronized to the active NameNode.
By default the standby NameNode merges the edit files into a new fsImage file once an hour and keeps only the two most recent fsImage files:
The following shows the merge of edit files into a new fsImage starting (after 3600 seconds, that is 1 hour; the interval is set by dfs.namenode.checkpoint.period in hdfs-site.xml):
2017-04-21 15:35:44,994 INFO org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer: Triggering checkpoint because it has been 3600 seconds since the last checkpoint, which exceeds the configured interval 3600
2017-04-21 15:35:44,994 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: Save namespace ...
2017-04-21 15:35:45,022 INFO org.apache.hadoop.hdfs.server.namenode.NNStorageRetentionManager: Going to retain 2 images with txid >= 24641535
The following shows the fsImage before the previous one, fsimage_0000000000024638036, being deleted:
2017-04-21 15:35:45,022 INFO org.apache.hadoop.hdfs.server.namenode.NNStorageRetentionManager: Purging old image FSImageFile(file=/data5/namenode/current/fsimage_0000000000024638036, cpktTxId=0000000000024638036)
The following shows that uploading the new fsImage file to the active NameNode took 0.142 seconds:
2017-04-21 15:35:45,239 INFO org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Uploaded image with txid 24647473 to namenode at http://hadoop-030:50070 in 0.142 seconds
2017-04-21 15:36:38,528 INFO org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Triggering log roll on remote NameNode hadoop-030/10.143.136.207:8020
2017-04-21 15:36:38,587 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: Reading org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@5854eec9 expecting start txid #24647474
The standby NameNode uploads the newest fsImage to the active NameNode:
2017-04-21 15:35:45,119 INFO org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took 0.01s at 56923.08 KB/s
The following shows that the newest fsImage file has been downloaded; its final name will be fsimage_0000000000024647473:
2017-04-21 15:35:45,119 INFO org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Downloaded file fsimage.ckpt_0000000000024647473 size 758002 bytes.
The following shows that 2 fsImage files are retained:
2017-04-21 15:35:45,126 INFO org.apache.hadoop.hdfs.server.namenode.NNStorageRetentionManager: Going to retain 2 images with txid >= 24641535
The following shows the fsImage before the previous one, fsimage_0000000000024638036, being deleted:
2017-04-21 15:35:45,126 INFO org.apache.hadoop.hdfs.server.namenode.NNStorageRetentionManager: Purging old image FSImageFile(file=/data5/namenode/current/fsimage_0000000000024638036, cpktTxId=0000000000024638036)
2017-04-21 15:35:45,236 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Purging remote journals older than txid 23641536
2017-04-21 15:35:45,236 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Purging logs older than 23641536
The following shows older edit files being deleted:
2017-04-21 15:35:45,244 INFO org.apache.hadoop.hdfs.server.namenode.NNStorageRetentionManager: Purging old edit log EditLogFile(file=/data5/namenode/current/edits_0000000000023641041-0000000000023641120,first=0000000000023641041,last=0000000000023641120,inProgress=false,hasCorruptHeader=false)
1) ConnectException when running "hdfs dfs -ls"
A possible cause is that the specified port 9000 is wrong. The port is set by the property "dfs.namenode.rpc-address" in hdfs-site.xml and is the NameNode's RPC service port.
After a file is uploaded, it is stored under the DataNode's data directory (specified by the property "dfs.datanode.data.dir" in the DataNode's hdfs-site.xml), for example:
$HADOOP_HOME/data/current/BP-139798373-172.25.40.171-1397735615751/current/finalized/blk_1073741825
The "blk" in the file name stands for block. By default blk_1073741825 is one complete block of the file; Hadoop does no additional processing on it.
hdfs dfs -ls hdfs://172.25.40.171:9000
14/04/17 12:04:02 WARN conf.Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
14/04/17 12:04:02 WARN conf.Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
14/04/17 12:04:02 WARN conf.Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
14/04/17 12:04:02 WARN conf.Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
14/04/17 12:04:02 WARN conf.Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
14/04/17 12:04:02 WARN conf.Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /data/hadoop/hadoop-2.8.0/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
14/04/17 12:04:02 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/04/17 12:04:03 WARN conf.Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
14/04/17 12:04:03 WARN conf.Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
ls: Call From VM-40-171-sles10-64/172.25.40.171 to VM-40-171-sles10-64:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
2) Initialization failed for Block pool
A possible cause is that the DataNode data directories were not emptied before the NameNode was formatted.
3) Incompatible clusterIDs
The "Incompatible clusterIDs" error occurs because the data directory on the DataNode was not emptied before running "hdfs namenode -format".
Some articles and posts online say the directory to empty is tmp. That advice is not wrong in itself, but in Hadoop 2.8.0 it is the data directory, as the path "/data/hadoop/hadoop-2.8.0/data" in the log already shows. So do not follow online fixes blindly; look closely at the actual messages when a problem occurs.
As the description above suggests, the fix is to empty the data directory on every DataNode, taking care not to delete the data directory itself. The data directory is specified by the property "dfs.datanode.data.dir" in hdfs-site.xml.
2014-04-17 19:30:33,075 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /data/hadoop/hadoop-2.8.0/data/in_use.lock acquired by nodename 28326@localhost
2014-04-17 19:30:33,078 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool <registering> (Datanode Uuid unassigned) service to /172.25.40.171:9001
java.io.IOException: Incompatible clusterIDs in /data/hadoop/hadoop-2.8.0/data: namenode clusterID = CID-50401d89-a33e-47bf-9d14-914d8f1c4862; datanode clusterID = CID-153d6fcb-d037-4156-b63a-10d6be224091
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:472)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:225)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:249)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:929)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:900)
        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:274)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:220)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:815)
        at java.lang.Thread.run(Thread.java:744)
2014-04-17 19:30:33,081 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool <registering> (Datanode Uuid unassigned) service to /172.25.40.171:9001
2014-04-17 19:30:33,184 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool ID needed, but service not yet registered with NN
java.lang.Exception: trace
        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.getBlockPoolId(BPOfferService.java:143)
        at org.apache.hadoop.hdfs.server.datanode.BlockPoolManager.remove(BlockPoolManager.java:91)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdownBlockPool(DataNode.java:859)
        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.shutdownActor(BPOfferService.java:350)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.cleanUp(BPServiceActor.java:619)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:837)
        at java.lang.Thread.run(Thread.java:744)
2014-04-17 19:30:33,184 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool <registering> (Datanode Uuid unassigned)
2014-04-17 19:30:33,184 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool ID needed, but service not yet registered with NN
java.lang.Exception: trace
        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.getBlockPoolId(BPOfferService.java:143)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdownBlockPool(DataNode.java:861)
        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.shutdownActor(BPOfferService.java:350)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.cleanUp(BPServiceActor.java:619)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:837)
        at java.lang.Thread.run(Thread.java:744)
2014-04-17 19:30:35,185 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2014-04-17 19:30:35,187 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 0
2014-04-17 19:30:35,189 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at localhost/127.0.0.1
************************************************************/
4) Inconsistent checkpoint fields
The "Inconsistent checkpoint fields" error in the SecondaryNameNode may occur because "hadoop.tmp.dir" in core-site.xml on the SecondaryNameNode is not set properly.
2014-04-17 11:42:18,189 INFO org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Log Size Trigger :1000000 txns
2014-04-17 11:43:18,365 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in doCheckpoint
java.io.IOException: Inconsistent checkpoint fields.
LV = -56 namespaceID = 1384221685 cTime = 0 ; clusterId = CID-319b9698-c88d-4fe2-8cb2-c4f440f690d4 ; blockpoolId = BP-1627258458-172.25.40.171-1397735061985.
Expecting respectively: -56; 476845826; 0; CID-50401d89-a33e-47bf-9d14-914d8f1c4862; BP-2131387753-172.25.40.171-1397730036484.
        at org.apache.hadoop.hdfs.server.namenode.CheckpointSignature.validateStorageInfo(CheckpointSignature.java:135)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:518)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:383)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$1.run(SecondaryNameNode.java:349)
        at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:415)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:345)
        at java.lang.Thread.run(Thread.java:744)
In addition, set "dfs.datanode.data.dir" in hdfs-site.xml on the SecondaryNameNode to a suitable value. An example hadoop.tmp.dir setting:
<property>
  <name>hadoop.tmp.dir</name>
  <value>/data/hadoop/current/tmp</value>
  <description>A base for other temporary directories.</description>
</property>
5) fs.defaultFS is file:///
In core-site.xml, this error is reported when only fs.defaultFS is set while fs.default.name keeps its default value file:///. The fix is to set both to the same value.
6) a shared edits dir must not be specified if HA is not enabled
This error may occur because dfs.nameservices or dfs.ha.namenodes.mycluster is not configured in hdfs-site.xml.
7) /tmp/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
Simply create the directory named in the log message.
8) The auxService:mapreduce_shuffle does not exist
The cause is that "yarn.nodemanager.aux-services" is not configured in yarn-site.xml. Set its value to mapreduce_shuffle and restart YARN to fix the problem. Every YARN node needs this change, ResourceManager and NodeManager alike; if a NodeManager is left unchanged, the error persists.
9) org.apache.hadoop.ipc.Client: Retrying connect to server
This may happen because the yarn-site.xml on the NodeManager is inconsistent with the one on the ResourceManager, for example when yarn.resourcemanager.ha.rm-ids is not configured on the NodeManager.
10) mapreduce.Job: Running job: job_1445931397013_0001
When a MapReduce job is submitted, Hadoop hangs at "mapreduce.Job: Running job: job_1445931397013_0001".
A possible cause is that the YARN NodeManager is not running, which can be confirmed with the JDK's jps.
It may also happen because the yarn-site.xml on the NodeManager is inconsistent with the one on the ResourceManager, for example when yarn.resourcemanager.ha.rm-ids is not configured on the NodeManager.
11) Could not format one or more JournalNodes
Running "./hdfs namenode -format" reports "Could not format one or more JournalNodes".
dfs.namenode.shared.edits.dir in hdfs-site.xml may be misconfigured, for example with a duplicated entry:
<value>qjournal://hadoop-168-254:8485;hadoop-168-254:8485;hadoop-168-253:8485;hadoop-168-252:8485;hadoop-168-251:8485/mycluster</value>
After fixing it, restart the JournalNodes; that may resolve the problem.
12) org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Already in standby state
This error may be caused by a wrong yarn.resourcemanager.webapp.address setting in yarn-site.xml, for example yarn.resourcemanager.webapp.address.rm1 configured twice when the two entries should be yarn.resourcemanager.webapp.address.rm1 and yarn.resourcemanager.webapp.address.rm2.
13) No valid image files found
If this is the standby NameNode, run "hdfs namenode -bootstrapStandby" and then start it.
2015-12-01 15:24:39,535 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
java.io.FileNotFoundException: No valid image files found
at org.apache.hadoop.hdfs.server.namenode.FSImageTransactionalStorageInspector.getLatestImages(FSImageTransactionalStorageInspector.java:165)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:623)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:294)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:975)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:681)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:584)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:644)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:811)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:795)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1488)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1554)
2015-12-01 15:24:39,536 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2015-12-01 15:24:39,539 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
14) xceivercount 4097 exceeds the limit of concurrent xcievers 4096
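The cause of this error is that the value 4096 of "dfs.datanode.max.xcievers" in hdfs-site.xml is too small and needs to be increased (in Hadoop 2.x this name is deprecated in favor of dfs.datanode.max.transfer.threads). The error causes HBase to report "notservingregionexception".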
16/04/06 14:30:34 ERROR namenode.NameNode: Failed to start namenode.
15) java.lang.IllegalArgumentException: Unable to construct journal, qjournal://hadoop-030:8485;hadoop-031:8454;hadoop-032
When "hdfs namenode -format" fails with the error above, dfs.namenode.shared.edits.dir in hdfs-site.xml is misconfigured: the hadoop-032 entry is missing its ":8454" port part.
16) Bad URI 'qjournal://hadoop-030:8485;hadoop-031:8454;hadoop-032:8454': must identify journal in path component
The cause is that the path in "dfs.namenode.shared.edits.dir" in hdfs-site.xml lacks the cluster name.
17) 16/04/06 14:48:19 INFO ipc.Client: Retrying connect to server: hadoop-032/10.143.136.211:8454. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
Check the value of "dfs.namenode.shared.edits.dir" in hdfs-site.xml. The default JournalNode port is 8485, not 8454, so check for a typo. The JournalNode port is determined by dfs.journalnode.rpc-address in hdfs-site.xml.
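Several of the errors above (a duplicated JournalNode, a missing port, a missing cluster name) come from typos in the dfs.namenode.shared.edits.dir value. A small checker can catch them before "hdfs namenode -format" is run. This is only a sketch: the function name check_qjournal_uri is hypothetical, and it checks the URI's shape, not whether the hosts are reachable or the port is the right one.

```shell
# check_qjournal_uri URI
# Prints "ok", or the first problem found in a qjournal:// URI.
check_qjournal_uri() {
    uri="$1"
    case "$uri" in
        qjournal://*) rest="${uri#qjournal://}" ;;
        *) echo "not a qjournal:// URI"; return 1 ;;
    esac
    case "$rest" in
        */?*) ;;                                   # has "/<journal id>"
        *) echo "missing journal ID in path"; return 1 ;;
    esac
    hosts="${rest%%/*}"
    [ -n "$hosts" ] || { echo "no JournalNodes listed"; return 1; }
    old_ifs=$IFS; IFS=';'
    set -- $hosts                                  # split host:port entries on ';'
    IFS=$old_ifs
    seen=" "
    for entry in "$@"; do
        case "$entry" in
            *:*) port="${entry##*:}" ;;
            *) echo "missing port: $entry"; return 1 ;;
        esac
        case "$port" in
            ''|*[!0-9]*) echo "bad port: $entry"; return 1 ;;
        esac
        case "$seen" in
            *" $entry "*) echo "duplicate entry: $entry"; return 1 ;;
        esac
        seen="$seen$entry "
    done
    echo "ok"
}

check_qjournal_uri 'qjournal://hadoop-030:8485;hadoop-031:8485;hadoop-032:8485/test'
```

The checker stops at the first problem, so a value with several mistakes needs to be re-checked after each fix.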
18) Exception in thread "main" org.apache.hadoop.HadoopIllegalArgumentException: Could not get the namenode ID of this node. You may run zkfc on the node other than namenode.
This error appears when "hdfs zkfc -formatZK" is run before "hdfs namenode -format" has been run. The NameNode ID is generated by "hdfs namenode -format".
19) 2016-04-06 17:08:07,690 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory [DISK]file:/data3/datanode/data/ has already been used.
The DataNode is started as a non-root user but fails to start, and its log file contains the following errors:
2016-04-06 17:08:07,707 INFO org.apache.hadoop.hdfs.server.common.Storage: Analyzing storage directories for bpid BP-418073539-10.143.136.207-1459927327462
2016-04-06 17:08:07,707 WARN org.apache.hadoop.hdfs.server.common.Storage: Failed to analyze storage directories for block pool BP-418073539-10.143.136.207-1459927327462
java.io.IOException: BlockPoolSliceStorage.recoverTransitionRead: attempt to load an used block storage: /data3/datanode/data/current/BP-418073539-10.143.136.207-1459927327462
Looking further, the following error is also found:
Invalid dfs.datanode.data.dir /data3/datanode/data:
EPERM: Operation not permitted
Checking the permissions of /data3/datanode/data with "ls -l" shows that its owner is root, because the DataNode was previously started as root. Changing the owner back fixes the problem.
20) 2016-04-06 18:00:26,939 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Problem connecting to server: hadoop-031/10.143.136.208:8020
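The DataNode log keeps recording entries like the following because the DataNode is trying to connect to 10.143.136.208 as a NameNode, but that NameNode has not been started and is not the active one. This does not mean the DataNode failed to start: a DataNode maintains heartbeats with both the active and the standby NameNode, so these entries are normal while the standby NameNode is down.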
2016-04-06 18:00:32,940 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hadoop-031/10.143.136.208:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-04-06 17:55:44,555 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Namenode Block pool BP-418073539-10.143.136.207-1459927327462 (Datanode Uuid 2d115d45-fd48-4e86-97b1-e74a1f87e1ca) service to hadoop-030/10.143.136.207:8020 trying to claim ACTIVE state with txid=1
"trying to claim ACTIVE state" comes from updateActorStatesFromHeartbeat() in hadoop/hdfs/server/datanode/BPOfferService.java.
2016-04-06 17:55:49,893 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hadoop-031/10.143.136.208:8020. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
"Retrying connect to server" comes from handleConnectionTimeout() and handleConnectionFailure() in hadoop/ipc/Client.java.
21) ERROR cluster.YarnClientSchedulerBackend: Yarn application has already exited with state FINISHED!
If you encounter this error, check the NodeManager log. If it contains a message like the following:
WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Container [pid=26665,containerID=container_1461657380500_0020_02_000001] is running beyond virtual memory limits. Current usage: 345.0 MB of 1 GB physical memory used; 2.2 GB of 2.1 GB virtual memory used. Killing container.
then the value of yarn.nodemanager.vmem-pmem-ratio in yarn-site.xml needs to be increased. The default value of this setting is 2.1.
16/10/13 10:23:19 ERROR client.TransportClient: Failed to send RPC 7614640087981520382 to /10.143.136.231:34800: java.nio.channels.ClosedChannelException
java.nio.channels.ClosedChannelException
16/10/13 10:23:19 ERROR cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Sending RequestExecutors(0,0,Map()) to AM was unsuccessful
java.io.IOException: Failed to send RPC 7614640087981520382 to /10.143.136.231:34800: java.nio.channels.ClosedChannelException
at org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:249)
at org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:233)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
at io.netty.util.concurrent.DefaultPromise$LateListeners.run(DefaultPromise.java:845)
at io.netty.util.concurrent.DefaultPromise$LateListenerNotifier.run(DefaultPromise.java:873)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:745)
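The numbers in the NodeManager message for this error follow from simple arithmetic: the virtual memory limit of a container is its physical memory allocation multiplied by yarn.nodemanager.vmem-pmem-ratio. The Python sketch below reproduces that calculation; the function names are hypothetical and the ratio of 3.0 in the usage note is only an example value.

```python
def virtual_memory_limit_gb(physical_gb, vmem_pmem_ratio=2.1):
    """Virtual memory limit for a container with physical_gb of physical memory."""
    return physical_gb * vmem_pmem_ratio


def exceeds_virtual_limit(virtual_used_gb, physical_gb, vmem_pmem_ratio=2.1):
    """True if the container's virtual memory usage is above its limit."""
    return virtual_used_gb > virtual_memory_limit_gb(physical_gb, vmem_pmem_ratio)
```

For the container in the log, 1 GB of physical memory with the default ratio of 2.1 gives a 2.1 GB virtual limit, so 2.2 GB of usage is over the limit and the container is killed. With a ratio of 3.0, for instance, the same usage would stay within a 3 GB limit.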
22) java.net.SocketException: Unresolved address
A likely cause is starting the NameNode on a machine that is not configured as a NameNode:
java.net.SocketException: Call From cluster to null:0 failed on socket exception: java.net.SocketException: Unresolved address
23) should be specified as a URI in configuration files
Add the prefix "file://" to the paths configured for dfs.namenode.name.dir, dfs.journalnode.edits.dir and dfs.datanode.data.dir; otherwise a warning like the following is logged:
common.Util: Path /home/namenode/data should be specified as a URI in configuration files. Please update hdfs configuration.
For example:
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///home/namenode/data</value>
</property>
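When many paths have to be converted, the prefixing rule can be applied mechanically. The Python sketch below is a minimal illustration with a hypothetical helper name, to_file_uri; it assumes the configured paths are absolute local paths.

```python
from urllib.parse import quote


def to_file_uri(path):
    """Return path as a file:// URI; values that already are file URIs pass through."""
    if path.startswith("file://"):
        return path
    if not path.startswith("/"):
        raise ValueError("expected an absolute path: " + path)
    # quote() percent-encodes characters such as spaces but keeps "/" as is.
    return "file://" + quote(path)
```

For the path in the warning above, to_file_uri("/home/namenode/data") yields "file:///home/namenode/data", which matches the value in the example property.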
24) Failed to place enough replicas
If every directory in a DataNode's dfs.datanode.data.dir is configured with the storage type SSD, running "hdfs dfs -put /etc/hosts hdfs:///tmp/" reports the following errors:
2017-05-04 16:08:22,545 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 3 to reach 3 (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) For more information, please enable DEBUG log level on org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy and org.apache.hadoop.net.NetworkTopology
2017-05-04 16:08:22,545 WARN org.apache.hadoop.hdfs.protocol.BlockStoragePolicy: Failed to place enough replicas: expected size is 3 but only 0 storage types can be selected (replication=3, selected=[], unavailable=[DISK], removed=[DISK, DISK, DISK], policy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
2017-05-04 16:08:22,545 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 3 to reach 3 (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) All required storage types are unavailable: unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
2017-05-04 16:08:22,545 INFO org.apache.hadoop.ipc.Server: IPC Server handler 37 on 8020, call Call#5 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 10.208.5.220:40701
java.io.IOException: File /tmp/in/hosts._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 5 datanode(s) running and no node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1733)
at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:265)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2496)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:828)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:506)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:447)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:845)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:788)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2455)
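The log for this error shows that the HOT storage policy accepts only DISK for new blocks (storageTypes=[DISK], creationFallbacks=[]), so a configuration that offers nothing but SSD leaves no place to put them. The Python sketch below checks a dfs.datanode.data.dir value for that situation. It is a simplified illustration: the function names and the paths in the usage note are hypothetical, it looks at a single configuration value rather than at the whole cluster, and it assumes the "[TYPE]" prefix form for storage types, with untagged entries counting as DISK.

```python
import re

# An entry may start with a storage type tag such as "[SSD]".
_STORAGE_TAG = re.compile(r"^\[(\w+)\]")


def storage_types(data_dir_value):
    """Return the storage type of each entry in a comma-separated data dir value."""
    types = []
    for entry in data_dir_value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        match = _STORAGE_TAG.match(entry)
        # Entries without a tag are treated as DISK.
        types.append(match.group(1).upper() if match else "DISK")
    return types


def offers_disk_storage(data_dir_value):
    """True if at least one configured directory is of type DISK."""
    return "DISK" in storage_types(data_dir_value)
```

For a value such as "[SSD]file:///data1/datanode/data,[SSD]file:///data2/datanode/data" the check returns False, which corresponds to the failure above. Leaving at least one directory untagged, or tagged as DISK, makes it return True.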