Installing Hadoop 2.2.0 is quite different from installing Hadoop 0.20.X.
1. Prepare three machines: test01 (master), test02 (slave), test03 (slave). Append test01's public key to ~/.ssh/authorized_keys on test02 and test03 so the master can reach the slaves over passwordless SSH.
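A minimal sketch of the key setup, run on test01 (assuming the root account is used on all three machines; ssh-copy-id appends the key to authorized_keys for you):
# Generate a key pair on test01 (skip if ~/.ssh/id_rsa already exists)
ssh-keygen -t rsa
# Push the public key to each slave
ssh-copy-id root@test02
ssh-copy-id root@test03
# Verify passwordless login works
ssh root@test02 hostname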
2. Download hadoop-2.2.0.tar.gz.
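For example (the mirror URL and install directory below are assumptions; adjust to your setup):
# Download and unpack the release
wget http://archive.apache.org/dist/hadoop/common/hadoop-2.2.0/hadoop-2.2.0.tar.gz
tar -xzf hadoop-2.2.0.tar.gz -C /usr/local
export HADOOP_HOME=/usr/local/hadoop-2.2.0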
3. Do all the configuration on test01 first; once it is done, copy the result to the slaves.
4. The configuration files all live under $HADOOP_HOME/etc/hadoop.
a. hadoop-env.sh:
Replace export JAVA_HOME=${JAVA_HOME} with:
export JAVA_HOME=<path to your Java installation>
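If you are unsure where Java is installed, something like this usually works (exact paths vary by machine and distribution):
# Resolve the real java binary; JAVA_HOME is the directory above its bin/
# (drop a trailing /jre as well, if present)
readlink -f "$(which java)"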
b. core-site.xml:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://test01:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/root/hadoop/data/</value>
  </property>
</configuration>
Note:
test01 is the master's hostname; make sure it resolves to the master's IP (e.g. via /etc/hosts).
Create /root/hadoop/data in advance.
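For example (run on every node, since they all share the same hadoop.tmp.dir; assumes root and the passwordless SSH set up in step 1):
mkdir -p /root/hadoop/data
ssh root@test02 'mkdir -p /root/hadoop/data'
ssh root@test03 'mkdir -p /root/hadoop/data'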
c. mapred-site.xml (this file does not exist by default in 2.2.0; copy it from mapred-site.xml.template):
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>test01:9001</value>
  </property>
</configuration>
d. hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
    <description>Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified at create time.</description>
  </property>
</configuration>
e. slaves (one hostname per line):
test02
test03
5. Once the configuration is done, copy the hadoop directory from test01 to test02 and test03.
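For example, assuming the /usr/local/hadoop-2.2.0 install path from step 2:
scp -r /usr/local/hadoop-2.2.0 root@test02:/usr/local/
scp -r /usr/local/hadoop-2.2.0 root@test03:/usr/local/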
6. Format the NameNode
On test01, run $HADOOP_HOME/bin/hadoop namenode -format (in 2.x this form is deprecated in favor of $HADOOP_HOME/bin/hdfs namenode -format).
If it succeeds, you will find a success message like the following near the end of the log output:
common.Storage: Storage directory /home/hduser/hadoop/tmp/hadoop-hduser/dfs/name has been successfully formatted.
Note: if format is run a second time (in my case, another run on test02), the DataNodes fail to start with the error
Incompatible namespaceIDs in /tmp/hadoop-root/dfs/data
This happens because each format generates a fresh namespaceID while the DataNodes keep the old one in their data directories. The fix is to run rm -rf /tmp/hadoop-root/* on every machine in the cluster and then format again.
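A sketch of that cleanup, run from test01 (assuming the root account and the default /tmp paths from the error above):
# Wipe the stale HDFS state on every node
rm -rf /tmp/hadoop-root/*
ssh root@test02 'rm -rf /tmp/hadoop-root/*'
ssh root@test03 'rm -rf /tmp/hadoop-root/*'
# Then format again on test01
$HADOOP_HOME/bin/hdfs namenode -format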
7. Start Hadoop
$HADOOP_HOME/sbin/start-dfs.sh
$HADOOP_HOME/sbin/start-yarn.sh
Note: start-all.sh is deprecated in this version.
After a successful start:
test01 jps:
11047 ResourceManager
10906 SecondaryNameNode
11966 Jps
10724 NameNode
test02/test03 jps:
6406 NodeManager
6288 DataNode
6508 Jps
Use $HADOOP_HOME/bin/hdfs dfsadmin -report to check the HDFS status:
Configured Capacity: 52840488960 (49.21 GB)
Present Capacity: 45652713472 (42.52 GB)
DFS Remaining: 45652664320 (42.52 GB)
DFS Used: 49152 (48 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 2 (2 total, 0 dead)

Live datanodes:
Name: 192.168.2.106:50010 (test02)
Hostname: test02
Decommission Status : Normal
Configured Capacity: 26420244480 (24.61 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 3575242752 (3.33 GB)
DFS Remaining: 22844977152 (21.28 GB)
DFS Used%: 0.00%
DFS Remaining%: 86.47%
Last contact: Sat Nov 16 12:39:27 CST 2013

Name: 192.168.2.99:50010 (test03)
Hostname: test03
Decommission Status : Normal
Configured Capacity: 26420244480 (24.61 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 3612532736 (3.36 GB)
DFS Remaining: 22807687168 (21.24 GB)
DFS Used%: 0.00%
DFS Remaining%: 86.33%
Last contact: Sat Nov 16 12:39:27 CST 2013
8. Shutdown commands
$HADOOP_HOME/sbin/stop-yarn.sh
$HADOOP_HOME/sbin/stop-dfs.sh
9. The dfs commands have changed as well: for example, hadoop dfs -ls /tmp becomes $HADOOP_HOME/bin/hdfs dfs -ls /tmp.
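A few more examples in the new form (the paths and file names here are just for illustration):
# Create a directory and upload a local file
$HADOOP_HOME/bin/hdfs dfs -mkdir /tmp/demo
$HADOOP_HOME/bin/hdfs dfs -put ./localfile.txt /tmp/demo/
# List and read it back
$HADOOP_HOME/bin/hdfs dfs -ls /tmp/demo
$HADOOP_HOME/bin/hdfs dfs -cat /tmp/demo/localfile.txt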
10. View the admin web page: http://test01:50070/ (the NameNode UI; the YARN ResourceManager UI is at http://test01:8088/).
Problems encountered:
WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
This warning is harmless: the native libraries bundled with the 2.2.0 release are built for 32-bit, so on a 64-bit system Hadoop falls back to the pure-Java implementations.
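A quick way to confirm (the rebuild step is the standard Hadoop source build and assumes Maven plus a native toolchain are installed):
# Check the architecture of the bundled native library
file $HADOOP_HOME/lib/native/libhadoop.so.1.0.0
# If it reports 32-bit on a 64-bit machine, either ignore the warning
# or rebuild the native libraries from the Hadoop source release:
# mvn package -Pdist,native -DskipTests -Dtar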