Five nodes, each with the following configuration:
dual core x86_64, 4GB RAM, 10GB Disk
CentOS 6.4 x86_64
OpenJDK 1.7.0_9
hadoop-2.1.0-beta
The nodes are connected to one another over gigabit Ethernet.
On every machine, Hadoop is installed and started under the user name xc.
The hostname, installed services, and IP of each node are:
hostname | services | ip |
h1-1 | NN | 172.16.0.198 |
h1-2 | RM + SNN | 172.16.0.199 |
h1-3 | NM + DN | 172.16.0.200 |
h1-4 | NM + DN | 172.16.0.201 |
h1-5 | NM + DN | 172.16.0.202 |
Append the following to /etc/hosts on every node:
172.16.0.198 h1-1
172.16.0.199 h1-2
172.16.0.200 h1-3
172.16.0.201 h1-4
172.16.0.202 h1-5
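As a quick sanity check (a small helper of my own, not part of the original setup), the loop below verifies that a hosts file contains all five entries. It writes a demo copy so it can run anywhere; on a real node, point HOSTS at /etc/hosts instead:

```shell
# Check that a hosts file covers all five cluster nodes.
# Uses a demo copy by default; on a real node run with HOSTS=/etc/hosts.
HOSTS=${HOSTS:-/tmp/hosts.demo}
printf '%s\n' \
  '172.16.0.198 h1-1' \
  '172.16.0.199 h1-2' \
  '172.16.0.200 h1-3' \
  '172.16.0.201 h1-4' \
  '172.16.0.202 h1-5' > /tmp/hosts.demo
for h in h1-1 h1-2 h1-3 h1-4 h1-5; do
  # -w: match the hostname as a whole word, so h1-1 does not match h1-10
  grep -qw "$h" "$HOSTS" && echo "ok: $h" || echo "missing: $h"
done
```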
etc/hadoop/core-site.xml:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://h1-1:9000</value>
  </property>
</configuration>
etc/hadoop/hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/xc/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/xc/dfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
etc/hadoop/mapred-site.xml:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>h1-2:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>h1-2:19888</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.intermediate-done-dir</name>
    <value>/mr-history/tmp</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.done-dir</name>
    <value>/mr-history/done</value>
  </property>
</configuration>
Although the jobhistory web port is configured here, it does not respond after Hadoop is started, and telnet to the two ports above gets no response either. I have not figured out why yet, but it does not affect HDFS or running MapReduce jobs. (One likely cause: start-all.sh does not start the JobHistory server; it has to be launched separately on h1-2 with $ ./sbin/mr-jobhistory-daemon.sh start historyserver)
etc/hadoop/yarn-site.xml:
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce.shuffle</value>
  </property>
  <property>
    <description>The address of the applications manager interface in the RM.</description>
    <name>yarn.resourcemanager.address</name>
    <value>h1-2:18040</value>
  </property>
  <property>
    <description>The address of the scheduler interface.</description>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>h1-2:18030</value>
  </property>
  <property>
    <description>The address of the RM web application.</description>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>h1-2:18088</value>
  </property>
  <property>
    <description>The address of the resource tracker interface.</description>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>h1-2:8025</value>
  </property>
</configuration>
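The same configuration files must be present on every node. A sketch of pushing the finished config directory out with scp (the path below is illustrative; it assumes the hadoop tree sits at the same path under user xc on every node):

```shell
# Dry run: print the copy commands; remove `echo` to actually copy.
# Assumes hadoop is unpacked at the same relative path on every node.
CONF_DIR=hadoop-2.1.0-beta/etc/hadoop
for h in h1-2 h1-3 h1-4 h1-5; do
  echo scp -r "$CONF_DIR" "xc@$h:hadoop-2.1.0-beta/etc/"
done
```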
etc/hadoop/slaves (one slave hostname per line):
h1-3
h1-4
h1-5
$ cd hadoop_home_dir
$ ./bin/hdfs namenode -format
This formats HDFS. Formatting creates the name directory (/home/xc/dfs/name) on the NameNode; the data directories (/home/xc/dfs/data) on the slave nodes are created when the DataNodes first start.
$ cd hadoop_home_dir
$ ./sbin/start-all.sh
(In Hadoop 2.x start-all.sh is deprecated; it simply calls start-dfs.sh and start-yarn.sh, which can also be run separately.)
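To confirm the daemons came up, jps on each node should list the expected processes: NameNode on h1-1, ResourceManager and SecondaryNameNode on h1-2, DataNode and NodeManager on h1-3 through h1-5. A loop like the following checks all nodes over ssh (a dry-run sketch; remove `echo` to actually run it):

```shell
# Dry run: print the per-node check commands; drop `echo` to execute them.
# Assumes passwordless ssh for user xc between the nodes.
for h in h1-1 h1-2 h1-3 h1-4 h1-5; do
  echo ssh "xc@$h" jps
done
```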
$ cd hadoop_home_dir
$ ./bin/hdfs dfs -put <local-input> <hdfs-input-dir>
$ ./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.1.0-beta.jar wordcount <hdfs-input-dir> <hdfs-output-dir>
Upload the input files to HDFS, then run the wordcount example. wordcount takes an input directory and an output directory; the output directory must not already exist.