1. In CentOS 6, portmap has been replaced by rpcbind.
Use NFS to store the shared files; every node mounts the share to access them.
2. Change the user's gid and uid: usermod -g 502 -u 502 hadoop
Keep the uid and gid of the "hadoop" user identical on all nodes.
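A minimal sketch for checking that the ids really match on every node. The helper names all_same and remote_ids, and the host names master and slave1, are hypothetical; only ssh and id are assumed to be available.

```shell
# Hypothetical helper: succeed only if every argument (a "uid:gid" pair)
# is identical to the first one.
all_same() {
    first=$1
    for pair in "$@"; do
        [ "$pair" = "$first" ] || return 1
    done
    return 0
}

# Hypothetical helper: print "uid:gid" of user $2 as seen on remote node $1.
remote_ids() {
    ssh "$1" "echo \$(id -u $2):\$(id -g $2)"
}

# Example (not run here; host names are placeholders):
#   all_same "$(remote_ids master hadoop)" "$(remote_ids slave1 hadoop)" \
#       && echo "uid/gid match"
```

Comparing plain "uid:gid" strings keeps the check independent of how the ids were collected.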
3. Difference between masters and slaves:
Typically one machine in the cluster is designated as the NameNode and another machine as the JobTracker, exclusively. These are the masters. The rest of the machines in the cluster act as both DataNode and TaskTracker. These are the slaves.
The masters file lists the machines that run the backup (secondary) NameNode; it may be empty.
4.
5. Set up passwordless ssh login from the masters to every node: add the master's id_rsa.pub to authorized_keys on each node.
After this, the master can reach all slave nodes without a password.
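A sketch of the key installation step, assuming OpenSSH. The helper add_key is hypothetical; it does by hand what ssh-copy-id normally does, and skips a key that is already present.

```shell
# Hypothetical helper: append the public key in file $1 to the
# authorized_keys file $2 unless the exact same line is already there.
add_key() {
    pub=$1
    auth=$2
    mkdir -p "$(dirname "$auth")"
    chmod 700 "$(dirname "$auth")"    # sshd rejects group/world-writable dirs
    touch "$auth"
    grep -qxF -e "$(cat "$pub")" "$auth" || cat "$pub" >> "$auth"
    chmod 600 "$auth"
}

# Usual workflow on the master (not run here; slave1 is a placeholder):
#   ssh-keygen -t rsa
#   ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@slave1
# or, on the slave after copying the key file over:
#   add_key /path/to/id_rsa.pub ~/.ssh/authorized_keys
```

The permission settings matter: sshd typically ignores authorized_keys when the file or its directory is writable by others.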
6. On every node, set JAVA_HOME to the same path that is configured in hadoop-env.sh.
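One way to verify this on a node is sketched below. The helpers env_java_home and check_java_home are hypothetical, and the sketch assumes hadoop-env.sh sets the value with a plain, unquoted "export JAVA_HOME=..." line.

```shell
# Hypothetical helper: print the JAVA_HOME value exported in the
# hadoop-env.sh file given as $1 (commented-out lines are ignored;
# if several lines match, the last one wins).
env_java_home() {
    sed -n 's/^[[:space:]]*export[[:space:]]\{1,\}JAVA_HOME=//p' "$1" | tail -n 1
}

# Hypothetical helper: succeed if the current shell's JAVA_HOME equals
# the value configured in the file $1.
check_java_home() {
    [ "$(env_java_home "$1")" = "$JAVA_HOME" ]
}

# Example (not run here; the conf/ location is an assumption):
#   check_java_home "$HADOOP_HOME/conf/hadoop-env.sh" || echo "JAVA_HOME differs"
```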
7. Run start-dfs.sh on the NameNode to start HDFS, and start-mapred.sh on the JobTracker to start MapReduce.
Start HDFS on the NameNode: ./start-dfs.sh
Start MapReduce on the JobTracker: ./start-mapred.sh
8. Put the Hadoop application package on NFS; each node mounts it at the app directory of the hadoop user.
9. Format HDFS before the first start: hadoop namenode -format
By default the formatted files are stored under /tmp, where they may be removed. Store them in another directory, otherwise HDFS has to be formatted again the next time.
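A sketch of how the storage location could be moved, assuming Hadoop 1.x, where (as far as I know) hadoop.tmp.dir defaults to /tmp/hadoop-${user.name} and the NameNode directory lives beneath it. The helper write_core_site and the path /home/hadoop/hadoop-data are hypothetical.

```shell
# Hypothetical helper: write a minimal core-site.xml to file $1 that
# points hadoop.tmp.dir at directory $2.
# Note: this creates a fresh file; an existing core-site.xml with other
# properties (for example fs.default.name) should be edited by hand instead.
write_core_site() {
    out=$1
    dir=$2
    cat > "$out" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>$dir</value>
  </property>
</configuration>
EOF
}

# Example (not run here; both paths are placeholders):
#   write_core_site /tmp/core-site.xml.new /home/hadoop/hadoop-data
#   hadoop namenode -format
```

The setting has to be in place on all nodes before the format, since formatting writes into whatever directory is configured at that time.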
10. HTTP addresses and ports of the web UIs:
JobTracker: http://jobtracker:50030
TaskTracker: http://slave(1-3):50060
NameNode: http://nameNode:50070
DataNode: http://dataNode:50075
Secondary NameNode: http://secondary-nameNode:50090
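The ports listed above can be wrapped in a small lookup for quick health checks. This is only a sketch; the helper names ui_port and ui_url and the role keywords are hypothetical, and the ports are the defaults from the list, which a cluster may override.

```shell
# Hypothetical helper: print the default web UI port for a daemon role.
ui_port() {
    case $1 in
        jobtracker)        echo 50030 ;;
        tasktracker)       echo 50060 ;;
        namenode)          echo 50070 ;;
        datanode)          echo 50075 ;;
        secondarynamenode) echo 50090 ;;
        *)                 return 1 ;;   # unknown role
    esac
}

# Hypothetical helper: print the web UI URL for role $1 on host $2.
ui_url() {
    port=$(ui_port "$1") || return 1
    echo "http://$2:$port"
}

# Example (not run here; host names are placeholders):
#   curl -s -o /dev/null -w '%{http_code}\n' "$(ui_url jobtracker jobtracker)"
```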