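Building a three-machine cluster
This is not really the best way to build a cluster. It is set up this way for the experiment, so that multiple nodes can be tested more easily.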
#useradd hadoop
#passwd hadoop
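$ssh-keygen -t dsa (I left the passphrase empty to make testing easier. In a real environment, install keychain instead.)
$cd .ssh
$cat id_dsa.pub > authorized_keys
$chmod 600 authorized_keys (set the permissions to 600, otherwise ssh will not read the public key)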
$ssh-copy-id slave1
$ssh-copy-id slave2
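The key-installation steps above can be wrapped in a small helper so that running them twice does not duplicate the key or lose the permissions. This is only a sketch: the function name `install_pubkey` and its arguments are made up for illustration, and it assumes the public key file holds a single line.

```shell
# install_pubkey PUBKEY_FILE [SSH_DIR]   (hypothetical helper, not part of OpenSSH)
# Appends the public key to SSH_DIR/authorized_keys unless it is already
# listed, then tightens the permissions on the directory and the file.
install_pubkey() {
    pubkey_file=$1
    ssh_dir=${2:-$HOME/.ssh}
    auth_file=$ssh_dir/authorized_keys

    mkdir -p "$ssh_dir"
    touch "$auth_file"
    # -F fixed strings, -x whole-line match, -q quiet, -f read patterns from file
    if ! grep -qxF -f "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
    chmod 700 "$ssh_dir"
    chmod 600 "$auth_file"
}
```

On the master it could be called as `install_pubkey ~/.ssh/id_dsa.pub`. Unlike `cat ... > authorized_keys`, it appends, so keys that are already in the file are kept.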
$vi core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.60.149:9000/</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/hadooptmp</value>
  </property>
</configuration>
$vi mapred-site.xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>192.168.60.149:9001</value>
  </property>
  <property>
    <name>mapred.local.dir</name>
    <value>/usr/local/hadoop/mapred/local</value>
  </property>
  <property>
    <name>mapred.system.dir</name>
    <value>/tmp/hadoop/mapred/system</value>
  </property>
</configuration>
$vi hdfs-site.xml
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/usr/local/hadoop/hdfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/usr/local/hadoop/hdfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
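The master's address appears in more than one of these files, so it is easy to mistype it in one place. One way around that is to generate the file from variables instead of editing it by hand. The sketch below does this for core-site.xml only. The function name `write_core_site` and its argument order are my own invention. The property names and values are the ones used above.

```shell
# write_core_site NAMENODE_HOST PORT TMP_DIR OUT_FILE   (hypothetical helper)
# Writes a core-site.xml with the two properties used in this setup.
write_core_site() {
    cat > "$4" <<EOF
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://$1:$2/</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>$3</value>
  </property>
</configuration>
EOF
}
```

For this cluster the call would be `write_core_site 192.168.60.149 9000 /usr/local/hadoop/hadooptmp conf/core-site.xml`, assuming the command is run from the Hadoop install directory.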
$vi masters
master
$vi slaves
master
slave1
slave2
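Before starting anything, it can help to run a quick check against every host named in the slaves file. Here is a minimal sketch of such a loop. The name `each_host` is hypothetical, and the function simply appends each host name as the last argument of whatever command it is given.

```shell
# each_host HOSTS_FILE COMMAND [ARGS...]   (hypothetical helper)
# Runs "COMMAND ARGS... HOST" once per host listed in HOSTS_FILE,
# skipping empty lines and lines that start with '#'.
each_host() {
    hosts_file=$1
    shift
    # "|| [ -n ... ]" also handles a last line without a trailing newline
    while IFS= read -r host || [ -n "$host" ]; do
        case $host in
            ''|'#'*) continue ;;
        esac
        # stdin is redirected so the command cannot swallow the host list
        "$@" "$host" < /dev/null
    done < "$hosts_file"
}
```

For example, `each_host conf/slaves ping -c 1` would run `ping -c 1 master`, then the same for slave1 and slave2.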
$bin/hadoop namenode -format
$/usr/local/hadoop/bin/start-all.sh
starting namenode, logging to /usr/local/hadoop/bin/../logs/hadoop-yueyang-namenode-master.out
master: starting datanode, logging to /usr/local/hadoop/bin/../logs/hadoop-yueyang-datanode-master.out
slave2: starting datanode, logging to /usr/local/hadoop/bin/../logs/hadoop-yueyang-datanode-slave2.out
slave1: starting datanode, logging to /usr/local/hadoop/bin/../logs/hadoop-yueyang-datanode-slave1.out
master: starting secondarynamenode, logging to /usr/local/hadoop/bin/../logs/hadoop-yueyang-secondarynamenode-master.out
starting jobtracker, logging to /usr/local/hadoop/bin/../logs/hadoop-yueyang-jobtracker-master.out
slave1: starting tasktracker, logging to /usr/local/hadoop/bin/../logs/hadoop-yueyang-tasktracker-slave1.out
slave2: starting tasktracker, logging to /usr/local/hadoop/bin/../logs/hadoop-yueyang-tasktracker-slave2.out
master: starting tasktracker, logging to /usr/local/hadoop/bin/../logs/hadoop-yueyang-tasktracker-master.out
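The startup output is long, and it is easy to miss a node that did not come up. The sketch below reduces it to one "host daemon" pair per line. The function name `list_started_daemons` is made up, and it assumes the output keeps the "[host: ]starting NAME, logging to ..." shape shown above. Lines with no host prefix are reported under the placeholder host "local".

```shell
# list_started_daemons [FILE...]   (hypothetical helper)
# Reads start-all.sh output from FILE or stdin and prints "HOST DAEMON" pairs.
list_started_daemons() {
    awk '
        {
            # If the first field ends in ":", it is a host prefix like "slave1:"
            i = ($1 ~ /:$/) ? 2 : 1
            if ($i != "starting") next
            host = (i == 2) ? substr($1, 1, length($1) - 1) : "local"
            daemon = $(i + 1)
            sub(/,$/, "", daemon)    # strip the trailing comma
            print host, daemon
        }
    ' "$@"
}
```

Used as `/usr/local/hadoop/bin/start-all.sh | list_started_daemons | sort`, the output for this cluster should list a datanode and a tasktracker for each of master, slave1 and slave2.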
$bin/hadoop dfs -mkdir pushtest
$bin/hadoop dfs -put conf/hadoop-env.sh pushtest
http://localhost:50070 (NameNode web interface)
http://localhost:50030 (JobTracker web interface)
Hadoop ships with a few simple examples. Let's test the word count one.
$bin/hadoop jar hadoop-examples-0.20.203.0.jar wordcount pushtest testoutput
Once it is running, the web interface shows the job's status, both while it runs and after it completes.
To see the actual word counts, look in the output directory:
$bin/hadoop fs -ls
drwxr-xr-x - hadoop supergroup 0 2011-07-11 11:13 /user/hadoop/test
drwxr-xr-x - hadoop supergroup 0 2011-07-11 11:15 /user/hadoop/testoutput
$bin/hadoop fs -ls testoutput
Found 3 items
-rw-r--r-- 1 hadoop supergroup 0 2011-07-11 16:31 /user/hadoop/shanyang1/_SUCCESS
drwxr-xr-x - hadoop supergroup 0 2011-07-11 16:30 /user/hadoop/shanyang1/_logs
-rw-r--r-- 1 hadoop supergroup 32897 2011-07-11 16:31 /user/hadoop/shanyang1/part-r-00000
$bin/hadoop fs -cat /user/hadoop/shanyang1/part-r-00000
This prints the detailed counts.
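The part-r-00000 file is long, so it is handy to look at only the most frequent words. A small sketch follows. It assumes each line of the output is a word, a tab, and a count, which is what the wordcount example writes by default. The function name `top_words` is hypothetical.

```shell
# top_words N [FILE...]   (hypothetical helper)
# Prints the N lines with the highest count from "word<TAB>count" input,
# reading FILE or stdin. Ties are broken by the word, in byte order.
top_words() {
    n=$1
    shift
    tab=$(printf '\t')
    LC_ALL=C sort -t "$tab" -k2,2nr -k1,1 "$@" | head -n "$n"
}
```

For example: `bin/hadoop fs -cat /user/hadoop/shanyang1/part-r-00000 | top_words 10`.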