2. Install ssh and rsync:
$ sudo apt-get install ssh
$ sudo apt-get install rsync
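As a quick sanity check (not part of the original steps), confirm both tools are on the PATH:
$ ssh -V
$ rsync --version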
5. Extract hadoop-1.1.0.tar.gz into /home/user1/hadoop (it is best to extract the archive under the user's own home directory, otherwise you may run into permission problems):
$ tar -zxvf hadoop-1.1.0.tar.gz -C /home/user1/hadoop
(sudo is unnecessary when extracting into your own home directory, and would leave the files owned by root, causing exactly the permission problems mentioned above.)
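If the archive was already extracted with sudo, ownership can be repaired afterwards (a sketch, assuming the user and group are both user1):
$ sudo chown -R user1:user1 /home/user1/hadoop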
6. Edit conf/hadoop-env.sh and set JAVA_HOME; change
# export JAVA_HOME=/usr/lib/j2sdk1.5-sun
to:
export JAVA_HOME=/opt/java/jdk1.7.0_09
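To verify the path points at a working JDK (assuming the JDK really is installed under /opt/java/jdk1.7.0_09):
$ /opt/java/jdk1.7.0_09/bin/java -version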
7. Test the standalone installation by running the bundled grep example:
$ mkdir input
$ cp conf/*.xml input
$ bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
$ cat output/*
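Note that Hadoop refuses to run a job whose output directory already exists, so remove it before re-running the example:
$ rm -r output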
8. Set up passwordless ssh to localhost:
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
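To verify, the following should now log in without prompting for a password:
$ ssh localhost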
9. Configure conf/core-site.xml:
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/user1/hadoop/hadoop-datastore/hadoop-${user.name}</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.1.107:9000</value>
  </property>
</configuration>
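The directory named by hadoop.tmp.dir must be writable by the Hadoop user; it does no harm to create it up front (assuming the path configured above):
$ mkdir -p /home/user1/hadoop/hadoop-datastore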
10. Configure conf/hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
dfs.replication is the default number of replicas kept of each block; 1 is appropriate for a single-node setup.
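On a real multi-node cluster you would normally leave this at HDFS's default of 3, or set it explicitly (a sketch, values illustrative):
<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>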
11. Configure conf/mapred-site.xml:
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>192.168.1.107:9001</value>
  </property>
</configuration>
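Optionally, the number of concurrent map and reduce slots per TaskTracker can be capped in the same file; the property names below are standard Hadoop 1.x settings, the values are illustrative only:
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>2</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>2</value>
</property>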
12. Format the distributed filesystem:
$ bin/hadoop namenode -format
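If the filesystem is ever reformatted later, clear the old data directory first, otherwise DataNodes may fail to start with an "Incompatible namespaceIDs" error (assuming the hadoop.tmp.dir location configured above):
$ rm -rf /home/user1/hadoop/hadoop-datastore/*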
13. Start the Hadoop daemons:
$ bin/start-all.sh
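Once started, jps (shipped with the JDK) should list the five Hadoop processes: NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker:
$ jps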
The web interfaces for the NameNode and the JobTracker are now available at:
NameNode - http://localhost:50070/
JobTracker - http://localhost:50030/
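Before output can be fetched in the next step, the example job has to be run on the cluster itself; in the standard quickstart this is done by copying the input into HDFS and launching the same grep example (reconstructed here, as these commands appear to be missing from the original):
$ bin/hadoop fs -put conf input
$ bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'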
Copy the output files from the distributed filesystem to the local filesystem and examine them:
$ bin/hadoop fs -get output output
$ cat output/*
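Alternatively, the output can be viewed directly on HDFS, and the daemons stopped when done:
$ bin/hadoop fs -cat output/*
$ bin/stop-all.sh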