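First, download Hadoop from the Apache website and unpack it. I won't walk through that step; just note that the package is large, roughly 186 MB.
After unpacking, the directory structure looks like this: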
[root@com23 hadoop-2.6.0]# ll
total 64
drwxr-xr-x 2 20000 20000  4096 Nov 14 05:20 bin
drwxr-xr-x 3 20000 20000  4096 Nov 14 05:20 etc
drwxr-xr-x 2 20000 20000  4096 Nov 14 05:20 include
drwxr-xr-x 2 root  root   4096 Jan 14 14:52 input
drwxr-xr-x 3 20000 20000  4096 Nov 14 05:20 lib
drwxr-xr-x 2 20000 20000  4096 Nov 14 05:20 libexec
-rw-r--r-- 1 20000 20000 15429 Nov 14 05:20 LICENSE.txt
drwxr-xr-x 2 root  root   4096 Jan 14 15:23 logs
-rw-r--r-- 1 20000 20000   101 Nov 14 05:20 NOTICE.txt
drwxr-xr-x 2 root  root   4096 Jan 14 14:53 output
-rw-r--r-- 1 20000 20000  1366 Nov 14 05:20 README.txt
drwxr-xr-x 2 20000 20000  4096 Nov 14 05:20 sbin
drwxr-xr-x 4 20000 20000  4096 Nov 14 05:20 share

As a side note, here is a comparison of the YARN framework with the earlier MapReduce framework: http://www.ibm.com/developerworks/cn/opensource/os-cn-hadoop-yarn/
The following example copies the unpacked conf directory to use as input and then finds and displays every match of the given regular expression. Output is written to the given output directory.
$ mkdir input
$ cp etc/hadoop/*.xml input
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar grep input output 'dfs[a-z.]+'
$ cat output/*

Next, let's look at pseudo-distributed mode.
Two configuration files are involved, both under hadoop-2.6.0/etc/hadoop.
core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
hdfs-site.xml
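<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

With both files configured, don't forget to set JAVA_HOME in hadoop-env.sh and in yarn-env.sh (the latter only matters if you use YARN, but you might as well set both at once).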
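As a minimal sketch of the JAVA_HOME step, the snippet below checks a candidate JDK directory and, if it looks usable, prints the export line to paste into hadoop-env.sh and yarn-env.sh. The path /usr/lib/jvm/java and the helper is_valid_java_home are assumptions made for illustration; neither comes with Hadoop.

```shell
# Hypothetical JDK location; replace it with the path on your machine.
JAVA_HOME_CANDIDATE="${JAVA_HOME_CANDIDATE:-/usr/lib/jvm/java}"

# Succeed only if the directory contains an executable bin/java.
is_valid_java_home() {
    [ -n "$1" ] && [ -x "$1/bin/java" ]
}

if is_valid_java_home "$JAVA_HOME_CANDIDATE"; then
    # This is the line to add to hadoop-env.sh and yarn-env.sh.
    echo "export JAVA_HOME=$JAVA_HOME_CANDIDATE"
else
    echo "no usable JDK found at $JAVA_HOME_CANDIDATE" >&2
fi
```

Setting the value explicitly in the env scripts matters because the daemons are launched over ssh and may not inherit JAVA_HOME from your interactive shell.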
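Optionally, add the YARN configuration so that MapReduce runs on the YARN framework. With this setting, jobs need the YARN daemons too; they are started with sbin/start-yarn.sh.
mapred-site.xml (in 2.6.0, copy it from mapred-site.xml.template if it does not exist yet)
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>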
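Next, set up passwordless ssh login to localhost. (Recent OpenSSH releases disable DSA keys by default; there, use -t rsa with matching file names instead.)
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys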
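If ssh localhost still asks for a password after this, overly permissive file modes are a common cause, since sshd with StrictModes enabled ignores keys in directories that others can write to. The helper below is a small sketch of the usual fix; the function name fix_ssh_perms is made up for this post.

```shell
# Tighten permissions on an ssh directory so that sshd accepts the keys in it.
# $1 is the ssh directory; it defaults to ~/.ssh.
fix_ssh_perms() {
    dir="${1:-$HOME/.ssh}"
    [ -d "$dir" ] || return 1
    # Only the owner may enter or modify the directory.
    chmod 700 "$dir"
    # Only the owner may read or write the list of authorized keys.
    if [ -f "$dir/authorized_keys" ]; then
        chmod 600 "$dir/authorized_keys"
    fi
}
```

Run fix_ssh_perms with no arguments, then try ssh localhost again; it should log in without prompting.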
1. Format the file system
bin/hdfs namenode -format
2. Start the NameNode and DataNode
sbin/start-dfs.sh

Once this step finishes, we can open the Hadoop monitoring page to see the state of each module: http://localhost:50070
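Besides the web page, the JDK's jps tool lists the running Java processes, which gives a quick way to confirm the daemons came up. The sketch below reads jps output and prints any of the three HDFS daemons that are missing; the function name missing_hdfs_daemons is my own, not part of Hadoop.

```shell
# Read `jps` output on stdin and print the HDFS daemons that are NOT listed.
# Prints nothing when NameNode, DataNode and SecondaryNameNode are all present.
missing_hdfs_daemons() {
    listing=$(cat)
    for daemon in NameNode DataNode SecondaryNameNode; do
        # jps prints "<pid> <name>"; compare the name against the second field.
        if ! printf '%s\n' "$listing" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}

# Typical use on the Hadoop machine (requires a JDK on PATH):
#   jps | missing_hdfs_daemons
```

If a name is printed, the matching log file under the logs directory is the first place to look.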
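The 2.6 interface looks pretty slick!
Next, create the user directories in HDFS:
bin/hdfs dfs -mkdir /user
bin/hdfs dfs -mkdir /user/chiwei
When that is done, go back to the web page and take a look.
The directories we just created now show up there.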
sh bin/hdfs dfs -put input /user/chiwei
This uploads the local input folder and its contents into the HDFS directory we just created, as /user/chiwei/input.
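sh bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar grep /user/chiwei/input output 'dfs[a-z.]+'
The command above runs the grep example against the files we just uploaded.
The job has produced its output.
To view it, look in Hadoop's file system, not the Linux file system. Because output was given as a relative path, it is resolved against the current user's HDFS home directory, which is /user/root here.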
[root@com23 hadoop-2.6.0]# sh bin/hdfs dfs -cat /user/root/output/*
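To get a feel for what the grep example computes, here is a rough local approximation with standard tools: it extracts every match of the same regular expression and counts how often each one occurs. This is not how the MapReduce job is implemented, only a sketch of the result it produces, and the function name count_matches is made up for illustration.

```shell
# Count occurrences of each match of an extended regex in the given files,
# most frequent first, one "<count><TAB><match>" line per distinct match.
count_matches() {
    pattern="$1"
    shift
    # -o prints only the matched text, -h omits file names, -E enables ERE.
    grep -ohE "$pattern" "$@" | sort | uniq -c | sort -rn |
        awk '{ print $1 "\t" $2 }'
}

# Example, run from the Hadoop directory:
#   count_matches 'dfs[a-z.]+' input/*.xml
```

Comparing this against the HDFS output is a cheap sanity check that the job read the files you meant it to read.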
Finally, shut down the file system: the DataNode, the NameNode, and the secondary NameNode.
[root@com23 hadoop-2.6.0]# sh sbin/stop-dfs.sh
15/01/14 15:56:44 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Stopping namenodes on [localhost]
localhost: stopping namenode
localhost: stopping datanode
Stopping secondary namenodes [0.0.0.0]
0.0.0.0: stopping secondarynamenode
15/01/14 15:57:07 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[root@com23 hadoop-2.6.0]#