Practice environment:

Operating system: Ubuntu 16.04 LTS

Hadoop version: Hadoop 2.7.1

1. Configure core-site.xml

hadoop@dblab:/usr/local/hadoop/etc/hadoop$ vim core-site.xml

       

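<configuration>
        <property>
                <name>hadoop.tmp.dir</name>
                <value>file:/usr/local/hadoop/tmp</value>
        </property>
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://localhost:9000</value>
        </property>
</configuration>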
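For a repeatable setup, the same two settings (hadoop.tmp.dir and fs.defaultFS) can be written without opening an editor. The following is a minimal sketch, not part of the original walkthrough: the function name write_core_site is hypothetical, and it overwrites any existing core-site.xml in the target directory, so it is best tried on a scratch directory first.

```shell
#!/bin/sh
# write_core_site DIR
# Hypothetical helper: writes a complete core-site.xml into DIR, using the
# values from this walkthrough. WARNING: replaces an existing file.
write_core_site() {
    conf_dir=$1
    cat > "$conf_dir/core-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/usr/local/hadoop/tmp</value>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
EOF
}

# Example (assumes the install path used in this walkthrough):
#   write_core_site /usr/local/hadoop/etc/hadoop
```

The quoted heredoc delimiter ('EOF') keeps the shell from expanding anything inside the XML, so the file is written exactly as typed.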

       

2. Configure hdfs-site.xml

 hadoop@dblab:/usr/local/hadoop/etc/hadoop$ vim hdfs-site.xml

       

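<configuration>
        <property>
                <name>dfs.replication</name>
                <value>1</value>
        </property>
        <property>
                <name>dfs.namenode.name.dir</name>
                <value>file:/usr/local/hadoop/tmp/dfs/name</value>
        </property>
        <property>
                <name>dfs.datanode.data.dir</name>
                <value>file:/usr/local/hadoop/tmp/dfs/data</value>
        </property>
</configuration>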
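Hadoop's site files store each setting as a property element with a name child and a value child. To double-check what a file actually contains after editing, a small reader like the one sketched here can help. The function name get_property is hypothetical, and the parsing is deliberately simple: it assumes each name and value sits on a single line and does not handle XML comments or entities.

```shell
#!/bin/sh
# get_property FILE NAME
# Hypothetical helper: prints the <value> that follows <name>NAME</name>
# in a Hadoop *-site.xml file. Prints nothing if NAME is not found.
get_property() {
    awk -v want="$2" '
        /<name>/ {
            n = $0
            sub(/.*<name>/, "", n)      # strip everything up to the tag
            sub(/<\/name>.*/, "", n)    # strip the closing tag and the rest
        }
        /<value>/ {
            v = $0
            sub(/.*<value>/, "", v)
            sub(/<\/value>.*/, "", v)
            if (n == want) { print v; exit }
        }
    ' "$1"
}

# Example (assumes the install path used in this walkthrough):
#   get_property /usr/local/hadoop/etc/hadoop/hdfs-site.xml dfs.replication
```

For anything beyond a quick sanity check, a real XML parser is the safer choice.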

       

3. Format the NameNode

hadoop@dblab:/usr/local/hadoop$ ./bin/hdfs namenode -format

 Re-format filesystem in Storage Directory /usr/local/hadoop/tmp/dfs/name ? (Y or N)

 19/05/16 14:23:44 INFO namenode.FSImage: Allocated new BlockPoolId: BP-748770776-127.0.0.1-1557987824492

19/05/16 14:23:44 INFO common.Storage: Storage directory /usr/local/hadoop/tmp/dfs/name has been successfully formatted.

19/05/16 14:23:45 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0

19/05/16 14:23:45 INFO util.ExitUtil: Exiting with status 0

19/05/16 14:23:45 INFO namenode.NameNode: SHUTDOWN_MSG:

/************************************************************

SHUTDOWN_MSG: Shutting down NameNode at java.net.UnknownHostException: dblab: dblab: Name or service not known

************************************************************/
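Two lines in the log above indicate that the format itself went through: "has been successfully formatted" and "Exiting with status 0". The UnknownHostException in the shutdown banner is a separate hostname problem. As a rough sketch, a saved copy of the output can be checked for both markers. The function name format_succeeded and the file name format.log are hypothetical.

```shell
#!/bin/sh
# format_succeeded LOGFILE
# Hypothetical helper: succeeds only if the captured output of
# `hdfs namenode -format` contains both success markers seen in the log.
format_succeeded() {
    log=$1
    grep -q 'has been successfully formatted' "$log" &&
        grep -q 'Exiting with status 0' "$log"
}

# Example (format.log is an assumed file name):
#   ./bin/hdfs namenode -format 2>&1 | tee format.log
#   format_succeeded format.log && echo "NameNode formatted"
```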

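Explanation of the problem above: the UnknownHostException in the shutdown message means the hostname dblab could not be resolved. The machine's hostname is dblab, but /etc/hosts maps 127.0.1.1 to dblab-VirtualBox, so the entry has to be changed to dblab: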

hadoop@dblab:/usr/local/hadoop$ hostname

dblab

hadoop@dblab:/usr/local/hadoop$ cat /etc/hosts

127.0.0.1       localhost

127.0.1.1       dblab-VirtualBox

# The following lines are desirable for IPv6 capable hosts

::1     ip6-localhost ip6-loopback

fe00::0 ip6-localnet

ff00::0 ip6-mcastprefix

ff02::1 ip6-allnodes

ff02::2 ip6-allrouters

hadoop@dblab:/usr/local/hadoop$ sudo vim /etc/hosts

127.0.1.1       dblab

hadoop@dblab:/usr/local/hadoop$ sudo /etc/init.d/networking restart
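The mismatch between the output of hostname and the entries in /etc/hosts can be detected before Hadoop complains. The sketch below checks whether a name appears as a hostname or alias in a hosts-format file. The function name hosts_has_name is hypothetical, and the check only looks at the file; it does not consult DNS or other name services.

```shell
#!/bin/sh
# hosts_has_name NAME [FILE]
# Hypothetical helper: succeeds when NAME appears as a hostname or alias
# (any field after the address) on a non-comment line of FILE.
# FILE defaults to /etc/hosts.
hosts_has_name() {
    name=$1
    file=${2:-/etc/hosts}
    awk -v want="$name" '
        { sub(/#.*/, "") }    # drop comments; awk re-splits the fields
        { for (i = 2; i <= NF; i++) if ($i == want) found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$file"
}

# Example:
#   hosts_has_name "$(hostname)" || echo "add $(hostname) to /etc/hosts"
```

The comparison is an exact match on whole fields, so dblab-VirtualBox does not count as an entry for dblab.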

4. Start Hadoop

hadoop@dblab:/usr/local/hadoop$ ./sbin/start-dfs.sh

Starting namenodes on [localhost]

localhost: starting namenode, logging to /usr/local/hadoop/logs/hadoop-hadoop-namenode-dblab.out

localhost: starting datanode, logging to /usr/local/hadoop/logs/hadoop-hadoop-datanode-dblab.out

Starting secondary namenodes [0.0.0.0]

0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop/logs/hadoop-hadoop-secondarynamenode-dblab.out

# The start attempt above failed

hadoop@dblab:/usr/local/hadoop$ ./sbin/start-dfs.sh

Starting namenodes on [localhost]

localhost: starting namenode, logging to /usr/local/hadoop/logs/hadoop-hadoop-namenode-dblab.out

localhost: starting datanode, logging to /usr/local/hadoop/logs/hadoop-hadoop-datanode-dblab.out

Starting secondary namenodes [0.0.0.0]

0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop/logs/hadoop-hadoop-secondarynamenode-dblab.out

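HDFS web management interface (in Hadoop 2.x the NameNode serves it on port 50070 by default, e.g. http://localhost:50070)

5. Run a Hadoop pseudo-distributed example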

hadoop@dblab:/usr/local/hadoop$ cd /usr/local/hadoop

hadoop@dblab:/usr/local/hadoop$ ./bin/hdfs dfs -mkdir -p /user/hadoop

hadoop@dblab:/usr/local/hadoop$ ./bin/hdfs dfs -mkdir input

hadoop@dblab:/usr/local/hadoop$ ./bin/hdfs dfs -put ./etc/hadoop/*.xml input

hadoop@dblab:/usr/local/hadoop$ ./bin/hdfs dfs -ls input

Found 8 items

-rw-r--r--   1 hadoop supergroup       4436 2019-05-16 14:52 input/capacity-scheduler.xml

-rw-r--r--   1 hadoop supergroup        965 2019-05-16 14:52 input/core-site.xml

-rw-r--r--   1 hadoop supergroup       9683 2019-05-16 14:52 input/hadoop-policy.xml

-rw-r--r--   1 hadoop supergroup       1080 2019-05-16 14:52 input/hdfs-site.xml

-rw-r--r--   1 hadoop supergroup        620 2019-05-16 14:52 input/httpfs-site.xml

-rw-r--r--   1 hadoop supergroup       3518 2019-05-16 14:52 input/kms-acls.xml

-rw-r--r--   1 hadoop supergroup       5511 2019-05-16 14:52 input/kms-site.xml

-rw-r--r--   1 hadoop supergroup        690 2019-05-16 14:52 input/yarn-site.xml

hadoop@dblab:/usr/local/hadoop$ ./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar  grep input output 'dfs[a-z.]+'

hadoop@dblab:/usr/local/hadoop$ ./bin/hdfs dfs -cat output/*

1       dfsadmin

1       dfs.replication

1       dfs.namenode.name.dir

1       dfs.datanode.data.dir
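The grep example extracts every string that matches the regular expression from the input files and counts how often each one occurs. For comparison, roughly the same result can be computed locally with standard tools. This is only a sketch for checking the job's output on small inputs: the function name count_matches is hypothetical, and the -o and -h options assume GNU or BSD grep.

```shell
#!/bin/sh
# count_matches REGEX FILE...
# Hypothetical helper: prints "count<TAB>match" for every distinct string
# matching REGEX in the given files, highest count first.
count_matches() {
    regex=$1
    shift
    # -o prints only the matched text, -h omits file names, -E enables
    # extended regular expressions (needed for the + quantifier).
    grep -ohE -- "$regex" "$@" | sort | uniq -c | sort -rn |
        awk '{ print $1 "\t" $2 }'
}

# Example, run from /usr/local/hadoop:
#   count_matches 'dfs[a-z.]+' ./etc/hadoop/*.xml
```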

6. Stop Hadoop

hadoop@dblab:/usr/local/hadoop$ ./sbin/stop-dfs.sh   # stop Hadoop

7. Start YARN

$ ./sbin/start-dfs.sh

hadoop@dblab:/usr/local/hadoop$ vim  /usr/local/hadoop/etc/hadoop/yarn-site.xml



<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>

 


hadoop@dblab:/usr/local/hadoop$ ./sbin/start-dfs.sh

Starting namenodes on [localhost]

localhost: starting namenode, logging to /usr/local/hadoop/logs/hadoop-hadoop-namenode-dblab.out

localhost: starting datanode, logging to /usr/local/hadoop/logs/hadoop-hadoop-datanode-dblab.out

Starting secondary namenodes [0.0.0.0]

0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop/logs/hadoop-hadoop-secondarynamenode-dblab.out

hadoop@dblab:/usr/local/hadoop$ ./sbin/start-dfs.sh 

Starting namenodes on [localhost]

localhost: namenode running as process 28914. Stop it first.

localhost: datanode running as process 29067. Stop it first.

Starting secondary namenodes [0.0.0.0]

0.0.0.0: secondarynamenode running as process 29261. Stop it first.

hadoop@dblab:/usr/local/hadoop$ vim ./etc/hadoop/mapred-site.xml 

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

 

hadoop@dblab:/usr/local/hadoop$ vim ./etc/hadoop/yarn-site.xml 

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>

 


hadoop@dblab:/usr/local/hadoop$ ./sbin/start-yarn.sh   # start YARN

starting yarn daemons

starting resourcemanager, logging to /usr/local/hadoop/logs/yarn-hadoop-resourcemanager-dblab.out

localhost: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-hadoop-nodemanager-dblab.out

hadoop@dblab:/usr/local/hadoop$ ./sbin/mr-jobhistory-daemon.sh start historyserver  # start the history server

starting historyserver, logging to /usr/local/hadoop/logs/mapred-hadoop-historyserver-dblab.out

hadoop@dblab:/usr/local/hadoop$ jps   # list the running Java processes

29809 ResourceManager

28034 RunJar

28914 NameNode

30424 Jps

30248 JobHistoryServer

29067 DataNode

29933 NodeManager

29261 SecondaryNameNode
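Reading the jps listing by eye works for a single machine, but the check is easy to script. The sketch below reads jps output on standard input and prints every expected daemon that is missing. The function name missing_daemons is hypothetical, and the expected list is simply the six daemons started in this walkthrough.

```shell
#!/bin/sh
# missing_daemons
# Hypothetical helper: reads `jps` output on stdin and prints the name of
# each expected Hadoop daemon that does not appear in it.
missing_daemons() {
    expected="NameNode DataNode SecondaryNameNode ResourceManager NodeManager JobHistoryServer"
    listing=$(cat)
    for d in $expected; do
        # Compare the second column exactly, so that "NameNode" is not
        # satisfied by a "SecondaryNameNode" line.
        printf '%s\n' "$listing" |
            awk -v want="$d" '$2 == want { ok = 1 } END { exit(ok ? 0 : 1) }' ||
            echo "$d"
    done
}

# Example:
#   jps | missing_daemons
```

An empty result means all six daemons are running.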

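View job status through the web interface (the ResourceManager serves it on port 8088 by default, e.g. http://localhost:8088/cluster)

The scripts for stopping YARN and Hadoop are as follows: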

hadoop@dblab:/usr/local/hadoop$ ./sbin/stop-yarn.sh

stopping yarn daemons

stopping resourcemanager

localhost: stopping nodemanager

no proxyserver to stop

hadoop@dblab:/usr/local/hadoop$ ./sbin/mr-jobhistory-daemon.sh stop historyserver

stopping historyserver

hadoop@dblab:/usr/local/hadoop$ ./sbin/stop-dfs.sh

Stopping namenodes on [localhost]

localhost: stopping namenode

localhost: stopping datanode

Stopping secondary namenodes [0.0.0.0]

0.0.0.0: stopping secondarynamenode
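The three stop commands above can be collected into one helper so that they always run in the same order: YARN first, then the history server, then HDFS. This is a minimal sketch. The function name stop_all and the DRY_RUN variable are hypothetical, and the default path assumes the install location used in this walkthrough.

```shell
#!/bin/sh
# stop_all [HADOOP_HOME]
# Hypothetical helper: runs the three stop scripts in the order used above.
# With DRY_RUN set to a non-empty value, the commands are printed instead
# of executed.
stop_all() {
    home=${1:-/usr/local/hadoop}
    run=${DRY_RUN:+echo}    # expands to "echo" in dry-run mode, else empty
    $run "$home/sbin/stop-yarn.sh"
    $run "$home/sbin/mr-jobhistory-daemon.sh" stop historyserver
    $run "$home/sbin/stop-dfs.sh"
}

# Example:
#   DRY_RUN=1 stop_all      # show what would run
#   stop_all                # actually stop the daemons
```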