1. Pull the image
The image can be pulled from the Alauda (灵雀云) registry:
docker pull registry.alauda.cn/sequenceiq/hadoop-docker
2. List the images
docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
registry.alauda.cn/sequenceiq/hadoop-docker latest ac0dcee6a740 23 months ago 1.766 GB
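3. Start a container from the image
Add -d to run the container in the background, or omit it to watch the startup process in the foreground: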
docker run -it --name hadoop registry.alauda.cn/sequenceiq/hadoop-docker /etc/bootstrap.sh -bash
On success you are dropped straight into a shell inside the container:
bash-4.1#
If the container was started with -d, open a shell in it with docker exec:
docker exec -it hadoop bash
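The detached variant can be sketched as follows. This is a sketch under assumptions, not a verified recipe: the trailing -d passed to /etc/bootstrap.sh is assumed to be that script's own keep-alive option in the sequenceiq image (it is separate from Docker's -d flag), and the function names start_detached and enter_container are hypothetical.

```shell
# Assumption: same image and container name as in the steps above.
IMAGE="registry.alauda.cn/sequenceiq/hadoop-docker"
NAME="hadoop"

# Start the container in the background.
# Docker's -d detaches the container; the trailing -d is assumed to make
# /etc/bootstrap.sh keep running instead of opening an interactive shell.
start_detached() {
  docker run -d --name "$NAME" "$IMAGE" /etc/bootstrap.sh -d
}

# Open an interactive shell in the already running container.
enter_container() {
  docker exec -it "$NAME" bash
}
```

Call start_detached once, then enter_container whenever a shell is needed.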
4. Change to the Hadoop directory
bash-4.1# cd $HADOOP_PREFIX
bash-4.1# pwd
/usr/local/hadoop
5. Create the input directory
bash-4.1# bin/hdfs dfs -mkdir /input
bash-4.1# bin/hdfs dfs -chmod -R 777 /input
6. Create the input files input1.txt and input2.txt and put them into /input on the Hadoop file system
bash-4.1# vi input1.txt
bash-4.1# bin/hdfs dfs -put input1.txt /input
bash-4.1# vi input2.txt
bash-4.1# bin/hdfs dfs -put input2.txt /input
Contents of input1.txt:
Hello World Application for Apache Hadoop
Hello World and Hello Apache Hadoop
Contents of input2.txt:
Hello World
Hello Apache Hadoop
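For a scripted setup, the two files can also be written without an editor. A minimal sketch, assuming it is run from $HADOOP_PREFIX inside the container; make_inputs, upload_inputs and the HDFS override variable are hypothetical names introduced only for this example.

```shell
# Write the two sample files into the directory given as $1
# (default: current directory).
make_inputs() {
  dir="${1:-.}"
  printf '%s\n' \
    'Hello World Application for Apache Hadoop' \
    'Hello World and Hello Apache Hadoop' > "$dir/input1.txt"
  printf '%s\n' \
    'Hello World' \
    'Hello Apache Hadoop' > "$dir/input2.txt"
}

# Upload both files to /input in HDFS in a single -put call.
# HDFS is a hypothetical override; by default bin/hdfs is used.
upload_inputs() {
  dir="${1:-.}"
  "${HDFS:-bin/hdfs}" dfs -put "$dir/input1.txt" "$dir/input2.txt" /input
}
```

Files written this way are 78 and 32 bytes long, which should match the sizes reported by the directory listing in step 7.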
7. Check that the files were uploaded
bash-4.1# bin/hdfs dfs -ls /input
Found 2 items
-rw-r--r-- 1 root supergroup 78 2017-06-16 02:31 /input/input1.txt
-rw-r--r-- 1 root supergroup 32 2017-06-16 02:32 /input/input2.txt
8. Run the Hadoop MapReduce job, passing wordcount and the input and output directories
bash-4.1# bin/hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.0.jar wordcount /input /output
The /output directory is created automatically by the job.
INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
……
……
File Input Format Counters
Bytes Read=110
File Output Format Counters
Bytes Written=60
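Because the job creates /output itself, running it again while the directory from an earlier run still exists fails with an "output directory already exists" error. A minimal sketch of a rerun helper, assuming it is run from $HADOOP_PREFIX; rerun_wordcount and the HDFS and HADOOP override variables are hypothetical names.

```shell
# Hypothetical overrides; the defaults match the commands used above.
HDFS="${HDFS:-bin/hdfs}"
HADOOP="${HADOOP:-bin/hadoop}"
EXAMPLES_JAR="/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.0.jar"

rerun_wordcount() {
  # "dfs -test -d" returns 0 when the path exists and is a directory.
  if "$HDFS" dfs -test -d /output; then
    # Remove the previous results so the job can create /output again.
    "$HDFS" dfs -rm -r /output
  fi
  "$HADOOP" jar "$EXAMPLES_JAR" wordcount /input /output
}
```

Note that this deletes the previous results; copy them elsewhere first if they are still needed.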
9. List the output directory
bash-4.1# bin/hdfs dfs -ls /output
Found 2 items
-rw-r--r-- 1 root supergroup 0 2017-06-16 02:36 /output/_SUCCESS
-rw-r--r-- 1 root supergroup 60 2017-06-16 02:36 /output/part-r-00000
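The empty _SUCCESS file is the marker written when the job completes successfully, so a script can test for it before reading the results. A small sketch, assuming it is run from $HADOOP_PREFIX; job_succeeded and the HDFS override variable are hypothetical names.

```shell
# Hypothetical override; by default bin/hdfs is used as in the steps above.
HDFS="${HDFS:-bin/hdfs}"

# Return 0 if the _SUCCESS marker exists in the given output directory
# (default: /output), non-zero otherwise.
job_succeeded() {
  out_dir="${1:-/output}"
  "$HDFS" dfs -test -e "$out_dir/_SUCCESS"
}
```

For example: job_succeeded /output && bin/hdfs dfs -cat /output/part-r-00000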
10. View the result
bash-4.1# bin/hdfs dfs -cat /output/part-r-00000
Apache 3
Application 1
Hadoop 3
Hello 5
World 3
and 1
for 1
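As a sanity check, the counts can be reproduced locally with standard Unix tools. A rough sketch, assuming copies of the two input files are available on the local file system; count_words is a hypothetical helper and only approximates the tokenization of the Hadoop example by splitting on whitespace.

```shell
# Count whitespace-separated words in the given files and print one
# "word<TAB>count" line per distinct word.
count_words() {
  cat "$@" \
    | tr -s '[:space:]' '\n' \
    | grep -v '^$' \
    | LC_ALL=C sort \
    | uniq -c \
    | awk '{ printf "%s\t%s\n", $2, $1 }'
}
```

Usage: count_words input1.txt input2.txt. Sorting with LC_ALL=C orders words by byte value, so capitalized words come before lowercase ones, the same ordering seen in the job output.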
11. Exit the container
bash-4.1# exit
12. Stop the container
[root@iz2ze7sp5njgaf81ekoudez ~]# docker stop hadoop
hadoop
13. Remove the container
[root@iz2ze7sp5njgaf81ekoudez ~]# docker rm hadoop
hadoop
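Stopping and removing can also be done in one command with docker rm -f. A minimal sketch; remove_container is a hypothetical helper name, and the default container name is the one used in the steps above.

```shell
# Force-remove the container named in $1 (default: hadoop).
# -f removes the container even if it is still running; the container is
# killed rather than stopped gracefully.
remove_container() {
  name="${1:-hadoop}"
  docker rm -f "$name"
}
```

Prefer the separate docker stop and docker rm commands when the processes in the container should get a chance to shut down cleanly.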