Create a Hadoop 1.1.1 Cluster on CentOS 6.5 (host network)

1. Prepare three CentOS hosts for the PoC

10.28.241.174 shuynh-gecko1
10.28.241.172 shuynh-gecko2
10.28.241.175 shuynh-gecko3

root@shuynh-gecko1:~# cat /etc/redhat-release


2. Get the Docker build scripts on each node


root@shuynh-gecko1:/# git clone https://github.com/jay-lau/hadoop-docker-master-cluster-111.git
Cloning into 'hadoop-docker-master-cluster-111'...
remote: Counting objects: 16, done.
remote: Compressing objects: 100% (13/13), done.
remote: Total 16 (delta 1), reused 16 (delta 1)
Unpacking objects: 100% (16/16), done.
Checking connectivity... done.

root@shuynh-gecko2:/# git clone https://github.com/jay-lau/hadoop-docker-master-cluster-111.git

root@shuynh-gecko3:/# git clone https://github.com/jay-lau/hadoop-docker-master-cluster-111.git

3. Build the Hadoop Docker image on each node

# Enter the folder containing the hadoop-docker-master-cluster-111 scripts.

3.1 Build the image on shuynh-gecko1

[root@shuynh-gecko1 ~]# cd /root/gyliu/hadoop-docker-master-111
[root@shuynh-gecko1 hadoop-docker-master-111]# pwd
/root/gyliu/hadoop-docker-master-111
[root@shuynh-gecko1 hadoop-docker-master-111]# docker build -t="sequenceiq/hadoop-cluster-docker:1.1.1" .

[root@shuynh-gecko1 ~]# docker images | grep 1.1.1                                                          
sequenceiq/hadoop-cluster-docker    1.1.1               742b4ff50735        About a minute ago   805.7 MB


3.2 Build the image on shuynh-gecko2

[root@shuynh-gecko2 ~]# cd /root/gyliu/hadoop-docker-master-111
[root@shuynh-gecko2 hadoop-docker-master-111]# pwd
/root/gyliu/hadoop-docker-master-111
[root@shuynh-gecko2 hadoop-docker-master-111]# docker build -t="sequenceiq/hadoop-cluster-docker:1.1.1" .

3.3 Build the image on shuynh-gecko3

[root@shuynh-gecko3 ~]# cd /root/gyliu/hadoop-docker-master-111
[root@shuynh-gecko3 hadoop-docker-master-111]# pwd
/root/gyliu/hadoop-docker-master-111
[root@shuynh-gecko3 hadoop-docker-master-111]# docker build -t="sequenceiq/hadoop-cluster-docker:1.1.1" .
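
Building the same image separately on every node works, but as an alternative you could build it once on shuynh-gecko1 and distribute it with docker save / docker load. A minimal sketch, assuming there is enough space in /tmp and that you can scp between the hosts:

# On shuynh-gecko1: export the built image to a tar file
[root@shuynh-gecko1 ~]# docker save -o /tmp/hadoop-cluster-docker-1.1.1.tar sequenceiq/hadoop-cluster-docker:1.1.1
# Copy it to the other nodes
[root@shuynh-gecko1 ~]# scp /tmp/hadoop-cluster-docker-1.1.1.tar shuynh-gecko2:/tmp/
[root@shuynh-gecko1 ~]# scp /tmp/hadoop-cluster-docker-1.1.1.tar shuynh-gecko3:/tmp/
# On shuynh-gecko2 and shuynh-gecko3: load the image
[root@shuynh-gecko2 ~]# docker load -i /tmp/hadoop-cluster-docker-1.1.1.tar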

4. Configure /etc/hosts and passwordless SSH for each node

# Configure the /etc/hosts file on every node
10.28.241.174 shuynh-gecko1
10.28.241.172 shuynh-gecko2
10.28.241.175 shuynh-gecko3

# Set up passwordless SSH between the nodes

....
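
The details are omitted above; as a reference, a minimal sketch of one common approach (generate a key on shuynh-gecko1 and copy it to all three hosts with ssh-copy-id; adjust for your environment):

# Generate an SSH key pair on shuynh-gecko1 (accept defaults, no passphrase)
[root@shuynh-gecko1 ~]# ssh-keygen -t rsa
# Copy the public key to every node, including shuynh-gecko1 itself
[root@shuynh-gecko1 ~]# for h in shuynh-gecko1 shuynh-gecko2 shuynh-gecko3; do ssh-copy-id root@$h; done
# Verify passwordless login
[root@shuynh-gecko1 ~]# ssh shuynh-gecko2 hostname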

5. Create Hadoop Cluster

# Start a container

docker run --net=host sequenceiq/hadoop-cluster-docker:1.1.1 $1 $2 $3

Parameters:
$1: Node type, N for NameNode or D for DataNode
$2: Master node IP address, e.g. 10.28.241.174
$3: Run mode, "-d" to run as a service or "-bash" to run interactively


# To run interactively, add the "-i -t" options to docker run.
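
For example, substituting $1=N, $2=10.28.241.174 and $3=-bash gives the interactive NameNode command used in 5.1 below:

docker run -i -t --net=host sequenceiq/hadoop-cluster-docker:1.1.1 N 10.28.241.174 -bash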

5.1 Create NameNode on shuynh-gecko1:

[root@shuynh-gecko1 ~]# docker stop $(docker ps -a -q)
[root@shuynh-gecko1 ~]# docker rm $(docker ps -a -q)
[root@shuynh-gecko1 ~]# docker run -i -t --net="host" sequenceiq/hadoop-cluster-docker:1.1.1 N 10.28.241.174 -bash

bash-4.1# jps                                                                                                                       

119 JobTracker                                                                                                                     
596 TaskTracker
528 DataNode                                                                                                                       
389 NameNode

5.2 Create a DataNode (background service, using -d) on shuynh-gecko2:
[root@shuynh-gecko2 ~]# docker stop $(docker ps -a -q)
[root@shuynh-gecko2 ~]# docker rm $(docker ps -a -q)
[root@shuynh-gecko2 hadoop-docker-master-cluster]# docker run --net="host" sequenceiq/hadoop-cluster-docker:1.1.1 D 10.28.241.174 -d

5.3 Create a DataNode (background service, using -d) on shuynh-gecko3:
[root@shuynh-gecko3 ~]# docker stop $(docker ps -a -q)
[root@shuynh-gecko3 ~]# docker rm $(docker ps -a -q)
[root@shuynh-gecko3 hadoop-docker-master-cluster]# docker run --net="host" sequenceiq/hadoop-cluster-docker:1.1.1 D 10.28.241.174 -d
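
Optionally, confirm that the DataNode container is up on each worker before checking the cluster status (docker ps is a standard check):

[root@shuynh-gecko2 ~]# docker ps
[root@shuynh-gecko3 ~]# docker ps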

6. Check the cluster status

6.1 Access the web UI

Access http://10.28.241.174:50070/dfshealth.jsp and check that the DataNodes are listed as live nodes.


6.2 Use the command line to check the status

bash-4.1# $HADOOP_PREFIX/bin/hadoop dfsadmin -report
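
As an additional check, you can run an HDFS filesystem check from inside the NameNode container (fsck is part of the standard Hadoop 1.x tooling):

bash-4.1# $HADOOP_PREFIX/bin/hadoop fsck /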

7. Run a sample Hadoop job

# Create test data

bash-4.1# $HADOOP_PREFIX/bin/hadoop fs -mkdir /user/root
bash-4.1# $HADOOP_PREFIX/bin/hadoop fs -put $HADOOP_PREFIX/conf/ input
#bash-4.1# $HADOOP_PREFIX/bin/hadoop fs -rmr output

# Run the sample job

bash-4.1# $HADOOP_PREFIX/bin/hadoop jar $HADOOP_PREFIX/hadoop-examples-1.1.1.jar wordcount input output

# Check the output

bash-4.1# $HADOOP_PREFIX/bin/hadoop fs -cat output/*
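
If you want to browse the results file by file, you can list the output directory or copy it to the local filesystem (standard HDFS shell commands; the local path /tmp/wordcount-output is just an example):

bash-4.1# $HADOOP_PREFIX/bin/hadoop fs -ls output
bash-4.1# $HADOOP_PREFIX/bin/hadoop fs -get output /tmp/wordcount-output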


