1. Preparation
First, create the following folder structure:
weim@weim:~/myopt$ ls
ubuntu1 ubuntu2 ubuntu3
Then extract the downloaded JDK (version 8u172) and Hadoop (version 2.9.1) into each of the three folders:
weim@weim:~/myopt$ ls ubuntu1
hadoop jdk
weim@weim:~/myopt$ ls ubuntu2
hadoop jdk
weim@weim:~/myopt$ ls ubuntu3
hadoop jdk
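The layout above can be scripted. A minimal sketch, assuming the downloaded archives are named jdk-8u172-linux-x64.tar.gz and hadoop-2.9.1.tar.gz and extract to jdk1.8.0_172 and hadoop-2.9.1 (these names are assumptions, not taken from the text):

```shell
# Sketch: build the ~/myopt layout for three nodes.
# Archive and extracted-directory names are assumptions.
BASE=${BASE:-$PWD/myopt}
for n in ubuntu1 ubuntu2 ubuntu3; do
  mkdir -p "$BASE/$n"
  if [ -f jdk-8u172-linux-x64.tar.gz ]; then
    tar -xzf jdk-8u172-linux-x64.tar.gz -C "$BASE/$n"
    mv "$BASE/$n/jdk1.8.0_172" "$BASE/$n/jdk"
  fi
  if [ -f hadoop-2.9.1.tar.gz ]; then
    tar -xzf hadoop-2.9.1.tar.gz -C "$BASE/$n"
    mv "$BASE/$n/hadoop-2.9.1" "$BASE/$n/hadoop"
  fi
done
```

Renaming the extracted directories to plain jdk and hadoop keeps the paths used in the rest of this walkthrough version-free.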
2. Preparing the three machines
The three machines are created with Docker, using the ubuntu:16.04 image.
weim@weim:~/myopt$ docker image ls
REPOSITORY TAG IMAGE ID CREATED SIZE
ubuntu 16.04 f975c5035748 2 months ago 112MB
Start three Ubuntu containers, mounting the local directories ~/myopt/ubuntu1, ~/myopt/ubuntu2, and ~/myopt/ubuntu3 into the containers at /home/software.
ubuntu1
weim@weim:~/myopt$ docker run --hostname ubuntu1 --name ubuntu1 -v /home/weim/myopt/ubuntu1:/home/software -it --rm ubuntu:16.04 bash
root@ubuntu1:/# ls /home/software/
hadoop jdk
ubuntu2
weim@weim:~/myopt$ docker run --hostname ubuntu2 --name ubuntu2 -v /home/weim/myopt/ubuntu2:/home/software -it --rm ubuntu:16.04 bash
root@ubuntu2:/# ls /home/software/
hadoop jdk
ubuntu3
weim@weim:~/myopt$ docker run --hostname ubuntu3 --name ubuntu3 -v /home/weim/myopt/ubuntu3:/home/software -it --rm ubuntu:16.04 bash
root@ubuntu3:/# ls /home/software/
hadoop jdk
root@ubuntu3:/#
With that, the three basic machines are ready.
Check the machine information:
weim@weim:~$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
b4c6de2a4326 ubuntu:16.04 "bash" About a minute ago Up About a minute ubuntu2
53d1f6389710 ubuntu:16.04 "bash" About a minute ago Up About a minute ubuntu3
0f210a01d47f ubuntu:16.04 "bash" About a minute ago Up About a minute ubuntu1
weim@weim:~$
weim@weim:~$ docker inspect --format '{{ .NetworkSettings.IPAddress }}' ubuntu1
172.17.0.2
weim@weim:~$ docker inspect --format '{{ .NetworkSettings.IPAddress }}' ubuntu2
172.17.0.4
weim@weim:~$ docker inspect --format '{{ .NetworkSettings.IPAddress }}' ubuntu3
172.17.0.3
----------------------------------------------------------------------------------
These are the IP addresses of the three machines.
All three machines are on the same local network.
----------------------------------------------------------------------------------
3. Installing the necessary software
Install the required software on all three machines. First run apt-get update to refresh the Ubuntu package index,
then install the vim and openssh-server packages.
4. Environment configuration
a. First configure the Java environment by appending the Java paths to /etc/profile:
root@ubuntu1:/home/software/jdk# vim /etc/profile
---------------------------------------------------------------
Add the following at the end of the profile file:
#set jdk environment
export JAVA_HOME=/home/software/jdk
export CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH
---------------------------------------------------------------
root@ubuntu1:/home/software/jdk# source /etc/profile
root@ubuntu1:/home/software/jdk# java -version
java version "1.8.0_172"
Java(TM) SE Runtime Environment (build 1.8.0_172-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.172-b11, mixed mode)
root@ubuntu1:/home/software/jdk#
b. Set up passwordless SSH access
root@ubuntu1:/home/software/jdk# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:hSMrNTp6/1d7L/QZGKdTCPivDJspbY2tcyjke2qjpBI root@ubuntu1
The key's randomart image is:
+---[RSA 2048]----+
| . |
| o . |
| + o o . . |
| o + o . o o |
| + . S . * |
| E . o . . .=.. |
| o ..o . @..o..o|
| . .o. * @.*. o..|
| .. .++Xo+ . o.|
+----[SHA256]-----+
root@ubuntu1:/home/software/jdk# cd ~/.ssh
root@ubuntu1:~/.ssh# ls
id_rsa id_rsa.pub
root@ubuntu1:~/.ssh# cat id_rsa.pub >> authorized_keys
root@ubuntu1:~/.ssh# chmod 600 authorized_keys
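The key generation above is interactive. The same steps can also be run non-interactively; the sketch below uses a temporary directory instead of the real ~/.ssh so it is safe to try anywhere:

```shell
# Non-interactive version of the key setup above, in a scratch directory.
KEYDIR=$(mktemp -d)
ssh-keygen -t rsa -N '' -q -f "$KEYDIR/id_rsa"    # empty passphrase, no prompts
cat "$KEYDIR/id_rsa.pub" >> "$KEYDIR/authorized_keys"
chmod 600 "$KEYDIR/authorized_keys"
```

In the containers you would run the ssh-keygen line with -f /root/.ssh/id_rsa instead.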
Once configured, verify passwordless access to the local machine with ssh localhost. First make sure the ssh service is running; if it is not, start it with /etc/init.d/ssh start.
root@ubuntu1:/home/software# /etc/init.d/ssh start
* Starting OpenBSD Secure Shell server sshd [ OK ]
root@ubuntu1:/home/software# ssh localhost
The authenticity of host 'localhost (127.0.0.1)' can't be established.
ECDSA key fingerprint is SHA256:chW/KhKqnlQZ8qMxDy8wgSzBIEZ08pdVycjfgJFkVSY.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 16.04.4 LTS (GNU/Linux 4.13.0-41-generic x86_64)
* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/advantage
The programs included with the Ubuntu system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.
Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
applicable law.
root@ubuntu1:~# exit
logout
Connection to localhost closed.
root@ubuntu1:/home/software#
Next, copy the authorized_keys file into the ubuntu2 and ubuntu3 containers. (I don't know the root password on ubuntu2, so I can't copy it over with scp directly; the following workaround does the job.)
First, from ~/.ssh, copy authorized_keys into /home/software:
root@ubuntu1:~/.ssh# ls
authorized_keys id_rsa id_rsa.pub known_hosts
root@ubuntu1:~/.ssh# cp authorized_keys /home/software/
root@ubuntu1:~/.ssh# ls /home/software/
authorized_keys hadoop jdk
root@ubuntu1:~/.ssh#
Then, back on the host, the file just copied is visible under ~/myopt/ubuntu1; copy it into the ubuntu2 and ubuntu3 folders:
weim@weim:~/myopt/ubuntu1$ ls
authorized_keys hadoop jdk
weim@weim:~/myopt/ubuntu1$ sudo cp authorized_keys ../ubuntu2/
weim@weim:~/myopt/ubuntu1$ sudo cp authorized_keys ../ubuntu3/
Then, in the ubuntu2 and ubuntu3 containers, copy the file into ~/.ssh:
root@ubuntu2:/home/software# cp authorized_keys ~/.ssh
root@ubuntu2:/home/software# ls ~/.ssh
authorized_keys id_rsa id_rsa.pub
root@ubuntu2:/home/software#
Verify that ubuntu1 can access ubuntu2 and ubuntu3 without a password (using the IPs obtained above):
root@ubuntu1:~/.ssh# ssh [email protected]
root@ubuntu1:~/.ssh# ssh [email protected]
5. Hadoop configuration
We use ubuntu1 as the example; ubuntu2 and ubuntu3 are configured the same way.
First create the directories for Hadoop's data:
root@ubuntu1:/home/software/hadoop# mkdir data
root@ubuntu1:/home/software/hadoop# cd data/
root@ubuntu1:/home/software/hadoop/data# mkdir tmp
root@ubuntu1:/home/software/hadoop/data# mkdir data
root@ubuntu1:/home/software/hadoop/data# mkdir checkpoint
root@ubuntu1:/home/software/hadoop/data# mkdir name
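The four mkdir calls above can be collapsed into one idempotent command (HADOOP_HOME defaults to ./hadoop here so the sketch runs anywhere; in the containers it is /home/software/hadoop):

```shell
# Create all of Hadoop's local data directories in one go.
HADOOP_HOME=${HADOOP_HOME:-$PWD/hadoop}   # /home/software/hadoop in the containers
mkdir -p "$HADOOP_HOME/data/tmp" \
         "$HADOOP_HOME/data/data" \
         "$HADOOP_HOME/data/checkpoint" \
         "$HADOOP_HOME/data/name"
```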
Go to the /home/software/hadoop/etc/hadoop directory.
Edit hadoop-env.sh and set the Java path:
export JAVA_HOME=/home/software/jdk
Configure core-site.xml:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://172.17.0.2:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/software/hadoop/data/tmp</value>
  </property>
  <property>
    <name>fs.trash.interval</name>
    <value>1440</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>65536</value>
  </property>
</configuration>
Configure hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/software/hadoop/data/name</value>
  </property>
  <property>
    <name>dfs.blocksize</name>
    <value>67108864</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/software/hadoop/data/data</value>
  </property>
  <property>
    <name>dfs.namenode.checkpoint.dir</name>
    <value>/home/software/hadoop/data/checkpoint</value>
  </property>
  <property>
    <name>dfs.namenode.handler.count</name>
    <value>10</value>
  </property>
  <property>
    <name>dfs.datanode.handler.count</name>
    <value>10</value>
  </property>
</configuration>
Configure mapred-site.xml:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
Configure yarn-site.xml:
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>172.17.0.2</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
Configure the slaves file:
172.17.0.2
172.17.0.3
172.17.0.4
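Since ubuntu2 and ubuntu3 mirror ubuntu1, the same host-folder trick used for authorized_keys also works for syncing the whole etc/hadoop config directory: edit once under ~/myopt/ubuntu1, then copy across. The sketch below demonstrates the idea with stand-in temporary directories (in place of ~/myopt/ubuntuN) so it runs anywhere:

```shell
# Demo of syncing one node's Hadoop config to the others via the host folders.
# $MYOPT stands in for ~/myopt; directory layout mirrors the walkthrough.
MYOPT=$(mktemp -d)
mkdir -p "$MYOPT/ubuntu1/hadoop/etc/hadoop"
printf '172.17.0.2\n172.17.0.3\n172.17.0.4\n' > "$MYOPT/ubuntu1/hadoop/etc/hadoop/slaves"
for n in ubuntu2 ubuntu3; do
  mkdir -p "$MYOPT/$n/hadoop/etc/hadoop"
  cp "$MYOPT/ubuntu1/hadoop/etc/hadoop/"* "$MYOPT/$n/hadoop/etc/hadoop/"
done
```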
6. Starting the cluster
On ubuntu1, go to /home/software/hadoop/bin and run hdfs namenode -format to initialize HDFS:
root@ubuntu1:/home/software/hadoop/bin# ./hdfs namenode -format
On ubuntu1, go to /home/software/hadoop/sbin and run start-all.sh:
root@ubuntu1:/home/software/hadoop/sbin# ./start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [ubuntu1]
The authenticity of host 'ubuntu1 (172.17.0.2)' can't be established.
ECDSA key fingerprint is SHA256:chW/KhKqnlQZ8qMxDy8wgSzBIEZ08pdVycjfgJFkVSY.
Are you sure you want to continue connecting (yes/no)? yes
ubuntu1: Warning: Permanently added 'ubuntu1,172.17.0.2' (ECDSA) to the list of known hosts.
ubuntu1: starting namenode, logging to /home/software/hadoop/logs/hadoop-root-namenode-ubuntu1.out
172.17.0.2: starting datanode, logging to /home/software/hadoop/logs/hadoop-root-datanode-ubuntu1.out
172.17.0.4: starting datanode, logging to /home/software/hadoop/logs/hadoop-root-datanode-ubuntu2.out
172.17.0.3: starting datanode, logging to /home/software/hadoop/logs/hadoop-root-datanode-ubuntu3.out
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
ECDSA key fingerprint is SHA256:chW/KhKqnlQZ8qMxDy8wgSzBIEZ08pdVycjfgJFkVSY.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /home/software/hadoop/logs/hadoop-root-secondarynamenode-ubuntu1.out
starting yarn daemons
starting resourcemanager, logging to /home/software/hadoop/logs/yarn--resourcemanager-ubuntu1.out
172.17.0.2: starting nodemanager, logging to /home/software/hadoop/logs/yarn-root-nodemanager-ubuntu1.out
172.17.0.3: starting nodemanager, logging to /home/software/hadoop/logs/yarn-root-nodemanager-ubuntu3.out
172.17.0.4: starting nodemanager, logging to /home/software/hadoop/logs/yarn-root-nodemanager-ubuntu2.out
Check what is running:
ubuntu1
root@ubuntu1:/home/software/hadoop/sbin# jps
3827 SecondaryNameNode
3686 DataNode
4007 ResourceManager
4108 NodeManager
4158 Jps
ubuntu2
root@ubuntu2:/home/software/hadoop/sbin# jps
3586 Jps
3477 DataNode
3545 NodeManager
ubuntu3
root@ubuntu3:/home/software/hadoop/sbin# jps
3472 DataNode
3540 NodeManager
3582 Jps
Finally, open http://172.17.0.2:50070 (the HDFS NameNode UI) and http://172.17.0.2:8088 (the YARN ResourceManager UI) to view the cluster's status.