1. Cluster Mode
In cluster mode, the Hadoop Java processes run on multiple physical machines. A cluster has master and slave nodes.
Fully distributed mode is the mode used in production environments, with Hadoop running on a cluster of servers.
2. Network Plan
Server | IP Address | Software | Services | Role |
---|---|---|---|---|
master | 192.168.71.130 | Hadoop | DataNode, NodeManager, NameNode, ResourceManager, JobHistoryServer | Master |
slave1 | 192.168.71.129 | Hadoop | DataNode, NodeManager, SecondaryNameNode | Slave |
slave2 | 192.168.71.132 | Hadoop | DataNode, NodeManager | Slave |
3. Cluster Installation and Configuration
- Modify hostnames
# View the current hostname
[root@master ~]# hostname
# Edit the hosts file (on every node)
[root@master ~]# vi /etc/hosts
192.168.71.130 master
192.168.71.129 slave1
192.168.71.132 slave2
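Actually renaming each machine is not shown above; a minimal sketch, assuming systemd's hostnamectl (available on the Ubuntu 18.04 hosts seen later in the SSH transcript):
# Run on each node with its own name (slave1 and slave2 on the other machines)
[root@master ~]# hostnamectl set-hostname master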
- Passwordless SSH login between cluster nodes
# Generate an SSH key pair on each host; skip this if one already exists.
[root@master ~]# cd ~/.ssh/
# Just press Enter at every prompt
[root@master .ssh]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:fclVE1mO3ZLTKn1/E5uLz9utuqjaaCJqH9kjdMR6bQE root@slave2
The key's randomart image is:
+---[RSA 2048]----+
| E o*|
| . . Oo|
| o . * =|
| o . .. . + + |
| o o oS . = o..|
| . = . . . .=|
| + o +o|
| .. + oo . o *|
|o..o oo.o.. o+o*+|
+----[SHA256]-----+
Steps (using passwordless login from master to the slaves as an example):
First, copy the public key generated on master to every slave (ssh-copy-id is used below; all of the following commands are executed on master).
# Add master's own key to authorized_keys (allows passwordless login to master itself)
[root@master .ssh]# cat id_rsa.pub >> authorized_keys
# Fix the file permissions; without this, authentication will fail
[root@master .ssh]# chmod 600 ./authorized_keys
# Copy the public key to the other hosts
[root@masterhadoop /]# ssh-copy-id slave1
# Test passwordless login
root@master:~/.ssh# ssh slave1
Welcome to Ubuntu 18.04 LTS (GNU/Linux 4.15.0-20-generic x86_64)
* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/advantage
System information as of Tue Nov 26 09:45:23 CST 2019
System load: 0.04 Processes: 164
Usage of /: 2.3% of 77.30GB Users logged in: 2
Memory usage: 12% IP address for ens33: 192.168.71.132
Swap usage: 0%
249 packages can be updated.
136 updates are security updates.
Last login: Tue Nov 26 09:38:05 2019 from 192.168.71.130
root@slave2:~#
# Log out
root@slave2:~# exit
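To push the key to the remaining slaves in one pass, a small sketch using the same ssh-copy-id command as above (each slave's root password is asked for once):
[root@master .ssh]# for h in slave1 slave2; do ssh-copy-id root@$h; done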
- Master configuration
1) core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/hadoop/tmp</value>
  </property>
  <property>
    <name>hadoop.proxyuser.root.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.root.groups</name>
    <value>*</value>
  </property>
</configuration>
2) hdfs-site.xml
<configuration>
  <property>
    <name>dfs.namenode.http-address</name>
    <value>master:50070</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>slave1:50090</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/hadoop/tmp/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/hadoop/tmp/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>
3) mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>master:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>master:19888</value>
  </property>
</configuration>
4) yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>
</configuration>
- Slave configuration
# Configure the slaves file
root@master:~# vi /usr/local/hadoop-2.9.2/etc/hadoop/slaves
# Contents
master
slave1
slave2
- File distribution
# Install the JDK: copy the JDK archive to the other hosts
root@master:~# scp jdk-8u231-linux-x64.tar.gz root@slave1:/root
# Extract the JDK
root@slave1:~# tar -zxvf jdk-8u231-linux-x64.tar.gz -C /usr/local/
# Copy the environment variables
root@master:~# scp /etc/profile root@slave1:/etc/
# Archive the Hadoop installation
root@master:/usr/local# tar -cvf hadoop.tar ./hadoop-2.9.2/
# Transfer the archive
root@master:/usr/local# scp hadoop.tar root@slave1:/root
# Extract the archive on the slave
root@master:/usr/local# ssh slave1
root@slave1:~# tar -xvf hadoop.tar -C /usr/local
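After /etc/profile has been copied over, it has to be reloaded on each node before the java and hadoop commands work in that shell. A quick check, assuming JAVA_HOME and the Hadoop bin directories are exported in the copied /etc/profile:
root@slave1:~# source /etc/profile
root@slave1:~# java -version
root@slave1:~# hadoop version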
4. Startup and Testing
- Format the NameNode (create the /hadoop/tmp directory on every node)
# Remove the old directory contents
root@master:~# rm -rf /hadoop/tmp/
root@master:~# mkdir /hadoop/tmp
root@master:/usr/local/hadoop-2.9.2# hdfs namenode -format
19/11/26 13:16:29 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = master/192.168.71.130
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 2.9.2
Note:
If the NameNode needs to be re-formatted, first delete all files under the original NameNode and DataNode directories, otherwise errors will occur. These directories are configured by hadoop.tmp.dir in core-site.xml and by dfs.namenode.name.dir and dfs.datanode.data.dir in hdfs-site.xml.
Each format creates a cluster ID by default and writes it into the VERSION files of the NameNode and the DataNodes (under the current subdirectory of the configured name and data directories). Re-formatting generates a new cluster ID, so if the old directories are not deleted, the NameNode's VERSION file holds the new cluster ID while the DataNodes still hold the old one, and the mismatch causes errors.
The alternative is to pass the cluster ID parameter when formatting and set it to the old cluster ID. A sketch of both options follows.
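A hedged sketch of the two options (the cluster ID below is a placeholder, not a value taken from this cluster):
# Option 1: wipe the old directories on every node, then format again
root@master:~# for h in master slave1 slave2; do ssh $h "rm -rf /hadoop/tmp && mkdir -p /hadoop/tmp"; done
root@master:~# hdfs namenode -format
# Option 2: keep the DataNode data and re-format with the old cluster ID
# (read the clusterID line from /hadoop/tmp/data/current/VERSION on any DataNode first)
root@master:~# hdfs namenode -format -clusterid CID-xxxxxxxx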
- Start the HDFS daemons
# Start the HDFS daemons
root@master:/usr/local/hadoop-2.9.2# start-dfs.sh
Starting namenodes on [master]
master: starting namenode, logging to /usr/local/hadoop-2.9.2/logs/hadoop-root-namenode-master.out
slave2: starting datanode, logging to /usr/local/hadoop-2.9.2/logs/hadoop-root-datanode-slave2.out
master: starting datanode, logging to /usr/local/hadoop-2.9.2/logs/hadoop-root-datanode-master.out
slave1: starting datanode, logging to /usr/local/hadoop-2.9.2/logs/hadoop-root-datanode-slave1.out
Starting secondary namenodes [slave1]
slave1: starting secondarynamenode, logging to /usr/local/hadoop-2.9.2/logs/hadoop-root-secondarynamenode-slave1.out
# Check the daemons on master
root@master:/usr/local/hadoop-2.9.2# jps
43572 DataNode
44596 Jps
7949 SecondaryNameNode
43261 NameNode
# Check the daemons on the slaves
root@slave1:~# jps
86177 SecondaryNameNode
85987 DataNode
86613 Jps
root@slave2:~# jps
27953 Jps
27450 DataNode
- Start the compute platform (YARN)
root@master:/usr/local/hadoop-2.9.2# ./sbin/start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop-2.9.2/logs/yarn-root-resourcemanager-master.out
slave2: starting nodemanager, logging to /usr/local/hadoop-2.9.2/logs/yarn-root-nodemanager-slave2.out
slave1: starting nodemanager, logging to /usr/local/hadoop-2.9.2/logs/yarn-root-nodemanager-slave1.out
master: starting nodemanager, logging to /usr/local/hadoop-2.9.2/logs/yarn-root-nodemanager-master.out
# Check the daemons on each host
root@master:/usr/local/hadoop-2.9.2# jps
96273 DataNode
97173 ResourceManager
97511 NodeManager
97675 Jps
96013 NameNode
7949 SecondaryNameNode
root@slave1:~# jps
86177 SecondaryNameNode
85987 DataNode
87155 Jps
86935 NodeManager
root@slave2:~# jps
28289 NodeManager
27450 DataNode
28526 Jps
- Start the JobHistoryServer
# Provides web access to the run history of completed jobs
root@master:/usr/local/hadoop-2.9.2# ./sbin/mr-jobhistory-daemon.sh start historyserver
starting historyserver, logging to /usr/local/hadoop-2.9.2/logs/mapred-root-historyserver-master.out
root@master:/usr/local/hadoop-2.9.2# jps
96273 DataNode
97173 ResourceManager
97511 NodeManager
98983 JobHistoryServer
96013 NameNode
99180 Jps
7949 SecondaryNameNode
# Restart the JobHistoryServer
root@master:/usr/local/hadoop-2.9.2# ./sbin/mr-jobhistory-daemon.sh stop historyserver
root@master:/usr/local/hadoop-2.9.2# ./sbin/mr-jobhistory-daemon.sh start historyserver
- Check whether the DataNodes started normally
# Check whether the DataNodes started normally
root@master:/usr/local/hadoop-2.9.2# hdfs dfsadmin -report
Configured Capacity: 248992088064 (231.89 GB)
Present Capacity: 222483304448 (207.20 GB)
DFS Remaining: 222483230720 (207.20 GB)
DFS Used: 73728 (72 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
Pending deletion blocks: 0
-------------------------------------------------
Live datanodes (3):
Name: 192.168.71.129:50010 (slave1)
Hostname: slave1
Decommission Status : Normal
Configured Capacity: 82997362688 (77.30 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 4279054336 (3.99 GB)
DFS Remaining: 74458132480 (69.34 GB)
DFS Used%: 0.00%
DFS Remaining%: 89.71%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Tue Nov 26 13:22:21 CST 2019
Last Block Report: Tue Nov 26 13:18:57 CST 2019
Name: 192.168.71.130:50010 (master)
Hostname: master
Decommission Status : Normal
Configured Capacity: 82997362688 (77.30 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 5163393024 (4.81 GB)
DFS Remaining: 73573793792 (68.52 GB)
DFS Used%: 0.00%
DFS Remaining%: 88.65%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Tue Nov 26 13:22:22 CST 2019
Last Block Report: Tue Nov 26 13:16:58 CST 2019
Name: 192.168.71.132:50010 (slave2)
Hostname: slave2.localdomain
Decommission Status : Normal
Configured Capacity: 82997362688 (77.30 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 4285882368 (3.99 GB)
DFS Remaining: 74451304448 (69.34 GB)
DFS Used%: 0.00%
DFS Remaining%: 89.70%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Tue Nov 26 13:22:22 CST 2019
Last Block Report: Tue Nov 26 13:16:58 CST 2019
After everything is configured, copy the /hadoop folder from Master to each node. A missing daemon on any node indicates an error. In addition, run hdfs dfsadmin -report on the Master node to check whether the DataNodes started normally; if Live datanodes is not 0, the cluster has started successfully.
- Web UI tests
Job (YARN) service:
http://192.168.71.130:8088/
HDFS service:
http://192.168.71.130:50070/
Job history service:
http://192.168.71.130:19888/
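As a quick hedged check from the shell that the three web services respond (assuming curl is installed; it prints each URL with its HTTP status code):
root@master:~# for url in http://192.168.71.130:8088/ http://192.168.71.130:50070/ http://192.168.71.130:19888/; do curl -s -o /dev/null -w "$url %{http_code}\n" "$url"; done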
- File upload
# Create a directory
root@master:/usr/local/hadoop-2.9.2# hadoop fs -mkdir -p /user/hadoop
# List the created directories (note: hadoop fs -ls does not support -la, hence the error below)
root@master:/usr/local/hadoop-2.9.2# hadoop fs -ls / -la
Found 2 items
drwxrwx--- - root supergroup 0 2019-11-26 13:19 /tmp
drwxr-xr-x - root supergroup 0 2019-11-26 13:24 /user
ls: `-la': No such file or directory
# Upload a file
root@master:~# hadoop fs -put hadoop-2.9.2.tar.gz /user/hadoop
# View the file
root@master:~# hadoop fs -ls /data/test
Found 1 items
-rw-r--r-- 3 root supergroup 366447449 2019-11-26 13:52 /data/test/hadoop-2.9.2.tar.gz
5. Log Aggregation
Log aggregation is mainly for consolidating MapReduce job logs; it has nothing to do with HDFS data.
- Enable log aggregation
MapReduce tasks run on many machines, and the logs they produce during execution stay on each of those machines. To make the logs from all machines viewable in one place, they are collected and stored centrally on HDFS; this process is called log aggregation.
Hadoop does not enable log aggregation by default.
- Configure yarn-site.xml
Enable log aggregation in the yarn-site.xml file.
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
<property>
  <name>yarn.log-aggregation.retain-seconds</name>
  <value>106800</value>
</property>
yarn.log-aggregation-enable: whether to enable log aggregation.
yarn.log-aggregation.retain-seconds: how long aggregated logs are kept, in seconds.
Distribute the configuration file to the other nodes:
root@master:~# scp /usr/local/hadoop-2.9.2/etc/hadoop/yarn-site.xml root@slave2:/usr/local/hadoop-2.9.2/etc/hadoop/
- Test log aggregation
Run a demo MapReduce job so that it produces logs:
root@master:~# hadoop jar hadoop-mapreduce-examples-2.9.2.jar wordcount hdfs://192.168.71.130:9000/wordcount.txt hdfs://192.168.71.130:9000/reslut
Looking at the logs directly, there is no obvious difference from when aggregation is disabled; one way to confirm it is sketched below.
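A hedged sketch: once the job finishes, the aggregated logs can be pulled back from the command line. The application ID below is a placeholder; use the ID printed by the wordcount run, which is also visible at http://192.168.71.130:8088/.
# Fetch the aggregated logs for a finished application (replace the placeholder ID with your own)
root@master:~# yarn logs -applicationId application_1574744000000_0001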