Hadoop Operations -- Fully Distributed Mode

I. Cluster Mode

When Hadoop's Java processes run on multiple physical machines, this is called cluster mode; a cluster has master nodes and slave nodes.
Fully distributed mode, with Hadoop running on a cluster of servers, is the mode used in production environments.

II. Network Plan

Server   IP Address       Software   Services                                                             Notes
master   192.168.71.130   Hadoop     DataNode, NodeManager, NameNode, ResourceManager, JobHistoryServer   master
slave1   192.168.71.129   Hadoop     DataNode, NodeManager, SecondaryNameNode                             slave
slave2   192.168.71.132   Hadoop     DataNode, NodeManager                                                slave

Network plan

III. Cluster Installation and Configuration

  1. Set the hostnames
# Check the current hostname
hostname

# Add these entries to /etc/hosts on every node
192.168.71.130 master
192.168.71.129 slave1
192.168.71.132 slave2
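
If the names themselves still need to be changed, here is a minimal sketch, assuming systemd's hostnamectl is available (as on the Ubuntu 18.04 hosts seen in the login banner below):

# run the matching command once on each node
hostnamectl set-hostname master    # on 192.168.71.130
hostnamectl set-hostname slave1    # on 192.168.71.129
hostnamectl set-hostname slave2    # on 192.168.71.132

# then append the same three entries to /etc/hosts on every node
cat >> /etc/hosts <<'EOF'
192.168.71.130 master
192.168.71.129 slave1
192.168.71.132 slave2
EOF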

  2. Passwordless SSH between cluster nodes
# Generate an SSH key pair on every host; if one already exists, there is no need to generate it again.
[root@master ~]# cd ~/.ssh/

# Just press Enter at every prompt
[root@master .ssh]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa): 
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:fclVE1mO3ZLTKn1/E5uLz9utuqjaaCJqH9kjdMR6bQE root@slave2
The key's randomart image is:
+---[RSA 2048]----+
|     E         o*|
|    . .        Oo|
|     o .      * =|
|    o . .. . + + |
|   o o oS . = o..|
|  . = .    . . .=|
|   + o         +o|
| .. + oo   .  o *|
|o..o oo.o.. o+o*+|
+----[SHA256]-----+


Detailed steps (passwordless login from master to the slaves as the example):
First distribute the public key generated on master to all of the slaves (the commands below are executed on master).

# Add the key to the authorized keys
[root@master .ssh]# cat id_rsa.pub >> authorized_keys

# Fix the file permissions; without this change, key authentication will not work
[root@master .ssh]# chmod 600 ./authorized_keys
# Copy the public key to the other hosts
[root@master .ssh]# ssh-copy-id slave1

# Test passwordless login
root@master:~/.ssh# ssh slave2
Welcome to Ubuntu 18.04 LTS (GNU/Linux 4.15.0-20-generic x86_64)

 * Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/advantage
  System information as of Tue Nov 26 09:45:23 CST 2019
  System load:  0.04              Processes:            164
  Usage of /:   2.3% of 77.30GB   Users logged in:      2
  Memory usage: 12%               IP address for ens33: 192.168.71.132
  Swap usage:   0%
 * Overheard at KubeCon: "microk8s.status just blew my mind".
     https://microk8s.io/docs/commands#microk8s.status
249 packages can be updated.
136 updates are security updates.

Last login: Tue Nov 26 09:38:05 2019 from 192.168.71.130
root@slave2:~# 
# Log out
root@slave2:~# exit
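
Repeat ssh-copy-id for every node, including master itself, since master also runs a DataNode. A compact sketch that distributes the key everywhere and verifies key-based login:

# copy the public key to all nodes, then confirm that login no longer asks for a password
for h in master slave1 slave2; do
    ssh-copy-id "root@$h"
done
for h in master slave1 slave2; do
    ssh -o BatchMode=yes "root@$h" hostname    # fails instead of prompting if key auth is broken
done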

  3. Master configuration
    1) core-site.xml

<configuration>
        <property>
                <name>fs.default.name</name>
                <value>hdfs://master:9000</value>
        </property>
        <property>
                <name>hadoop.tmp.dir</name>
                <value>/hadoop/tmp</value>
        </property>
        <property>
                <name>hadoop.proxyuser.root.hosts</name>
                <value>*</value>
        </property>
        <property>
                <name>hadoop.proxyuser.root.groups</name>
                <value>*</value>
        </property>
</configuration>

2) hdfs-site.xml

<configuration>
        <property>
                <name>dfs.namenode.http-address</name>
                <value>master:50070</value>
        </property>
        <property>
                <name>dfs.namenode.secondary.http-address</name>
                <value>slave1:50090</value>
        </property>
        <property>
                <name>dfs.namenode.name.dir</name>
                <value>/hadoop/tmp/name</value>
        </property>
        <property>
                <name>dfs.datanode.data.dir</name>
                <value>/hadoop/tmp/data</value>
        </property>
        <property>
                <name>dfs.replication</name>
                <value>3</value>
        </property>
        <property>
                <name>dfs.permissions</name>
                <value>false</value>
        </property>
</configuration>

3) mapred-site.xml

<configuration>
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
        </property>
        <property>
                <name>mapreduce.jobhistory.address</name>
                <value>master:10020</value>
        </property>
        <property>
                <name>mapreduce.jobhistory.webapp.address</name>
                <value>master:19888</value>
        </property>
</configuration>

4) yarn-site.xml

<configuration>
        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
        </property>
        <property>
                <name>yarn.resourcemanager.hostname</name>
                <value>master</value>
        </property>
</configuration>
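
Before distributing the files it is worth checking that they parse; a convenience sketch, assuming xmllint (from libxml2) is installed:

# verify that each edited config file is well-formed XML
cd /usr/local/hadoop-2.9.2/etc/hadoop
for f in core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml; do
    xmllint --noout "$f" && echo "$f OK"
done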

  4. Slave configuration
# Configure the slaves file
root@master:~# vi /usr/local/hadoop-2.9.2/etc/hadoop/slaves 
# Contents
master
slave1
slave2
                                  
  5. File distribution
# Install the JDK: copy the JDK archive to the other hosts
root@master:~# scp jdk-8u231-linux-x64.tar.gz  root@slave1:/root
# Extract the JDK
root@slave1:~# tar -zxvf jdk-8u231-linux-x64.tar.gz -C /usr/local/

# Copy the environment variables
root@master:~# scp /etc/profile root@slave1:/etc/

# Archive the Hadoop directory
root@master:/usr/local# tar -cvf hadoop.tar ./hadoop-2.9.2/
# Transfer the archive
root@master:/usr/local# scp hadoop.tar root@slave1:/root
# Extract it on the slave
root@master:/usr/local# ssh slave1
root@slave1:~# tar -xvf hadoop.tar -C /usr/local
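
The same steps must be repeated for slave2. A compact sketch that distributes everything to both slaves in one pass (it assumes the JDK archive sits in root's home directory and hadoop.tar was copied there from /usr/local):

# push the JDK, the Hadoop archive, and /etc/profile to every slave, then unpack;
# /etc/profile takes effect on the next login
for h in slave1 slave2; do
    scp ~/jdk-8u231-linux-x64.tar.gz ~/hadoop.tar "root@$h:/root/"
    scp /etc/profile "root@$h:/etc/"
    ssh "root@$h" "tar -zxf /root/jdk-8u231-linux-x64.tar.gz -C /usr/local/"
    ssh "root@$h" "tar -xf /root/hadoop.tar -C /usr/local/"
done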

IV. Startup and Testing

  1. Format the NameNode (create the directory /hadoop/tmp on every node first)
# Remove the old directory and recreate it
root@master:~# rm -rf /hadoop/tmp/
root@master:~# mkdir /hadoop/tmp

root@master:/usr/local/hadoop-2.9.2# hdfs namenode -format
19/11/26 13:16:29 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = master/192.168.71.130
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 2.9.2

Note:
  If the NameNode needs to be reformatted, first delete all files under the NameNode and DataNode directories, otherwise formatting will lead to errors. These directories are set by the hadoop.tmp.dir property in core-site.xml and by the dfs.namenode.name.dir and dfs.datanode.data.dir properties in hdfs-site.xml.
  Each format creates a new cluster ID by default and writes it into the VERSION files of the NameNode and DataNodes (located under dfs/name/current and dfs/data/current). If the old directories are not removed before reformatting, the NameNode's VERSION file ends up with the new cluster ID while the DataNodes keep the old one, and the mismatch causes errors.
  The alternative is to pass the old cluster ID to the format command as a parameter.
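
A minimal sketch of a clean reformat across the whole cluster, assuming the /hadoop/tmp layout configured above:

# wipe the old name/data directories on every node, then reformat from master
for h in master slave1 slave2; do
    ssh "root@$h" "rm -rf /hadoop/tmp && mkdir -p /hadoop/tmp"
done
hdfs namenode -format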

  2. Start the HDFS daemons
# Start HDFS from the master node
root@master:/usr/local/hadoop-2.9.2# start-dfs.sh
Starting namenodes on [master]
master: starting namenode, logging to /usr/local/hadoop-2.9.2/logs/hadoop-root-namenode-master.out
slave2: starting datanode, logging to /usr/local/hadoop-2.9.2/logs/hadoop-root-datanode-slave2.out
master: starting datanode, logging to /usr/local/hadoop-2.9.2/logs/hadoop-root-datanode-master.out
slave1: starting datanode, logging to /usr/local/hadoop-2.9.2/logs/hadoop-root-datanode-slave1.out
Starting secondary namenodes [slave1]
slave1: starting secondarynamenode, logging to /usr/local/hadoop-2.9.2/logs/hadoop-root-secondarynamenode-slave1.out

# Check the daemons on master
root@master:/usr/local/hadoop-2.9.2# jps
43572 DataNode
44596 Jps
7949 SecondaryNameNode
43261 NameNode

# Check the daemons on slave1
root@slave1:~# jps
86177 SecondaryNameNode
85987 DataNode
86613 Jps

root@slave2:~# jps
27953 Jps
27450 DataNode
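
With passwordless SSH in place, the same check can be run for all three nodes from one terminal (a convenience sketch; it assumes jps is on root's non-interactive PATH):

# list the Java processes on every node
for h in master slave1 slave2; do
    echo "== $h =="
    ssh "root@$h" jps
done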

  3. Start YARN (the compute platform)
root@master:/usr/local/hadoop-2.9.2# ./sbin/start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop-2.9.2/logs/yarn-root-resourcemanager-master.out
slave2: starting nodemanager, logging to /usr/local/hadoop-2.9.2/logs/yarn-root-nodemanager-slave2.out
slave1: starting nodemanager, logging to /usr/local/hadoop-2.9.2/logs/yarn-root-nodemanager-slave1.out
master: starting nodemanager, logging to /usr/local/hadoop-2.9.2/logs/yarn-root-nodemanager-master.out

# Check the daemons on each host
root@master:/usr/local/hadoop-2.9.2# jps
96273 DataNode
97173 ResourceManager
97511 NodeManager
97675 Jps
96013 NameNode
7949 SecondaryNameNode

root@slave1:~# jps
86177 SecondaryNameNode
85987 DataNode
87155 Jps
86935 NodeManager

root@slave2:~# jps
28289 NodeManager
27450 DataNode
28526 Jps

  4. Start the JobHistoryServer
# Enables viewing the history of completed jobs through the web UI
root@master:/usr/local/hadoop-2.9.2# ./sbin/mr-jobhistory-daemon.sh start historyserver
starting historyserver, logging to /usr/local/hadoop-2.9.2/logs/mapred-root-historyserver-master.out
root@master:/usr/local/hadoop-2.9.2# jps
96273 DataNode
97173 ResourceManager
97511 NodeManager
98983 JobHistoryServer
96013 NameNode
99180 Jps
7949 SecondaryNameNode

# Restart the JobHistoryServer
root@master:/usr/local/hadoop-2.9.2# ./sbin/mr-jobhistory-daemon.sh stop historyserver
root@master:/usr/local/hadoop-2.9.2# ./sbin/mr-jobhistory-daemon.sh start historyserver

  5. Check whether the DataNodes started correctly
# Generate a cluster report on the master
root@master:/usr/local/hadoop-2.9.2# hdfs dfsadmin -report
Configured Capacity: 248992088064 (231.89 GB)
Present Capacity: 222483304448 (207.20 GB)
DFS Remaining: 222483230720 (207.20 GB)
DFS Used: 73728 (72 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
Pending deletion blocks: 0

-------------------------------------------------
Live datanodes (3):

Name: 192.168.71.129:50010 (slave1)
Hostname: slave1
Decommission Status : Normal
Configured Capacity: 82997362688 (77.30 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 4279054336 (3.99 GB)
DFS Remaining: 74458132480 (69.34 GB)
DFS Used%: 0.00%
DFS Remaining%: 89.71%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Tue Nov 26 13:22:21 CST 2019
Last Block Report: Tue Nov 26 13:18:57 CST 2019


Name: 192.168.71.130:50010 (master)
Hostname: master
Decommission Status : Normal
Configured Capacity: 82997362688 (77.30 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 5163393024 (4.81 GB)
DFS Remaining: 73573793792 (68.52 GB)
DFS Used%: 0.00%
DFS Remaining%: 88.65%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Tue Nov 26 13:22:22 CST 2019
Last Block Report: Tue Nov 26 13:16:58 CST 2019


Name: 192.168.71.132:50010 (slave2)
Hostname: slave2.localdomain
Decommission Status : Normal
Configured Capacity: 82997362688 (77.30 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 4285882368 (3.99 GB)
DFS Remaining: 74451304448 (69.34 GB)
DFS Used%: 0.00%
DFS Remaining%: 89.70%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Tue Nov 26 13:22:22 CST 2019
Last Block Report: Tue Nov 26 13:16:58 CST 2019


After configuration, the /hadoop folder on master was copied to every node. A missing process on any node indicates a problem. In addition, run hdfs dfsadmin -report on the master node to check whether the DataNodes started correctly; if Live datanodes is not 0, the cluster came up successfully.
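
For a scripted health check, the relevant line can be pulled out directly; with this network plan it should report 3 live DataNodes:

# quick one-liner on the master
hdfs dfsadmin -report | grep "Live datanodes"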

  6. Web UI checks
    Jobs (YARN):
    http://192.168.71.130:8088/

    (Screenshot: YARN web UI)

    HDFS:
    http://192.168.71.130:50070/

    (Screenshot: HDFS NameNode web UI)

    Job history:
    http://192.168.71.130:19888/

    (Screenshot: JobHistoryServer web UI)
  7. File upload
# Create a directory
root@master:/usr/local/hadoop-2.9.2# hadoop fs -mkdir -p /user/hadoop
# List the root directory (note: -la is not a valid option for hadoop fs -ls)
root@master:/usr/local/hadoop-2.9.2# hadoop fs -ls / -la
Found 2 items
drwxrwx---   - root supergroup          0 2019-11-26 13:19 /tmp
drwxr-xr-x   - root supergroup          0 2019-11-26 13:24 /user
ls: `-la': No such file or directory

# Upload a file
root@master:~# hadoop fs -put hadoop-2.9.2.tar.gz /user/hadoop

# View an uploaded file (this listing is from a separate upload to /data/test)
root@master:~# hadoop fs -ls /data/test
Found 1 items
-rw-r--r--   3 root supergroup  366447449 2019-11-26 13:52 /data/test/hadoop-2.9.2.tar.gz
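
To confirm the upload round-trips intact, a small sketch that downloads the file back and compares checksums (paths assume the put command above was run from root's home directory):

# download the uploaded archive and compare it with the local copy
hadoop fs -get /user/hadoop/hadoop-2.9.2.tar.gz /tmp/hadoop-check.tar.gz
md5sum ~/hadoop-2.9.2.tar.gz /tmp/hadoop-check.tar.gz    # the two checksums should match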

V. Log Aggregation

Log aggregation is mainly used to consolidate MapReduce job logs; it is unrelated to HDFS itself.


(Screenshots: viewing job logs in the web UI)
  1. Enable log aggregation
      MapReduce jobs run across many machines, and the logs they produce are left on each of those machines. To make the logs from all machines viewable in one place, they are collected and stored centrally on HDFS; this process is called log aggregation.
    Hadoop does not enable log aggregation by default.
  2. Configure yarn-site.xml
    Enable log aggregation in the yarn-site.xml file.

<property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
</property>
<property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>106800</value>
</property>

yarn.log-aggregation-enable: whether the log aggregation feature is enabled.
yarn.log-aggregation.retain-seconds: how long aggregated logs are retained, in seconds.

Distribute the configuration file to the other nodes:

root@master:~# scp /usr/local/hadoop-2.9.2/etc/hadoop/yarn-site.xml  root@slave2:/usr/local/hadoop-2.9.2/etc/hadoop/
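
slave1 needs the file as well, and YARN must be restarted before the setting takes effect. A sketch:

# copy the updated yarn-site.xml to both slaves, then restart YARN from master
for h in slave1 slave2; do
    scp /usr/local/hadoop-2.9.2/etc/hadoop/yarn-site.xml \
        "root@$h:/usr/local/hadoop-2.9.2/etc/hadoop/"
done
stop-yarn.sh && start-yarn.sh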

  3. Test log aggregation
    Run a demo MapReduce job so that it produces some logs:
root@master:~# hadoop jar hadoop-mapreduce-examples-2.9.2.jar wordcount hdfs://192.168.71.130:9000/wordcount.txt hdfs://192.168.71.130:9000/result

Looking at the job logs in the web UI, there is no obvious difference from before log aggregation was enabled.
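
Aggregated logs can also be fetched from the command line once the job finishes (the application ID below is a placeholder; take the real one from the job's console output or the ResourceManager UI):

# list finished applications, then pull the aggregated logs for one of them
yarn application -list -appStates FINISHED
yarn logs -applicationId application_XXXXXXXXXXXXX_0001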


(Screenshot: job logs after log aggregation is enabled)
