Setting Up a Fully Distributed Hadoop Environment

Prerequisite: a Hadoop pseudo-distributed environment has already been set up.

Cluster role assignment:
node01: NameNode
node02: SecondaryNameNode, DataNode01
node03: DataNode02
node04: DataNode03
  1. Configure passwordless SSH login among node01, node02, node03, and node04

Generate an SSH key pair on node02, node03, and node04:
ssh-keygen
Make sure node01, node02, node03, and node04 each hold the public keys of all four hosts; run ssh-copy-id from every host to every host (only the node01 target is shown here):
ssh-copy-id -i ~/.ssh/id_rsa.pub node01
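
For a full mesh, the same copy has to be repeated with node02, node03, and node04 as the target. A minimal sketch (run on each of the four nodes after ssh-keygen; it assumes the same user account exists on every host):
for host in node01 node02 node03 node04; do
  ssh-copy-id -i ~/.ssh/id_rsa.pub "$host"
done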

  2. Install Hadoop and the JDK on node02, node03, and node04

tar -zxvf hadoop-3.1.1.tar.gz -C /opt/hadoop/
rpm -ivh jdk-8u172-linux-x64.rpm
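
If the two installation packages only exist on node01, they can be pushed out and installed from there in one loop. This is a sketch that assumes hadoop-3.1.1.tar.gz and jdk-8u172-linux-x64.rpm sit in the current directory on node01 and that everything runs as root:
for host in node02 node03 node04; do
  scp hadoop-3.1.1.tar.gz jdk-8u172-linux-x64.rpm "$host":/root/
  ssh "$host" "mkdir -p /opt/hadoop && tar -zxvf /root/hadoop-3.1.1.tar.gz -C /opt/hadoop/ && rpm -ivh /root/jdk-8u172-linux-x64.rpm"
done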

  3. Configure the environment variables on node02, node03, and node04

Copy /etc/profile from node01 to node02, node03, and node04:
scp /etc/profile node02:/etc/ && scp /etc/profile node03:/etc/ && scp /etc/profile node04:/etc/
Then run the following on node02, node03, and node04 to load it:
. /etc/profile
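
A quick optional check that the variables took effect (assuming /etc/profile exports JAVA_HOME, HADOOP_HOME, and the matching PATH entries):
for host in node02 node03 node04; do
  echo "== $host =="
  ssh "$host" 'source /etc/profile && java -version && hadoop version'
done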

  4. Configure Hadoop on node01, node02, node03, and node04

On node01, edit /opt/hadoop/hadoop-3.1.1/etc/hadoop/core-site.xml:
vim /opt/hadoop/hadoop-3.1.1/etc/hadoop/core-site.xml
Add:

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://node01:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop/data/tmp/full</value>
  </property>
</configuration>

On node01, edit /opt/hadoop/hadoop-3.1.1/etc/hadoop/hdfs-site.xml:
vim /opt/hadoop/hadoop-3.1.1/etc/hadoop/hdfs-site.xml
Add:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>node02:9868</value>
  </property>
</configuration>

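Optionally, if xmllint is installed, both edited files can be checked for well-formedness before they are copied to the other nodes:
xmllint --noout /opt/hadoop/hadoop-3.1.1/etc/hadoop/core-site.xml
xmllint --noout /opt/hadoop/hadoop-3.1.1/etc/hadoop/hdfs-site.xml
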
On node01, edit /opt/hadoop/hadoop-3.1.1/etc/hadoop/workers:
vim /opt/hadoop/hadoop-3.1.1/etc/hadoop/workers
Add (one worker hostname per line; a scripted alternative is sketched after the listing):

node02
node03
node04
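
Equivalently, the workers file can be written in one shot with a here-document (this overwrites any existing content):
cat > /opt/hadoop/hadoop-3.1.1/etc/hadoop/workers <<'EOF'
node02
node03
node04
EOF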

Copy core-site.xml, hdfs-site.xml, and workers from /opt/hadoop/hadoop-3.1.1/etc/hadoop/ on node01 to node02, node03, and node04:
scp /opt/hadoop/hadoop-3.1.1/etc/hadoop/core-site.xml /opt/hadoop/hadoop-3.1.1/etc/hadoop/hdfs-site.xml /opt/hadoop/hadoop-3.1.1/etc/hadoop/workers node02:/opt/hadoop/hadoop-3.1.1/etc/hadoop/ && scp /opt/hadoop/hadoop-3.1.1/etc/hadoop/core-site.xml /opt/hadoop/hadoop-3.1.1/etc/hadoop/hdfs-site.xml /opt/hadoop/hadoop-3.1.1/etc/hadoop/workers node03:/opt/hadoop/hadoop-3.1.1/etc/hadoop/ && scp /opt/hadoop/hadoop-3.1.1/etc/hadoop/core-site.xml /opt/hadoop/hadoop-3.1.1/etc/hadoop/hdfs-site.xml /opt/hadoop/hadoop-3.1.1/etc/hadoop/workers node04:/opt/hadoop/hadoop-3.1.1/etc/hadoop/
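
The same copy can also be written as a loop; this is equivalent to the chained command above and relies on bash brace expansion:
for host in node02 node03 node04; do
  scp /opt/hadoop/hadoop-3.1.1/etc/hadoop/{core-site.xml,hdfs-site.xml,workers} "$host":/opt/hadoop/hadoop-3.1.1/etc/hadoop/
done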

  5. Format the NameNode

On node01, run:
hdfs namenode -format
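
Formatting only needs to be done once. As a sanity check, the NameNode metadata directory should now exist; with the hadoop.tmp.dir configured above and Hadoop's default name-directory layout, that is:
ls /opt/hadoop/data/tmp/full/dfs/name/current/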

  6. Start Hadoop

On node01, run (start-dfs.sh starts the NameNode locally and launches the SecondaryNameNode and DataNodes on the other nodes over SSH, so it only needs to be run once):
start-dfs.sh
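
The matching shutdown script, also run on node01, is:
stop-dfs.sh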

  7. Check the running processes

On node01, node02, node03, and node04, run:
jps
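
With the role assignment above, jps should report NameNode on node01, SecondaryNameNode plus DataNode on node02, and a DataNode on node03 and node04. A small loop to check all four hosts from node01 (sourcing /etc/profile so jps is on the PATH):
for host in node01 node02 node03 node04; do
  echo "== $host =="
  ssh "$host" 'source /etc/profile && jps'
done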

  8. Access the web UIs

NameNode (node01): http://192.168.163.191:9870
SecondaryNameNode (node02): http://192.168.163.192:9868
DataNode01 (node02): http://192.168.163.192:9864
DataNode02 (node03): http://192.168.163.193:9864
DataNode03 (node04): http://192.168.163.194:9864
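
As a final smoke test, a small file can be written into HDFS from node01 and listed back; with dfs.replication set to 2, each block of the uploaded file is stored on two of the three DataNodes:
hdfs dfs -mkdir -p /test
hdfs dfs -put /etc/profile /test/
hdfs dfs -ls /test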
