Hadoop 3.1.0 Cluster Setup


Download

http://mirrors.hust.edu.cn/apache/hadoop/common/

Documentation

http://hadoop.apache.org/docs/r3.1.0/hadoop-project-dist/hadoop-common/SingleCluster.html

Reference

https://my.oschina.net/orrin/blog/1816023

Configure hosts

vim /etc/hosts
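
Add an entry for every node, on every node. Assuming the two-host layout used in the configs below (the IP addresses here are placeholders):

192.168.1.101  node1.spark
192.168.1.102  node2.spark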

Disable the firewall and SELinux

systemctl stop firewalld && systemctl disable firewalld

setenforce 0  # turn SELinux off for the current session

vim /etc/selinux/config

SELINUX=disabled  # permanent, takes effect after the reboot below

reboot

Configure SSH (master node)

 ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa

 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

 chmod 0600 ~/.ssh/authorized_keys

Copy the public key to the worker nodes

scp /root/.ssh/id_rsa.pub root@<worker-hostname-or-IP>:~

On the worker nodes

mkdir -p ~/.ssh

cd ~/.ssh/

cat ~/id_rsa.pub >> authorized_keys

vim /etc/ssh/sshd_config

# Enable this if you log in as root
PermitRootLogin yes
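
After changing sshd_config, restart the SSH daemon so the setting takes effect:

systemctl restart sshd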

Test SSH

ssh localhost

ssh <worker-hostname-or-IP>

Configure the JDK

  • Download and install the JDK

  • Configure environment variables (append to /etc/profile)

    export JAVA_HOME=/usr/local/jdk1.8.0_181
    
    export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
    
    export JRE_HOME=$JAVA_HOME/jre
    
    PATH=$JAVA_HOME/bin:$PATH
    

Download and install Hadoop

  • Extract the archive

  • Configure environment variables (append to /etc/profile)

    export HADOOP_HOME=/usr/local/hadoop-3.1.0
    
    PATH=$JAVA_HOME/bin:$PATH:$HADOOP_HOME:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
    

Apply the environment variables

    source /etc/profile

Modify the Hadoop configuration files (in $HADOOP_HOME/etc/hadoop)

  • hadoop-env.sh
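
    The key edit is pointing JAVA_HOME at the JDK installed earlier; daemons started over SSH do not inherit it from the login shell, so set it explicitly:

    export JAVA_HOME=/usr/local/jdk1.8.0_181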

  • core-site.xml
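
    A minimal core-site.xml points fs.defaultFS at the NameNode. node1.spark matches the NameNode host used in hdfs-site.xml below; the RPC port 9000 is an assumption (any consistent choice works):

    <configuration>
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://node1.spark:9000</value>
        </property>
    </configuration>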

  • hdfs-site.xml

    
    
    <configuration>
        <property>
            <name>dfs.namenode.http-address</name>
            <value>node1.spark:50070</value>
        </property>

        <property>
            <name>dfs.namenode.secondary.http-address</name>
            <value>node2.spark:50090</value>
        </property>

        <property>
            <name>dfs.namenode.name.dir</name>
            <value>/opt/hadoop/data/name</value>
        </property>

        <property>
            <name>dfs.replication</name>
            <value>2</value>
        </property>

        <property>
            <name>dfs.datanode.data.dir</name>
            <value>/opt/hadoop/data/datanode</value>
        </property>

        <property>
            <name>dfs.permissions</name>
            <value>false</value>
        </property>
    </configuration>

  • mapred-site.xml

    
        
        
    <configuration>
        <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
        </property>

        <property>
            <name>mapreduce.application.classpath</name>
            <value>
                /opt/hadoop/hadoop-3.1.0/etc/hadoop,
                /opt/hadoop/hadoop-3.1.0/share/hadoop/common/*,
                /opt/hadoop/hadoop-3.1.0/share/hadoop/common/lib/*,
                /opt/hadoop/hadoop-3.1.0/share/hadoop/hdfs/*,
                /opt/hadoop/hadoop-3.1.0/share/hadoop/hdfs/lib/*,
                /opt/hadoop/hadoop-3.1.0/share/hadoop/mapreduce/*,
                /opt/hadoop/hadoop-3.1.0/share/hadoop/mapreduce/lib/*,
                /opt/hadoop/hadoop-3.1.0/share/hadoop/yarn/*,
                /opt/hadoop/hadoop-3.1.0/share/hadoop/yarn/lib/*
            </value>
        </property>
    </configuration>
    
  • yarn-site.xml

    
          
    <configuration>
        <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
        </property>

        <property>
            <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
            <value>org.apache.hadoop.mapred.ShuffleHandler</value>
        </property>

        <property>
            <name>yarn.resourcemanager.resource-tracker.address</name>
            <value>node1.spark:8025</value>
        </property>

        <property>
            <name>yarn.resourcemanager.scheduler.address</name>
            <value>node1.spark:8030</value>
        </property>

        <property>
            <name>yarn.resourcemanager.address</name>
            <value>node1.spark:8040</value>
        </property>
    </configuration>

Modify sbin/start-dfs.sh and sbin/stop-dfs.sh

Add the following at the top of both scripts (required when running the HDFS daemons as root):

HDFS_DATANODE_USER=root 
HADOOP_SECURE_DN_USER=hdfs 
HDFS_NAMENODE_USER=root 
HDFS_SECONDARYNAMENODE_USER=root 

Modify sbin/start-yarn.sh and sbin/stop-yarn.sh

Add the following at the top of both scripts:

YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn 
YARN_NODEMANAGER_USER=root

masters

Create a new file named masters; it specifies the Secondary NameNode host.
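
Matching dfs.namenode.secondary.http-address above, the file contains a single line:

node2.spark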

workers

Add the worker (DataNode) hostnames, one per line.
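
Only node1.spark and node2.spark appear in this guide, so assuming node2.spark is the sole worker, the file contains:

node2.spark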

Copy Hadoop to the other nodes

scp -r <master Hadoop install path> <worker-hostname-or-IP>:<destination path>
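
For example, with the install path from above and node2.spark as the worker (the destination must match HADOOP_HOME on the worker):

scp -r /usr/local/hadoop-3.1.0 root@node2.spark:/usr/local/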

Set up the environment variables on each worker the same way as on the master.

Format the NameNode before the first start

hdfs namenode -format

Start

start-all.sh
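
To verify, run jps on each node: the master should list NameNode and ResourceManager, and the worker DataNode, NodeManager, and (given the masters file above) SecondaryNameNode.

jps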

Access the web UIs

http://localhost:8088 (YARN ResourceManager)

http://localhost:50070 (HDFS NameNode, as set in dfs.namenode.http-address)

Stop

stop-all.sh

Errors encountered

  • When copying to the worker nodes, make sure the destination path matches the paths in the configuration files (this guide sets HADOOP_HOME to /usr/local/hadoop-3.1.0 while the classpath entries use /opt/hadoop/hadoop-3.1.0; pick one and use it everywhere).

  • SSH configuration

