Big Data Beginner's Study Notes (7) - Setting Up a Hadoop Distributed Cluster

Environment Overview

The version used is hadoop-2.6.0-cdh5.7.0
First prepare three virtual machines or servers:

hadoop000:192.168.199.102
hadoop001:192.168.199.247
hadoop002:192.168.199.138

  1. Change the hostname on all three machines; the value differs per machine:
    Hostname setting: sudo vi /etc/sysconfig/network
    For example, on hadoop001 (192.168.199.247):
NETWORKING=yes
HOSTNAME=hadoop001
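    Note: on CentOS-6-style systems /etc/sysconfig/network is only read at boot. Assuming that platform, the new hostname can also be applied to the current session without a reboot (hadoop001 shown):
sudo hostname hadoop001   # apply immediately for this session
hostname                  # verify the new hostname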
  2. Update the hostname-to-IP mapping on all three machines; the content is identical on every machine:
    Hostname and IP address mapping: sudo vi /etc/hosts
192.168.199.102 hadoop000
192.168.199.247 hadoop001
192.168.199.138 hadoop002
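A quick sanity check (not in the original notes) is to confirm that every hostname resolves and responds from each of the three machines:
ping -c 1 hadoop000
ping -c 1 hadoop001
ping -c 1 hadoop002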
  3. Role assignment
    Roles for each node:
    hadoop000: NameNode/DataNode ResourceManager/NodeManager
    hadoop001: DataNode NodeManager
    hadoop002: DataNode NodeManager

Preliminary Configuration

  1. Passwordless SSH login
    On every machine run: ssh-keygen -t rsa and press Enter through all prompts
    (Screenshot: ssh-keygen output on hadoop000; the output on hadoop001 and hadoop002 is the same.)
  2. Working from hadoop000, run the following in sequence:
    ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop000
    ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop001
    ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop002
    Verification:
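    A minimal check that the key distribution worked: from hadoop000, each host should answer without prompting for a password.
    ssh hadoop000 hostname   # should print hadoop000 with no password prompt
    ssh hadoop001 hostname   # should print hadoop001
    ssh hadoop002 hostname   # should print hadoop002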
  3. JDK installation
    On the hadoop000 machine, unpack the JDK archive and add JAVA_HOME to the system environment variables
    Unpack: tar -zxvf jdk-7u79-linux-x64.tar.gz -C ~/app
    Add to the environment variables in ~/.bash_profile:
    export JAVA_HOME=/home/hadoop/app/jdk1.7.0_79
    export PATH=$JAVA_HOME/bin:$PATH
    Apply the environment variables: source ~/.bash_profile
    Verify that Java is configured correctly: java -version

Hadoop Environment Configuration and Distribution

Cluster Installation

  1. Hadoop installation
    On the hadoop000 machine, unpack the Hadoop archive and add HADOOP_HOME to the system environment variables
    export HADOOP_HOME=/home/hadoop/app/hadoop-2.6.0-cdh5.7.0
    export PATH=$HADOOP_HOME/bin:$PATH

  2. Modify the Hadoop configuration files (under $HADOOP_HOME/etc/hadoop)

hadoop-env.sh

export JAVA_HOME=/home/hadoop/app/jdk1.7.0_79

core-site.xml

<property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop000:8020</value>
</property>

hdfs-site.xml
Prerequisite: the directories referenced below must already exist (see the mkdir sketch after the properties)

<property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/hadoop/app/tmp/dfs/name</value>
</property>

<property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/hadoop/app/tmp/dfs/data</value>
</property>
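Creating those directories ahead of time on hadoop000 (a small sketch matching the paths above; the tmp directory is carried along when ~/app is distributed later):

mkdir -p /home/hadoop/app/tmp/dfs/name   # NameNode metadata directory
mkdir -p /home/hadoop/app/tmp/dfs/data   # DataNode block storage directory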

yarn-site.xml

<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>

<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop000</value>
</property>

mapred-site.xml

<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
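Note: in many Hadoop tarballs this file ships only as a template, so it may need to be created first (worth checking in your distribution):

cd $HADOOP_HOME/etc/hadoop
cp mapred-site.xml.template mapred-site.xml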

slaves (lists the worker hosts; hadoop000 appears here because it also runs a DataNode and NodeManager)

hadoop000
hadoop001
hadoop002
  3. From hadoop000, distribute the installation directory and environment file to the hadoop001 and hadoop002 nodes
scp -r ~/app hadoop@hadoop001:~/
scp -r ~/app hadoop@hadoop002:~/
scp ~/.bash_profile hadoop@hadoop001:~/
scp ~/.bash_profile hadoop@hadoop002:~/

On hadoop001 and hadoop002, make .bash_profile take effect: source ~/.bash_profile

  4. Startup

Format the filesystem (first run only; never repeat it): execute on hadoop000 only
bin/hdfs namenode -format

Start the cluster (HDFS and YARN): execute on hadoop000 only
sbin/start-all.sh
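In Hadoop 2.x, start-all.sh is a deprecated convenience wrapper that calls the two scripts below, which can also be run separately:

sbin/start-dfs.sh    # NameNode, DataNodes, SecondaryNameNode
sbin/start-yarn.sh   # ResourceManager, NodeManagers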

Verify that startup succeeded by running jps on each machine:
jps
 hadoop000
  DataNode
  SecondaryNameNode
  NameNode
  ResourceManager
  NodeManager
  
 hadoop001
  DataNode
  NodeManager
  
 hadoop002
  DataNode
  NodeManager
  
Browser access:
 http://hadoop000:50070 (HDFS NameNode web UI)
 http://hadoop000:8088 (YARN ResourceManager web UI)
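An optional smoke test (not in the original notes) to confirm HDFS accepts commands, using the hadoop client already on the PATH:

hadoop fs -mkdir -p /tmp/smoke-test   # create a test directory in HDFS
hadoop fs -ls /                       # list the HDFS root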

  5. Stopping the cluster (stop-all.sh stops both the HDFS and YARN processes)
    sbin/stop-all.sh
