Setting Up a Hadoop Cluster on CentOS 6.8

Downloading Hadoop

  • Hadoop official site

  • Selected version

    Download link

Hadoop Installation and Configuration

  • Prerequisites

    1. Three virtual machines (CentOS 6.9)

    2. All configured on the same network segment

  • Server configuration

    1. Configure hosts (identical on all three machines)
    192.168.0.101 node1
    192.168.0.102 node2
    192.168.0.103 node3
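Since the three entries follow a simple pattern, they can be generated rather than typed by hand. A minimal sketch (it assumes the node IPs start at 192.168.0.101 as above; appending to /etc/hosts requires root):

```shell
# Generate the three hosts entries into a local file for review;
# append them to /etc/hosts as root afterwards.
for i in 1 2 3; do
  echo "192.168.0.10${i} node${i}"
done > hosts.snippet
cat hosts.snippet   # review, then: sudo tee -a /etc/hosts < hosts.snippet
```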
    
    2. Configure JDK 1.8 (/etc/profile)
    JAVA_HOME=/usr/java/jdk1.8.0_171/
    PATH=$JAVA_HOME/bin:$PATH
    CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
    export JAVA_HOME
    export PATH
    export CLASSPATH
    
    source /etc/profile
    
    3. Create the hadoop user
    useradd hadoop && echo hadoop | passwd --stdin hadoop
    echo "hadoop ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers
    su - hadoop
    
    4. Install Hadoop 2.7

    Installation

    Extract and install into the /home/hadoop directory
    

    Configure the environment variables

    export HADOOP_HOME=/home/hadoop/hadoop/
    export PATH=$HADOOP_HOME/bin:$PATH
    

    Create the working directories

    mkdir -p /home/hadoop/dfs/{name,data}
    mkdir -p /home/hadoop/tmp
    

    Create the backup directories

    mkdir -p /data/hdfs/{name,data}
    chown -R hadoop:hadoop /data/
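The directory creation in step 4 has to be repeated on every node, so it can be wrapped in a small helper. A sketch; `make_hadoop_dirs` is a hypothetical name, and the guide's paths (/home/hadoop and /data/hdfs) are passed as parameters:

```shell
# make_hadoop_dirs BASE BACKUP: create the directory layout from step 4.
# BASE and BACKUP are parameters; this guide uses /home/hadoop and /data/hdfs.
make_hadoop_dirs() {
  mkdir -p "$1/dfs/name" "$1/dfs/data" "$1/tmp"
  mkdir -p "$2/name" "$2/data"
}

# On each node, as root:
# make_hadoop_dirs /home/hadoop /data/hdfs
# chown -R hadoop:hadoop /data/hdfs
```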
    

Set Up SSH

  • Configure the master node and the other nodes (${username} is the default login username)
ssh-keygen -t rsa
ssh-copy-id ${username}@192.168.0.101
ssh-copy-id ${username}@192.168.0.102
ssh-copy-id ${username}@192.168.0.103
  • Test the SSH login
ssh ${username}@192.168.0.101
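The three ssh-copy-id commands can also be looped over the node list. A dry-run sketch that only prints the commands (the IPs are the ones above; the hadoop user is assumed as the default for ${username}):

```shell
# Dry run: print the key-distribution command for every node;
# remove the `echo` to actually run ssh-copy-id.
username="${username:-hadoop}"   # assumption: keys are set up for the hadoop user
for i in 1 2 3; do
  echo ssh-copy-id "${username}@192.168.0.10${i}"
done
```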

Modify the Hadoop Configuration Files (/home/hadoop/hadoop/etc/hadoop)

  • hadoop-env.sh (set JAVA_HOME)
# The java implementation to use.
#export JAVA_HOME=${JAVA_HOME}
export JAVA_HOME=/usr/java/jdk1.8.0_171/
  • yarn-env.sh (set JAVA_HOME)
# some Java parameters
# export JAVA_HOME=/home/y/libexec/jdk1.6.0/
export JAVA_HOME=/usr/java/jdk1.8.0_171/
  • slaves (list the worker hostnames)
node1
node2
node3
  • core-site.xml

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://node1:9000</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/home/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
</configuration>

  • hdfs-site.xml

<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>node1:9001</value>
    <description>View HDFS status through the web interface.</description>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/hadoop/dfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
    <description>Each block is replicated 2 times.</description>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>

  • mapred-site.xml

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>node1:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>node1:19888</value>
  </property>
</configuration>

  • yarn-site.xml

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>node1:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>node1:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>node1:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>node1:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>node1:8088</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>8192</value>
  </property>
</configuration>

  • Copy Hadoop to the other nodes
scp -r /home/hadoop/hadoop/ 192.168.0.102:/home/hadoop/
scp -r /home/hadoop/hadoop/ 192.168.0.103:/home/hadoop/
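With more worker nodes the scp commands can be looped as well. A dry-run sketch over the two worker IPs above (remove the `echo` to perform the copy for real):

```shell
# Dry run: print the copy command for each worker node;
# remove the `echo` to copy the configured Hadoop tree over SSH.
for ip in 192.168.0.102 192.168.0.103; do
  echo scp -r /home/hadoop/hadoop/ "${ip}:/home/hadoop/"
done
```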

Initialization and Startup

  • Initialize (run only on the master node, node1)
/home/hadoop/hadoop/bin/hdfs namenode -format
yum install tree
tree /home/hadoop/dfs
  • Start Hadoop (as the hadoop user)
/home/hadoop/hadoop/sbin/start-dfs.sh

Check the processes

ps aux | grep --color namenode
ps aux | grep --color datanode
  • Stop Hadoop (as the hadoop user)
/home/hadoop/hadoop/sbin/stop-dfs.sh
  • Start the YARN distributed computing framework
/home/hadoop/hadoop/sbin/start-yarn.sh
ps aux | grep --color resourcemanager
ps aux | grep --color nodemanager
  • Simple start/stop
/home/hadoop/hadoop/sbin/start-all.sh
/home/hadoop/hadoop/sbin/stop-all.sh
  • Check the status of the HDFS distributed file system
/home/hadoop/hadoop/bin/hdfs dfsadmin -report
  • View it through the web UI
192.168.0.101:50070
