Hadoop 3.2.0 Cluster Installation

  • Cluster installation steps
    • Configure IP addresses
    • Configure hostnames
    • Configure hosts (IP-to-hostname mapping)
    • Disable the firewall and SELinux
    • Tune swappiness and disable transparent huge page compaction
    • Configure time synchronization
    • Install the JDK
    • Create the hadoop user
    • Set up passwordless SSH
    • Install Hadoop
    • Verify Hadoop
      • Verify processes
      • Verify the web UIs
      • Verify the status report

Cluster installation steps

This guide installs Hadoop 3.2.0 on CentOS 6.5; the same steps apply, with minor adjustments, to other Linux distributions and other Hadoop versions.
The cluster has one Master node and two Slave nodes, and can be extended as needed.

Configure IP addresses

[root@master ~]# vim /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
HWADDR=00:0C:29:E9:CA:59
TYPE=Ethernet
UUID=ca629425-b5c0-4dab-a66d-68831a690d8e
ONBOOT=yes
NM_CONTROLLED=yes
BOOTPROTO=static
IPADDR=169.254.1.100
NETMASK=255.255.255.0

Set the Master node's IP address to 169.254.1.100; set the two Slave nodes to 169.254.1.101 and 169.254.1.102 respectively.

Apply the new IP address

[root@master ~]# service network restart
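
To confirm the new address is active, check the interface (a quick verification step; ifconfig is still available on CentOS 6):

[root@master ~]# ifconfig eth0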

Parameter reference:
https://www.cnblogs.com/dkblog/archive/2011/12/28/2305004.html
NM_CONTROLLED parameter reference:
https://blog.csdn.net/petrosofts/article/details/80346348

Configure hostnames

[root@master ~]# vim /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=master

Set the Master node's hostname to master and the two Slave nodes' hostnames to slave01 and slave02 respectively.

Reboot for the new hostname to take effect

[root@master ~]# reboot

Configure hosts (IP-to-hostname mapping)

[root@master ~]# vim /etc/hosts
#127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
#::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
169.254.1.100 master
169.254.1.101 slave01
169.254.1.102 slave02
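
Once the same three entries are in /etc/hosts on every node, a quick ping verifies the mapping:

[root@master ~]# ping -c 1 slave01
[root@master ~]# ping -c 1 slave02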

Disable the firewall and SELinux

[root@master ~]# service iptables stop (stop the firewall immediately)
[root@master ~]# chkconfig iptables off (keep it off after reboots)
[root@master ~]# setenforce 0 (put SELinux in permissive mode immediately)
[root@master ~]# vim /etc/selinux/config
Change:
SELINUX=disabled (takes effect after a reboot)
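
To double-check both are off (getenforce prints Permissive after setenforce 0, and Disabled after the reboot):

[root@master ~]# service iptables status
[root@master ~]# getenforce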

Tune swappiness and disable transparent huge page compaction

[root@master ~]# echo 10 > /proc/sys/vm/swappiness
Disable transparent huge page defragmentation:
[root@master ~]# echo never > /sys/kernel/mm/transparent_hugepage/defrag
[root@master ~]# vim /etc/rc.local
Append at the end:
echo never > /sys/kernel/mm/transparent_hugepage/defrag

Cloudera distributions require these settings; for the Apache distribution they are optional.
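
Note that writing to /proc only lasts until the next reboot. If you also want the swappiness value to persist, one option (an extra step beyond the original, not strictly required here) is to record it in /etc/sysctl.conf:

[root@master ~]# echo "vm.swappiness = 10" >> /etc/sysctl.conf
[root@master ~]# sysctl -p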

Configure time synchronization

Configure the Master node as follows:

[root@master ~]# vim /etc/ntp.conf
restrict 0.0.0.0 mask 0.0.0.0 nomodify notrap
server 127.127.1.0
fudge 127.127.1.0 stratum 10
driftfile /var/lib/ntp/drift
broadcastdelay 0.008
keys /etc/ntp/keys
includefile /etc/ntp/crypto/pw
restrict 127.0.0.1
restrict -6 ::1

Configure the two Slave nodes as follows:

[root@slave01 ~]# vim /etc/ntp.conf
restrict 0.0.0.0 mask 0.0.0.0 nomodify notrap
restrict default kod nomodify notrap nopeer noquery
restrict -6 default kod nomodify notrap nopeer noquery
server master prefer
fudge 127.127.1.0 stratum 10
driftfile /var/lib/ntp/drift
broadcastdelay 0.008
keys /etc/ntp/keys
restrict 127.0.0.1
restrict -6 ::1

Run on all nodes:

[root@master ~]# /etc/rc.d/init.d/ntpd start    // start the ntp service
[root@master ~]# chkconfig ntpd on    // enable the ntp service at boot
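
After ntpd has been up for a few minutes, you can confirm from a Slave that it is tracking the Master (a verification sketch; the line for the master server should show a nonzero reach value):

[root@slave01 ~]# ntpq -p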

Install the JDK

Download link: https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html

[root@master ~]# rpm -ivh jdk-8u201-linux-x64.rpm

Check that the JDK installed successfully

[root@master ~]# java -version
java version "1.8.0_201"
Java(TM) SE Runtime Environment (build 1.8.0_201-b09)
Java HotSpot(TM) 64-Bit Server VM (build 25.201-b09, mixed mode)

Configure environment variables

[root@master ~]# vim /etc/profile
Append at the end:
export JAVA_HOME=/usr/java/jdk1.8.0_201-amd64
export JAVA_BIN=/usr/java/jdk1.8.0_201-amd64/bin
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/jre/lib/rt.jar
export PATH=$PATH:$JAVA_HOME/bin

Apply the environment variables

[root@master ~]# source /etc/profile
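
A quick check that the variables are visible in the current shell:

[root@master ~]# echo $JAVA_HOME
[root@master ~]# which java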

Create the hadoop user

[root@master ~]# useradd hadoop (create the user)
[root@master ~]# echo "123" | passwd hadoop --stdin (set the password)

Set up passwordless SSH

[root@master ~]# su - hadoop
[hadoop@master ~]$ ssh-keygen -t rsa
[hadoop@master ~]$ ssh-copy-id -i /home/hadoop/.ssh/id_rsa.pub master
[hadoop@master ~]$ ssh-copy-id -i /home/hadoop/.ssh/id_rsa.pub slave01
[hadoop@master ~]$ ssh-copy-id -i /home/hadoop/.ssh/id_rsa.pub slave02

Run the steps above on all three nodes.
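
To confirm passwordless login works, ssh from each node to every other node; no password prompt should appear. For example, from master:

[hadoop@master ~]$ ssh slave01 hostname
[hadoop@master ~]$ ssh slave02 hostname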

Install Hadoop

Download link: http://mirrors.shu.edu.cn/apache/

Extract the archive

[hadoop@master ~]$ tar -zxvf hadoop-3.2.0.tar.gz

Create the dfs directories

[hadoop@master hadoop-3.2.0]$ mkdir -p dfs/name
[hadoop@master hadoop-3.2.0]$ mkdir -p dfs/data
[hadoop@master hadoop-3.2.0]$ mkdir -p dfs/namesecondary

Change into the Hadoop configuration directory

[hadoop@master ~]$ cd /home/hadoop/hadoop-3.2.0/etc/hadoop

Edit core-site.xml
Default configuration reference: http://hadoop.apache.org/docs/r3.2.0/hadoop-project-dist/hadoop-common/core-default.xml
Add the following inside <configuration>:

[hadoop@master hadoop]$ vi core-site.xml

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://master:9000</value>
  <description>NameNode URI.</description>
</property>

<property>
  <name>io.file.buffer.size</name>
  <value>131072</value>
  <description>Size of read/write buffer used in SequenceFiles.</description>
</property>

Edit hdfs-site.xml
Default configuration reference: http://hadoop.apache.org/docs/r3.2.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
Add the following inside <configuration>:

[hadoop@master hadoop]$ vi hdfs-site.xml

<property>
  <name>dfs.namenode.secondary.http-address</name>
  <value>master:50090</value>
  <description>The secondary namenode http server address and port.</description>
</property>

<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:///home/hadoop/hadoop-3.2.0/dfs/name</value>
  <description>Path on the local filesystem where the NameNode stores the namespace and transactions logs persistently.</description>
</property>

<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:///home/hadoop/hadoop-3.2.0/dfs/data</value>
  <description>Comma separated list of paths on the local filesystem of a DataNode where it should store its blocks.</description>
</property>

<property>
  <name>dfs.namenode.checkpoint.dir</name>
  <value>file:///home/hadoop/hadoop-3.2.0/dfs/namesecondary</value>
  <description>Determines where on the local filesystem the DFS secondary name node should store the temporary images to merge. If this is a comma-delimited list of directories then the image is replicated in all of the directories for redundancy.</description>
</property>

<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
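
Once HADOOP_HOME is on the PATH (set later in this guide), you can ask Hadoop which value it actually resolved for a key, which helps catch XML typos (a verification sketch):

[hadoop@master ~]$ hdfs getconf -confKey fs.defaultFS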

Edit mapred-site.xml
Default configuration reference: http://hadoop.apache.org/docs/r3.2.0/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml
In Hadoop 3.x this file already exists in etc/hadoop, so there is no mapred-site.xml.template to copy (that step was only needed for Hadoop 2.x). Add the following inside <configuration>:

[hadoop@master hadoop]$ vi mapred-site.xml

<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
  <description>The runtime framework for executing MapReduce jobs. Can be one of local, classic or yarn.</description>
</property>

<property>
  <name>mapreduce.jobhistory.address</name>
  <value>master:10020</value>
  <description>MapReduce JobHistory Server IPC host:port</description>
</property>

<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>master:19888</value>
  <description>MapReduce JobHistory Server Web UI host:port</description>
</property>

<property>
  <name>mapreduce.application.classpath</name>
  <value>
    /home/hadoop/hadoop-3.2.0/etc/hadoop,
    /home/hadoop/hadoop-3.2.0/share/hadoop/common/*,
    /home/hadoop/hadoop-3.2.0/share/hadoop/common/lib/*,
    /home/hadoop/hadoop-3.2.0/share/hadoop/hdfs/*,
    /home/hadoop/hadoop-3.2.0/share/hadoop/hdfs/lib/*,
    /home/hadoop/hadoop-3.2.0/share/hadoop/mapreduce/*,
    /home/hadoop/hadoop-3.2.0/share/hadoop/mapreduce/lib/*,
    /home/hadoop/hadoop-3.2.0/share/hadoop/yarn/*,
    /home/hadoop/hadoop-3.2.0/share/hadoop/yarn/lib/*
  </value>
  <description>CLASSPATH for MapReduce applications; the paths must match the actual install location.</description>
</property>


Edit yarn-site.xml
Default configuration reference: http://hadoop.apache.org/docs/r3.2.0/hadoop-yarn/hadoop-yarn-common/yarn-default.xml
Add the following inside <configuration>:

[hadoop@master hadoop]$ vi yarn-site.xml

<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>master</value>
  <description>The hostname of the RM.</description>
</property>

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
  <description>Shuffle service that needs to be set for Map Reduce applications.</description>
</property>

Edit hadoop-env.sh
Append at the end:

[hadoop@master hadoop]$ vi hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.8.0_201-amd64

Edit the workers file

[hadoop@master hadoop]$ vi workers
slave01
slave02

Copy the hadoop-3.2.0 directory to the two Slave nodes

[hadoop@master ~]$ scp -r /home/hadoop/hadoop-3.2.0 hadoop@slave01:/home/hadoop/
[hadoop@master ~]$ scp -r /home/hadoop/hadoop-3.2.0 hadoop@slave02:/home/hadoop/

Add Hadoop to the hadoop user's environment variables

[hadoop@master ~]$ vi .bash_profile
PATH=$PATH:$HOME/bin
export HADOOP_HOME=/home/hadoop/hadoop-3.2.0
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

Apply the environment variables

[hadoop@master ~]$ . .bash_profile
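
Confirm the PATH change by printing the Hadoop version:

[hadoop@master ~]$ hadoop version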

Format the NameNode

[hadoop@master ~]$ hdfs namenode -format (in Hadoop 3, hdfs namenode -format replaces the deprecated hadoop namenode -format)
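
If formatting succeeds, the output contains a "successfully formatted" message and the name directory configured in hdfs-site.xml is populated; you can verify:

[hadoop@master ~]$ ls /home/hadoop/hadoop-3.2.0/dfs/name/current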

Start Hadoop

[hadoop@master ~]$ start-all.sh
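
start-all.sh brings up HDFS and YARN but not the JobHistory Server configured in mapred-site.xml above; in Hadoop 3 it can be started separately (a sketch):

[hadoop@master ~]$ mapred --daemon start historyserver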

Verify Hadoop

Verify processes

On the Master node:

[hadoop@master ~]$ jps
29633 NameNode
30071 Jps
29820 SecondaryNameNode
29965 ResourceManager

On the slave01 node:

[hadoop@slave01 ~]$ jps
28083 NodeManager
27978 DataNode
28158 Jps

On the slave02 node:

[hadoop@slave02 ~]$ jps
28176 Jps
28054 NodeManager
27947 DataNode

Verify the web UIs

HDFS NameNode UI: http://169.254.1.100:9870 (in Hadoop 3.x the NameNode web UI moved from the old port 50070 to 9870)
[Screenshot: HDFS NameNode web UI]
YARN ResourceManager UI: http://169.254.1.100:8088
[Screenshot: YARN ResourceManager web UI]

Verify the status report

[hadoop@master ~]$ hdfs dfsadmin -report
19/02/21 16:03:03 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Configured Capacity: 60932890624 (56.75 GB)
Present Capacity: 48979066880 (45.62 GB)
DFS Remaining: 48979017728 (45.62 GB)
DFS Used: 49152 (48 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks:
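
As a final smoke test (an extra step beyond the original, using the examples jar bundled with the distribution), you can run a small MapReduce job end to end:

[hadoop@master ~]$ hdfs dfs -mkdir -p /user/hadoop/input
[hadoop@master ~]$ hdfs dfs -put $HADOOP_HOME/etc/hadoop/core-site.xml /user/hadoop/input
[hadoop@master ~]$ yarn jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.0.jar wordcount /user/hadoop/input /user/hadoop/output
[hadoop@master ~]$ hdfs dfs -cat /user/hadoop/output/part-r-00000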
