This article covers installing Hadoop 3.2.0 on CentOS 6.5; the steps apply equally to other Hadoop versions on other Linux distributions.
The cluster has one Master node and two Slave nodes, and can be scaled out as needed.
[root@master ~]# vim /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
HWADDR=00:0C:29:E9:CA:59
TYPE=Ethernet
UUID=ca629425-b5c0-4dab-a66d-68831a690d8e
ONBOOT=yes
NM_CONTROLLED=yes
BOOTPROTO=static
IPADDR=169.254.1.100
NETMASK=255.255.255.0
Set the Master node's IP address to 169.254.1.100; set the two Slave nodes to 169.254.1.101 and 169.254.1.102 respectively.
Apply the new IP:
[root@master ~]# service network restart
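To confirm the new address is active (a quick check, assuming the interface is eth0 as above):
[root@master ~]# ifconfig eth0 | grep 'inet addr'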
Parameter reference:
https://www.cnblogs.com/dkblog/archive/2011/12/28/2305004.html
NM_CONTROLLED parameter reference:
https://blog.csdn.net/petrosofts/article/details/80346348
[root@master ~]# vim /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=master
Set the Master node's hostname to master; set the two Slave nodes' hostnames to slave01 and slave02 respectively.
Reboot so the new hostname takes effect:
[root@master ~]# reboot
[root@master ~]# vim /etc/hosts
#127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
#::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
169.254.1.100 master
169.254.1.101 slave01
169.254.1.102 slave02
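With /etc/hosts identical on all three nodes, a quick way to verify name resolution is to ping each hostname once, e.g.:
[root@master ~]# ping -c 1 slave01
[root@master ~]# ping -c 1 slave02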
[root@master ~]# service iptables stop    (stop the firewall for the current session)
[root@master ~]# chkconfig iptables off    (keep it off after reboot)
[root@master ~]# setenforce 0    (disable SELinux temporarily)
[root@master ~]# vim /etc/selinux/config
Change:
SELINUX=disabled    (takes effect after reboot)
Lower the kernel's tendency to swap:
[root@master ~]# echo 10 > /proc/sys/vm/swappiness
Disable transparent huge page (THP) defragmentation:
[root@master ~]# echo never > /sys/kernel/mm/transparent_hugepage/defrag
[root@master ~]# vim /etc/rc.local
Append at the end:
echo never > /sys/kernel/mm/transparent_hugepage/defrag
The Cloudera distribution requires these kernel tweaks; with Apache Hadoop they can be skipped.
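Note that writes to /proc and /sys are lost on reboot; the rc.local line above covers THP, and a sketch of persisting the swappiness setting as well, via /etc/sysctl.conf:
[root@master ~]# echo 'vm.swappiness = 10' >> /etc/sysctl.conf
[root@master ~]# sysctl -p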
Configure NTP on the Master node as follows:
[root@master ~]# vim /etc/ntp.conf
restrict 0.0.0.0 mask 0.0.0.0 nomodify notrap
server 127.127.1.0
fudge 127.127.1.0 stratum 10
driftfile /var/lib/ntp/drift
broadcastdelay 0.008
keys /etc/ntp/keys
includefile /etc/ntp/crypto/pw
restrict 127.0.0.1
restrict -6 ::1
Configure the two Slave nodes as follows:
[root@slave01 ~]# vim /etc/ntp.conf
restrict 0.0.0.0 mask 0.0.0.0 nomodify notrap
restrict default kod nomodify notrap nopeer noquery
restrict -6 default kod nomodify notrap nopeer noquery
server master prefer
fudge 127.127.1.0 stratum 10
driftfile /var/lib/ntp/drift
broadcastdelay 0.008
keys /etc/ntp/keys
restrict 127.0.0.1
restrict -6 ::1
Run on all nodes:
[root@master ~]# /etc/rc.d/init.d/ntpd start //start the NTP service
[root@master ~]# chkconfig ntpd on //enable NTP at boot
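Once ntpd is running everywhere, you can check from a Slave that it has selected master as its time source (synchronization may take a few minutes; the chosen peer is marked with *):
[root@slave01 ~]# ntpq -p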
Download the JDK from: https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
Install the RPM:
[root@master ~]# rpm -ivh jdk-8u201-linux-x64.rpm
Verify the JDK installed successfully:
[root@master ~]# java -version
java version "1.8.0_201"
Java(TM) SE Runtime Environment (build 1.8.0_201-b09)
Java HotSpot(TM) 64-Bit Server VM (build 25.201-b09, mixed mode)
Configure environment variables:
[root@master ~]# vim /etc/profile
Append at the end:
export JAVA_HOME=/usr/java/jdk1.8.0_201-amd64
export JAVA_BIN=/usr/java/jdk1.8.0_201-amd64/bin
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/jre/lib/rt.jar
export PATH=$PATH:$JAVA_HOME/bin
Apply the environment variables:
[root@master ~]# source /etc/profile
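A quick sanity check that the variables took effect:
[root@master ~]# echo $JAVA_HOME
/usr/java/jdk1.8.0_201-amd64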
[root@master ~]# useradd hadoop    (create the hadoop user)
[root@master ~]# echo "123" | passwd hadoop --stdin    (set its password)
[root@master ~]# su - hadoop
[hadoop@master ~]$ ssh-keygen -t rsa
[hadoop@master ~]$ ssh-copy-id -i /home/hadoop/.ssh/id_rsa.pub master
[hadoop@master ~]$ ssh-copy-id -i /home/hadoop/.ssh/id_rsa.pub slave01
[hadoop@master ~]$ ssh-copy-id -i /home/hadoop/.ssh/id_rsa.pub slave02
Run the above on all three nodes.
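To confirm passwordless SSH works, each node should be able to run a remote command with no password prompt, e.g.:
[hadoop@master ~]$ ssh slave01 hostname
slave01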
Download Hadoop 3.2.0 from: http://mirrors.shu.edu.cn/apache/
Unpack the tarball:
[hadoop@master ~]$ tar -zxvf hadoop-3.2.0.tar.gz
Create the dfs directories:
[hadoop@master hadoop-3.2.0]$ mkdir -p dfs/name
[hadoop@master hadoop-3.2.0]$ mkdir -p dfs/data
[hadoop@master hadoop-3.2.0]$ mkdir -p dfs/namesecondary
Enter the Hadoop configuration directory:
[hadoop@master ~]$ cd /home/hadoop/hadoop-3.2.0/etc/hadoop
Edit core-site.xml.
Default configuration reference: http://hadoop.apache.org/docs/r3.2.0/hadoop-project-dist/hadoop-common/core-default.xml
Add the following inside <configuration>:
[hadoop@master hadoop]$ vi core-site.xml
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://master:9000</value>
  <description>NameNode URI.</description>
</property>
<property>
  <name>io.file.buffer.size</name>
  <value>131072</value>
  <description>Size of read/write buffer used in SequenceFiles.</description>
</property>
Edit hdfs-site.xml.
Default configuration reference: http://hadoop.apache.org/docs/r3.2.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
Add the following inside <configuration>:
[hadoop@master hadoop]$ vi hdfs-site.xml
<property>
  <name>dfs.namenode.secondary.http-address</name>
  <value>master:50090</value>
  <description>The secondary namenode http server address and port.</description>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:///home/hadoop/hadoop-3.2.0/dfs/name</value>
  <description>Path on the local filesystem where the NameNode stores the namespace and transactions logs persistently.</description>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:///home/hadoop/hadoop-3.2.0/dfs/data</value>
  <description>Comma-separated list of paths on the local filesystem of a DataNode where it should store its blocks.</description>
</property>
<property>
  <name>dfs.namenode.checkpoint.dir</name>
  <value>file:///home/hadoop/hadoop-3.2.0/dfs/namesecondary</value>
  <description>Determines where on the local filesystem the DFS secondary name node should store the temporary images to merge. If this is a comma-delimited list of directories then the image is replicated in all of the directories for redundancy.</description>
</property>
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
Edit mapred-site.xml.
Default configuration reference: http://hadoop.apache.org/docs/r3.2.0/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml
(In Hadoop 3.x, mapred-site.xml already exists under etc/hadoop; copying it from mapred-site.xml.template is only needed on 2.x releases.)
Add the following inside <configuration>:
[hadoop@master hadoop]$ vi mapred-site.xml
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
  <description>The runtime framework for executing MapReduce jobs. Can be one of local, classic or yarn.</description>
</property>
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>master:10020</value>
  <description>MapReduce JobHistory Server IPC host:port</description>
</property>
<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>master:19888</value>
  <description>MapReduce JobHistory Server Web UI host:port</description>
</property>
<property>
  <name>mapreduce.application.classpath</name>
  <value>
    /home/hadoop/hadoop-3.2.0/etc/hadoop,
    /home/hadoop/hadoop-3.2.0/share/hadoop/common/*,
    /home/hadoop/hadoop-3.2.0/share/hadoop/common/lib/*,
    /home/hadoop/hadoop-3.2.0/share/hadoop/hdfs/*,
    /home/hadoop/hadoop-3.2.0/share/hadoop/hdfs/lib/*,
    /home/hadoop/hadoop-3.2.0/share/hadoop/mapreduce/*,
    /home/hadoop/hadoop-3.2.0/share/hadoop/mapreduce/lib/*,
    /home/hadoop/hadoop-3.2.0/share/hadoop/yarn/*,
    /home/hadoop/hadoop-3.2.0/share/hadoop/yarn/lib/*
  </value>
</property>
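Rather than maintaining this list by hand, one option (an alternative, not required) is to generate it from the installed distribution and paste the output in as the value:
[hadoop@master hadoop]$ hadoop classpath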
Edit yarn-site.xml.
Default configuration reference: http://hadoop.apache.org/docs/r3.2.0/hadoop-yarn/hadoop-yarn-common/yarn-default.xml
Add the following inside <configuration>:
[hadoop@master hadoop]$ vi yarn-site.xml
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>master</value>
  <description>The hostname of the RM.</description>
</property>
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
  <description>Shuffle service that needs to be set for Map Reduce applications.</description>
</property>
Edit hadoop-env.sh and append at the end:
[hadoop@master hadoop]$ vi hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.8.0_201-amd64
Edit the workers file:
[hadoop@master hadoop]$ vi workers
slave01
slave02
Copy the hadoop-3.2.0 directory to the two Slave nodes:
[hadoop@master ~]$ scp -r /home/hadoop/hadoop-3.2.0 hadoop@slave01:/home/hadoop/
[hadoop@master ~]$ scp -r /home/hadoop/hadoop-3.2.0 hadoop@slave02:/home/hadoop/
Add Hadoop to the hadoop user's environment variables:
[hadoop@master ~]$ vi .bash_profile
PATH=$PATH:$HOME/bin
export HADOOP_HOME=/home/hadoop/hadoop-3.2.0
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Apply the environment variables:
[hadoop@master ~]$ . .bash_profile
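At this point the hadoop command should resolve from any directory; verify with:
[hadoop@master ~]$ hadoop version
Hadoop 3.2.0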
Format the NameNode (in Hadoop 3.x, hdfs namenode -format supersedes the deprecated hadoop namenode -format):
[hadoop@master ~]$ hdfs namenode -format
Start Hadoop:
[hadoop@master ~]$ start-all.sh
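start-all.sh is a convenience wrapper; the equivalent, and often easier to debug, is to bring HDFS and YARN up separately:
[hadoop@master ~]$ start-dfs.sh
[hadoop@master ~]$ start-yarn.sh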
Verify on the Master node:
[hadoop@master ~]$ jps
29633 NameNode
30071 Jps
29820 SecondaryNameNode
29965 ResourceManager
Verify on slave01:
[hadoop@slave01 ~]$ jps
28083 NodeManager
27978 DataNode
28158 Jps
Verify on slave02:
[hadoop@slave02 ~]$ jps
28176 Jps
28054 NodeManager
27947 DataNode
NameNode web UI (the default port moved from 50070 to 9870 in Hadoop 3.x): http://169.254.1.100:9870
ResourceManager web UI: http://169.254.1.100:8088
Check the HDFS cluster report:
[hadoop@master ~]$ hdfs dfsadmin -report
19/02/21 16:03:03 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Configured Capacity: 60932890624 (56.75 GB)
Present Capacity: 48979066880 (45.62 GB)
DFS Remaining: 48979017728 (45.62 GB)
DFS Used: 49152 (48 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks:
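As a final smoke test, you can submit the bundled example job; on a healthy cluster it runs a small MapReduce estimate of pi on YARN:
[hadoop@master ~]$ hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.0.jar pi 2 10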