Reference: https://blog.csdn.net/chirs_chen/article/details/84978941
Hypervisor: Oracle VM VirtualBox
Guest OS: CentOS-7-x86_64-Minimal-1810.iso
After the OS is installed, enable the NAT and Host-Only adapters by setting ONBOOT=yes in their configuration files:
NAT adapter: /etc/sysconfig/network-scripts/ifcfg-enp0s3
Host-Only adapter: /etc/sysconfig/network-scripts/ifcfg-enp0s8
Then restart the network service:
systemctl restart network
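Roughly, the two files might look like this on apache-hadoop-5 (a minimal sketch; the static address 192.168.56.5 comes from the host table further down, and keys such as BOOTPROTO and NETMASK follow the usual CentOS ifcfg conventions):
# /etc/sysconfig/network-scripts/ifcfg-enp0s3 (NAT, DHCP)
TYPE=Ethernet
BOOTPROTO=dhcp
NAME=enp0s3
DEVICE=enp0s3
ONBOOT=yes
# /etc/sysconfig/network-scripts/ifcfg-enp0s8 (Host-Only, static)
TYPE=Ethernet
BOOTPROTO=static
NAME=enp0s8
DEVICE=enp0s8
ONBOOT=yes
IPADDR=192.168.56.5
NETMASK=255.255.255.0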
Component \ Host | apache-hadoop-5 | apache-hadoop-6 | apache-hadoop-7 |
---|---|---|---|
HDFS | NameNode, SecondaryNameNode | DataNode | DataNode |
YARN | ResourceManager | NodeManager | NodeManager |
JDK | 1.8.0_192 | 1.8.0_192 | 1.8.0_192 |
Apache Hadoop download: http://hadoop.apache.org/releases.html
JDK download:
https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
Create a node list file (it does not include the local host; the example below is for apache-hadoop-5):
vi nodes
apache-hadoop-6
apache-hadoop-7
Batch script for passwordless SSH login to the local and remote hosts (requires the nodes file in the same directory).
Install the expect tool:
yum -y install expect
Create the passwordless-login script:
vi freepw.sh
#!/bin/bash
PASSWORD=<root password of the servers>
# Use expect to answer the host-key and password prompts of ssh-copy-id
auto_ssh_copy_id() {
    expect -c "set timeout -1;
        spawn ssh-copy-id $1;
        expect {
            *(yes/no)* {send -- yes\r;exp_continue;}
            *assword:* {send -- $2\r;exp_continue;}
            eof {exit 0;}
        }";
}
# Copy the public key to the local host first, then to every host in nodes
auto_ssh_copy_id localhost $PASSWORD
cat nodes | while read host
do
{
    auto_ssh_copy_id $host $PASSWORD
}&wait
done
Batch script to copy files to the remote hosts with scp (requires the nodes file in the same directory).
vi scp.sh
#!/bin/bash
# Usage: sh scp.sh <local path> <remote destination directory>
# Copies the given file or directory to every host listed in nodes
cat nodes | while read host
do
{
    scp -r $1 $host:$2
}&wait
done
Batch script to run a shell command locally and on the remote hosts (requires the nodes file in the same directory).
vi run.sh
#!/bin/bash
# Usage: sh run.sh "<command>"
# Run the command locally first, then on every host listed in nodes.
# ssh is run in the background (stdin detached) and waited on, so it does not swallow the nodes list.
$1
cat nodes | while read host
do
{
    ssh $host $1
}&wait
done
Unless otherwise noted, the following steps are executed on apache-hadoop-5.
Set the hostname (on all hosts) according to the table below; see the example after the table:
hostnamectl set-hostname <hostname>
IP address | Hostname |
---|---|
192.168.56.5 | apache-hadoop-5 |
192.168.56.6 | apache-hadoop-6 |
192.168.56.7 | apache-hadoop-7 |
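For example, matching the table above, each host runs its own command:
hostnamectl set-hostname apache-hadoop-5   # on 192.168.56.5
hostnamectl set-hostname apache-hadoop-6   # on 192.168.56.6
hostnamectl set-hostname apache-hadoop-7   # on 192.168.56.7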
Set up hostname mapping
Add the following entries to /etc/hosts:
192.168.56.5 apache-hadoop-5
192.168.56.6 apache-hadoop-6
192.168.56.7 apache-hadoop-7
Sync /etc/hosts to the other hosts:
sh scp.sh /etc/hosts /etc
Set up passwordless SSH login (on all hosts)
Generate an RSA key pair:
ssh-keygen -t rsa -P ""
Copy the public key to the local host and the other hosts:
sh freepw.sh
Disable the firewall and SELinux
Disable the firewall:
sh run.sh "systemctl stop firewalld.service"
sh run.sh "systemctl disable firewalld.service"
Disable SELinux:
vi /etc/sysconfig/selinux
Change SELINUX=enforcing to SELINUX=disabled, then sync the file to the other hosts:
sh scp.sh /etc/sysconfig/selinux /etc/sysconfig
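The SELINUX=disabled setting only takes effect after a reboot. As an optional extra, you can switch every host to permissive mode right away for the current session (a small sketch reusing run.sh):
sh run.sh "setenforce 0"
sh run.sh "getenforce"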
Clock synchronization
Install NTP:
sh run.sh "yum -y install ntp"
Add NTP servers:
vi /etc/ntp.conf
Add the following lines, then sync the file to the other hosts:
server 0.cn.pool.ntp.org
server 1.cn.pool.ntp.org
server 2.cn.pool.ntp.org
server 3.cn.pool.ntp.org
sh scp.sh /etc/ntp.conf /etc
Enable and start the ntpd service:
sh run.sh "systemctl enable ntpd"
sh run.sh "systemctl start ntpd"
Manually sync the time from a network time server:
sh run.sh "ntpdate -u 0.cn.pool.ntp.org"
Write the system time to the hardware clock:
sh run.sh "hwclock --systohc"
Copy the JDK to the other hosts:
sh scp.sh ~/jdk-8u192-linux-x64.tar.gz ~/
Extract (on all hosts):
sh run.sh "tar -zxf ~/jdk-8u192-linux-x64.tar.gz -C /opt"
Configure environment variables:
vi /etc/profile
Add the following, then sync /etc/profile to the other hosts:
#set jdk environment
export JAVA_HOME=/opt/jdk1.8.0_192
export CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib:$CLASSPATH
export PATH=$PATH:$JAVA_HOME/bin:$JAVA_HOME/jre/bin
sh scp.sh /etc/profile /etc
Run on all hosts:
source /etc/profile
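A quick check that the JDK is now on the PATH of every host (the version string should report 1.8.0_192):
sh run.sh "java -version"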
Configure the environment variables up front, install and configure Hadoop on one host first, and then copy it to the other hosts.
Extract:
tar -zxvf ~/hadoop-3.1.1.tar.gz -C /opt
Configure environment variables:
vi /etc/profile
Add the following, then sync /etc/profile to the other hosts:
#set hadoop environment
export HADOOP_HOME=/opt/hadoop-3.1.1
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
sh scp.sh /etc/profile /etc
Run on all hosts:
source /etc/profile
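To verify the setup on apache-hadoop-5 (the other hosts only receive the Hadoop directory later), you can run:
hadoop version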
Configuration file | Purpose |
---|---|
$HADOOP_HOME/etc/hadoop/hadoop-env.sh | Hadoop environment variables |
$HADOOP_HOME/etc/hadoop/yarn-env.sh | YARN environment variables |
$HADOOP_HOME/etc/hadoop/workers | DataNode nodes |
$HADOOP_HOME/etc/hadoop/core-site.xml | Core parameters |
$HADOOP_HOME/etc/hadoop/hdfs-site.xml | HDFS parameters |
$HADOOP_HOME/etc/hadoop/mapred-site.xml | MapReduce parameters |
$HADOOP_HOME/etc/hadoop/yarn-site.xml | YARN parameters |
$HADOOP_HOME/etc/hadoop/hadoop-env.sh
Set the JDK and Hadoop root directories:
export JAVA_HOME=/opt/jdk1.8.0_192
export HADOOP_HOME=/opt/hadoop-3.1.1
$HADOOP_HOME/etc/hadoop/yarn-env.sh
Set the JDK root directory:
export JAVA_HOME=/opt/jdk1.8.0_192
$HADOOP_HOME/etc/hadoop/workers
List the DataNode hostnames:
apache-hadoop-6
apache-hadoop-7
$HADOOP_HOME/etc/hadoop/core-site.xml
Create the Hadoop temp directory:
mkdir -p $HADOOP_HOME/tmp
Configuration:
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://apache-hadoop-5:8020</value>
</property>
<property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop-3.1.1/tmp</value>
</property>
$HADOOP_HOME/etc/hadoop/hdfs-site.xml
Create the NameNode and DataNode directories:
mkdir -p $HADOOP_HOME/hdfs/name $HADOOP_HOME/hdfs/data
Configuration (note that Hadoop 3.x moved the default NameNode web UI port to 9870 and the SecondaryNameNode port to 9868; the 2.x-style ports are set explicitly here):
<property>
    <name>dfs.namenode.http-address</name>
    <value>apache-hadoop-5:50070</value>
</property>
<property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>apache-hadoop-5:50090</value>
</property>
<property>
    <name>dfs.namenode.name.dir</name>
    <value>/opt/hadoop-3.1.1/hdfs/name</value>
</property>
<property>
    <name>dfs.datanode.data.dir</name>
    <value>/opt/hadoop-3.1.1/hdfs/data</value>
</property>
<property>
    <name>dfs.replication</name>
    <value>2</value>
</property>
$HADOOP_HOME/etc/hadoop/mapred-site.xml
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
<property>
    <name>mapreduce.jobhistory.address</name>
    <value>apache-hadoop-5:10020</value>
</property>
<property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>apache-hadoop-5:19888</value>
</property>
$HADOOP_HOME/etc/hadoop/yarn-site.xml
<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>apache-hadoop-5</value>
</property>
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
    <name>yarn.resourcemanager.address</name>
    <value>${yarn.resourcemanager.hostname}:8032</value>
</property>
<property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>${yarn.resourcemanager.hostname}:8030</value>
</property>
<property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>${yarn.resourcemanager.hostname}:8035</value>
</property>
<property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>${yarn.resourcemanager.hostname}:8033</value>
</property>
<property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>${yarn.resourcemanager.hostname}:8088</value>
</property>
If the user definitions below are not added, starting Hadoop fails with the following errors:
Starting namenodes on [apache-hadoop-5]
ERROR: Attempting to operate on hdfs namenode as root
ERROR: but there is no HDFS_NAMENODE_USER defined. Aborting operation.
Starting datanodes
ERROR: Attempting to operate on hdfs datanode as root
ERROR: but there is no HDFS_DATANODE_USER defined. Aborting operation.
Starting secondary namenodes [apache-hadoop-5]
ERROR: Attempting to operate on hdfs secondarynamenode as root
ERROR: but there is no HDFS_SECONDARYNAMENODE_USER defined. Aborting operation.
Starting resourcemanager
ERROR: Attempting to operate on yarn resourcemanager as root
ERROR: but there is no YARN_RESOURCEMANAGER_USER defined. Aborting operation.
Starting nodemanagers
ERROR: Attempting to operate on yarn nodemanager as root
ERROR: but there is no YARN_NODEMANAGER_USER defined. Aborting operation.
Edit and save the $HADOOP_HOME/sbin/start-dfs.sh and $HADOOP_HOME/sbin/stop-dfs.sh scripts, adding the following to the top of each file:
#!/usr/bin/env bash
# Add user defined
HDFS_DATANODE_USER=root
HADOOP_SECURE_DN_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
Edit and save the $HADOOP_HOME/sbin/start-yarn.sh and $HADOOP_HOME/sbin/stop-yarn.sh scripts, adding the following to the top of each file:
#!/usr/bin/env bash
# Add user defined
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root
Copy the configured Hadoop directory to the other hosts:
sh scp.sh /opt/hadoop-3.1.1 /opt/hadoop-3.1.1
Format HDFS:
hdfs namenode -format
Start Hadoop:
start-all.sh
Check the running processes
You can verify that startup succeeded by checking the Java processes on each node:
jps
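To check all three hosts in one go you can reuse run.sh. Per the role table at the top, apache-hadoop-5 should list NameNode, SecondaryNameNode, and ResourceManager, while apache-hadoop-6 and apache-hadoop-7 should list DataNode and NodeManager:
sh run.sh "jps"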
Stop Hadoop:
stop-all.sh
List an HDFS directory: hadoop fs -ls /
Create an HDFS directory: hadoop fs -mkdir -p /hdfs
Upload a file to HDFS: hadoop fs -put freepw.sh /hdfs
Rename an HDFS directory: hadoop fs -mv /hdfs /dfs
View an HDFS file: hadoop fs -cat /dfs/freepw.sh
Delete an HDFS directory: hadoop fs -rm -r /dfs
Create an HDFS directory: hadoop fs -mkdir -p /hdfs
Upload a file to HDFS: hadoop fs -put freepw.sh /hdfs
Run a wordcount job on freepw.sh:
yarn jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.1.jar wordcount /hdfs/freepw.sh /hdfs/temp-wordcount-out
Two errors came up during the run (see Problem 1 and Problem 2 below); the result after resolving them:
hadoop fs -cat /hdfs/temp-wordcount-out/part-r-00000
Result:
"set 1
#!/bin/bash 1
$1; 1
$2\r;exp_continue;} 1
$PASSWORD 2
$host 1
*(yes/no)* 1
*assword:* 1
-- 2
-1; 1
-c 1
0;} 1
PASSWORD=chenpanyu 1
auto_ssh_copy_id 2
auto_ssh_copy_id() 1
cat 1
do 1
done 1
eof 1
expect 2
host 1
localhost 1
nodes 1
read 1
spawn 1
ssh-copy-id 1
timeout 1
while 1
yes\r;exp_continue;} 1
{ 3
{exit 1
{send 2
| 1
} 1
}"; 1
}&wait 1
Problem 1
Error message:
[2018-12-28 15:45:42.628]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster
Please check whether your etc/hadoop/mapred-site.xml contains the below configuration:
<property>
    <name>yarn.app.mapreduce.am.env</name>
    <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
<property>
    <name>mapreduce.map.env</name>
    <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
<property>
    <name>mapreduce.reduce.env</name>
    <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
Analysis:
Since Hadoop 3.0.0 the services no longer inherit each other's environment variables; they must be set explicitly in the configuration.
Solution:
Add the following to $HADOOP_HOME/etc/hadoop/mapred-site.xml and sync it to all hosts:
<property>
    <name>hadoop.mapreduce.home</name>
    <value>/opt/hadoop-3.1.1</value>
</property>
<property>
    <name>yarn.app.mapreduce.am.env</name>
    <value>HADOOP_MAPRED_HOME=${hadoop.mapreduce.home}</value>
</property>
<property>
    <name>mapreduce.map.env</name>
    <value>HADOOP_MAPRED_HOME=${hadoop.mapreduce.home}</value>
</property>
<property>
    <name>mapreduce.reduce.env</name>
    <value>HADOOP_MAPRED_HOME=${hadoop.mapreduce.home}</value>
</property>
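For example, the sync can reuse the scp.sh helper from above (a sketch, assuming mapred-site.xml is the only file that changed):
sh scp.sh $HADOOP_HOME/etc/hadoop/mapred-site.xml $HADOOP_HOME/etc/hadoop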
Problem 2
Error message:
[2018-12-28 16:48:24.606]Container [pid=15778,containerID=container_1545983029903_0003_01_000002] is running 463743488B beyond the 'VIRTUAL' memory limit. Current usage: 85.1 MB of 1 GB physical memory used; 2.5 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1545983029903_0003_01_000002 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 15787 15778 15778 15778 (java) 815 22 2602704896 21470 /opt/jdk1.8.0_192/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx820m -Djava.io.tmpdir=/opt/hadoop-3.1.1/tmp/nm-local-dir/usercache/root/appcache/application_1545983029903_0003/container_1545983029903_0003_01_000002/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/opt/hadoop-3.1.1/logs/userlogs/application_1545983029903_0003/container_1545983029903_0003_01_000002 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 192.168.56.6 40118 attempt_1545983029903_0003_m_000000_0 2
|- 15778 15775 15778 15778 (bash) 1 0 115896320 306 /bin/bash -c /opt/jdk1.8.0_192/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx820m -Djava.io.tmpdir=/opt/hadoop-3.1.1/tmp/nm-local-dir/usercache/root/appcache/application_1545983029903_0003/container_1545983029903_0003_01_000002/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/opt/hadoop-3.1.1/logs/userlogs/application_1545983029903_0003/container_1545983029903_0003_01_000002 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 192.168.56.6 40118 attempt_1545983029903_0003_m_000000_0 2 1>/opt/hadoop-3.1.1/logs/userlogs/application_1545983029903_0003/container_1545983029903_0003_01_000002/stdout 2>/opt/hadoop-3.1.1/logs/userlogs/application_1545983029903_0003/container_1545983029903_0003_01_000002/stderr
[2018-12-28 16:48:24.871]Container killed on request. Exit code is 143
[2018-12-28 16:48:24.900]Container exited with a non-zero exit code 143.
Solution:
Add the following to $HADOOP_HOME/etc/hadoop/yarn-site.xml and sync it to all hosts:
<property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
    <description>Whether virtual memory limits will be enforced for containers</description>
</property>
<property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>4</value>
    <description>Ratio between virtual memory to physical memory when setting memory limits for containers</description>
</property>
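For example, again reusing the helper scripts and then restarting the cluster so the NodeManagers pick up the new settings (a sketch; restarting only YARN with stop-yarn.sh / start-yarn.sh would also work):
sh scp.sh $HADOOP_HOME/etc/hadoop/yarn-site.xml $HADOOP_HOME/etc/hadoop
stop-all.sh
start-all.sh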