hadoop101 | hadoop102 | hadoop103 |
---|---|---|
192.168.171.101 | 192.168.171.102 | 192.168.171.103 |
namenode | secondary namenode | recource manager |
datanode | datanode | datanode |
nodemanager | nodemanager | nodemanager |
job history | ||
job log | job log | job log |
yum -y update
yum -y install gcc gcc-c++ autoconf automake cmake make \
zlib zlib-devel openssl openssl-devel pcre-devel \
rsync openssh-server vim man zip unzip net-tools tcpdump lrzsz tar wget
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
setenforce 0
systemctl stop firewalld
systemctl disable firewalld
hostnamectl set-hostname hadoop101
hostnamectl set-hostname hadoop102
hostnamectl set-hostname hadoop103
vim /etc/sysconfig/network-scripts/ifcfg-ens32
参考如下:
TYPE="Ethernet"
PROXY_METHOD="none"
BROWSER_ONLY="no"
BOOTPROTO="none"
DEFROUTE="yes"
IPV4_FAILURE_FATAL="no"
IPV6INIT="yes"
IPV6_AUTOCONF="yes"
IPV6_DEFROUTE="yes"
IPV6_FAILURE_FATAL="no"
IPV6_ADDR_GEN_MODE="stable-privacy"
NAME="ens32"
DEVICE="ens32"
ONBOOT="yes"
IPADDR="192.168.171.101"
PREFIX="24"
GATEWAY="192.168.171.2"
DNS1="192.168.171.2"
IPV6_PRIVACY="no"
vim /etc/hosts
修改内容如下:
192.168.171.101 hadoop101
192.168.171.102 hadoop102
192.168.171.103 hadoop103
重启系统 注意:如果是虚拟机环境请关机 克隆
reboot
在所有主机节点创建软件目录
mkdir -p /opt/soft
以下操作在 hadoop101 主机上完成
进入软件目录
cd /opt/soft
下载 JDK
wget https://download.oracle.com/otn/java/jdk/8u391-b13/b291ca3e0c8548b5a51d5a5f50063037/jdk-8u391-linux-x64.tar.gz?AuthParam=1698206552_11c0bb831efdf87adfd187b0e4ccf970
下载 hadoop
wget https://dlcdn.apache.org/hadoop/common/hadoop-3.3.5/hadoop-3.3.5.tar.gz
解压 JDK 修改名称
解压 hadoop 修改名称
tar -zxvf jdk-8u391-linux-x64.tar.gz -C /opt/soft/
mv jdk1.8.0_391/ jdk-8
tar -zxvf hadoop-3.3.5.tar.gz -C /opt/soft/
mv hadoop-3.3.5/ hadoop-3
配置环境变量
vim /etc/profile.d/my_env.sh
编写以下内容:
export JAVA_HOME=/opt/soft/jdk-8
export set JAVA_OPTS="--add-opens java.base/java.lang=ALL-UNNAMED"
export HDFS_NAMENODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_ZKFC_USER=root
export HDFS_JOURNALNODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
export HADOOP_HOME=/opt/soft/hadoop-3
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
生成新的环境变量
注意:分发软件和配置文件后 在所有主机执行该步骤
source /etc/profile
创建本地秘钥并将公共秘钥写入认证文件
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
ssh-copy-id root@hadoop101
ssh-copy-id root@hadoop102
ssh-copy-id root@hadoop103
ssh root@hadoop101
exit
ssh root@hadoop102
exit
ssh root@hadoop101
exit
cd $HADOOP_HOME/etc/hadoop
hadoop-env.sh
core-site.xml
hdfs-site.xml
workers
mapred-site.xml
yarn-site.xml
hadoop-env.sh 文件末尾追加
export JAVA_HOME=/opt/soft/jdk-8
export set JAVA_OPTS="--add-opens java.base/java.lang=ALL-UNNAMED"
export HDFS_NAMENODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_ZKFC_USER=root
export HDFS_JOURNALNODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
<configuration>
<property>
<name>fs.defaultFSname>
<value>hdfs://hadoop101:8020value>
property>
<property>
<name>hadoop.tmp.dirname>
<value>/home/hadoop_datavalue>
property>
<property>
<name>hadoop.http.staticuser.username>
<value>rootvalue>
property>
<property>
<name>dfs.permissions.enabledname>
<value>falsevalue>
property>
<property>
<name>hadoop.proxyuser.root.hostsname>
<value>*value>
property>
<property>
<name>hadoop.proxyuser.root.groupsname>
<value>*value>
property>
configuration>
<configuration>
<property>
<name>dfs.replicationname>
<value>3value>
property>
<property>
<name>dfs.namenode.secondary.http-addressname>
<value>hadoop102:50090value>
property>
configuration>
注意:
hadoop2.x中该文件名为slaves
hadoop3.x中该文件名为workers
hadoop101
hadoop102
hadoop103
<configuration>
<property>
<name>mapreduce.framework.namename>
<value>yarnvalue>
property>
<property>
<name>mapreduce.application.classpathname>
<value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*value>
property>
<property>
<name>mapreduce.jobhistory.addressname>
<value>hadoop102:10020value>
property>
<property>
<name>mapreduce.jobhistory.webapp.addressname>
<value>hadoop102:19888value>
property>
configuration>
<configuration>
<property>
<name>yarn.resourcemanager.hostnamename>
<value>hadoop103value>
property>
<property>
<name>yarn.nodemanager.aux-servicesname>
<value>mapreduce_shufflevalue>
property>
<property>
<name>yarn.nodemanager.env-whitelistname>
<value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_HOME,PATH,LANG,TZ,HADOOP_MAPRED_HOMEvalue>
property>
<property>
<name>yarn.nodemanager.pmem-check-enabledname>
<value>falsevalue>
property>
<property>
<name>yarn.nodemanager.vmem-check-enabledname>
<value>falsevalue>
property>
<property>
<name>yarn.log-aggregation-enablename>
<value>truevalue>
property>
<property>
<name>yarn.log.server.urlname>
<value>http://hadoop102:19888/jobhistory/logsvalue>
property>
<property>
<name>yarn.log-aggregation.retain-secondsname>
<value>604800value>
property>
configuration>
分发 ssh 免密钥
scp -r ~/.ssh root@hadoop102:~/
rsync -av --progress ~/.ssh root@hadoop103:~/
分发 hosts 文件
rsync -v --progress /etc/hosts root@hadoop102:/etc/
rsync -v --progress /etc/hosts root@hadoop103:/etc/
分发软件
rsync -av --progress /opt/soft/jdk-8 root@hadoop102:/opt/soft
rsync -av --progress /opt/soft/hadoop-3 root@hadoop102:/opt/soft
rsync -av --progress /opt/soft/jdk-8 root@hadoop103:/opt/soft
rsync -av --progress /opt/soft/hadoop-3 root@hadoop103:/opt/soft
分发环境变量
rsync -v --progress /etc/profile.d/my_env.sh root@hadoop102:/etc/profile.d/
rsync -v --progress /etc/profile.d/my_env.sh root@hadoop103:/etc/profile.d/
在所有主机节点 使新的环境变量生效
source /etc/profile
hadoop101
# 格式化文件系统
hdfs namenode -format
# 启动 NameNode SecondaryNameNode DataNode
start-dfs.sh
# 查看启动进程
jps
# hadoop101 看到 NameNode DataNode
# hadoop102 看到 SecondaryNameNode DataNode
# hadoop101 看到 DataNode
hadoop103
# 启动 ResourceManager daemon 和 NodeManager
start-yarn.sh
# 查看启动进程
jps
# hadoop101 看到 NameNode DataNode NodeManager
# hadoop102 看到 SecondaryNameNode DataNode NodeManager
# hadoop101 看到 DataNode ResourceManager NodeManager
hadoop102
# 启动 JobHistoryServer
mapred --daemon start historyserver
# 查看启动进程
jps
# hadoop101 看到 NameNode DataNode NodeManager
# hadoop102 看到 SecondaryNameNode DataNode NodeManager JobHistoryServer
# hadoop101 看到 DataNode ResourceManager NodeManager
# 关机之前 依关闭服务
# Hadoop102
mapred --daemon stop historyserver
# hadoop103
stop-yarn.sh
# hadoop101
stop-dfs.sh
# 开机后 依次开启服务
# hadoop101
start-dfs.sh
# hadoop103
start-yarn.sh
# hadoop102
mapred --daemon start historyserver
C:\Windows\System32\drivers\etc\hosts
追加以下内容:
192.168.171.101 hadoop101
192.168.171.102 hadoop102
192.168.171.103 hadoop103
Windows11 注意 修改权限
找到命令头提示符 以管理身份运行
进入 C:\Windows\System32\drivers\etc 目录
cd drivers/etc
去掉 hosts文件只读属性
attrib -r hosts
打开 hosts 配置文件
start hosts
追加以下内容后保存
192.168.171.101 hadoop101
192.168.171.102 hadoop102
192.168.171.103 hadoop103
浏览器访问: http://hadoop101:9870
浏览器访问:http://hadoop102:50090/
浏览器访问:http://hadoop103:8088
浏览器访问:http://hadoop102:19888/
本地文件系统创建 测试文件 wcdata.txt
vim wcdata.txt
Spark HBaseHive Flink
Storm Hadoop HBase SparkFlinkHBase
StormHBase Hadoop Hive
FlinkHBase Flink
Hive StormHive Flink HadoopHBase
HiveHadoop Spark HBase StormHBase
Hadoop Hive FlinkHBase Flink Hive StormHive
Flink HadoopHBase Hive
Spark HBaseHive Flink
Storm Hadoop HBase SparkFlinkHBase
StormHBase Hadoop Hive
FlinkHBase Flink
Hive StormHive Flink HadoopHBase
HiveHadoop Spark HBase StormHBase
Hadoop Hive FlinkHBase Flink Hive StormHive
Flink HadoopHBase Hive
Spark HBaseHive Flink
Storm Hadoop HBase SparkFlinkHBase
StormHBase Hadoop Hive
FlinkHBase Flink
Hive StormHive Flink HadoopHBase
HiveHadoop Spark HBase StormHBase
Hadoop Hive FlinkHBase Flink Hive StormHive
Flink HadoopHBase Hive
HiveHadoop Spark HBase StormHBase
Hadoop Hive FlinkHBase Flink Hive StormHive
Flink HadoopHBase Hive
Spark HBaseHive Flink
Storm Hadoop HBase SparkFlinkHBase
StormHBase Hadoop Hive
FlinkHBase Flink
Hive StormHive Flink HadoopHBase
HiveHadoop Spark HBase StormHBase
Hadoop Hive FlinkHBase Flink Hive StormHive
Flink HadoopHBase Hive
Spark HBaseHive Flink
Storm Hadoop HBase SparkFlinkHBase
StormHBase Hadoop Hive
HiveHadoop Spark HBase StormHBase
Hadoop Hive FlinkHBase Flink Hive StormHive
Flink HadoopHBase Hive
Spark HBaseHive Flink
Storm Hadoop HBase SparkFlinkHBase
StormHBase Hadoop Hive
FlinkHBase Flink
Hive StormHive Flink HadoopHBase
HiveHadoop Spark HBase StormHBase
Hadoop Hive FlinkHBase Flink Hive StormHive
Flink HadoopHBase Hive
Spark HBaseHive Flink
Storm Hadoop HBase SparkFlinkHBase
StormHBase Hadoop Hive
Spark HBaseHive Flink
Storm Hadoop HBase SparkFlinkHBase
StormHBase Hadoop Hive
FlinkHBase Flink
Hive StormHive Flink HadoopHBase
HiveHadoop Spark HBase StormHBase
Hadoop Hive FlinkHBase Flink Hive StormHive
Flink HadoopHBase Hive
Spark HBaseHive Flink
Storm Hadoop HBase SparkFlinkHBase
StormHBase Hadoop Hive
FlinkHBase Flink
Hive StormHive Flink HadoopHBase
HiveHadoop Spark HBase StormHBase
Hadoop Hive FlinkHBase Flink Hive StormHive
Flink HadoopHBase Hive
HiveHadoop Spark HBase StormHBase
Hadoop Hive FlinkHBase Flink Hive StormHive
Flink HadoopHBase Hive
Spark HBaseHive Flink
Storm Hadoop HBase SparkFlinkHBase
StormHBase Hadoop Hive
FlinkHBase Flink
Hive StormHive Flink HadoopHBase
HiveHadoop Spark HBase StormHBase
Hadoop Hive FlinkHBase Flink Hive StormHive
Flink HadoopHBase Hive
Spark HBaseHive Flink
Storm Hadoop HBase SparkFlinkHBase
StormHBase Hadoop Hive
Spark HBaseHive Flink
Storm Hadoop HBase SparkFlinkHBase
StormHBase Hadoop Hive
FlinkHBase Flink
Hive StormHive Flink HadoopHBase
HiveHadoop Spark HBase StormHBase
Hadoop Hive FlinkHBase Flink Hive StormHive
Flink HadoopHBase Hive
Spark HBaseHive Flink
Storm Hadoop HBase SparkFlinkHBase
StormHBase Hadoop Hive
FlinkHBase Flink
Hive StormHive Flink HadoopHBase
HiveHadoop Spark HBase StormHBase
Hadoop Hive FlinkHBase Flink Hive StormHive
Flink HadoopHBase Hive
HiveHadoop Spark HBase StormHBase
Hadoop Hive FlinkHBase Flink Hive StormHive
Flink HadoopHBase Hive
Spark HBaseHive Flink
Storm Hadoop HBase SparkFlinkHBase
StormHBase Hadoop Hive
FlinkHBase Flink
Hive StormHive Flink HadoopHBase
HiveHadoop Spark HBase StormHBase
Hadoop Hive FlinkHBase Flink Hive StormHive
Flink HadoopHBase Hive
Spark HBaseHive Flink
Storm Hadoop HBase SparkFlinkHBase
StormHBase Hadoop Hive
Spark HBaseHive Flink
Storm Hadoop HBase SparkFlinkHBase
StormHBase Hadoop Hive
FlinkHBase Flink
Hive StormHive Flink HadoopHBase
HiveHadoop Spark HBase StormHBase
Hadoop Hive FlinkHBase Flink Hive StormHive
Flink HadoopHBase Hive
Spark HBaseHive Flink
Storm Hadoop HBase SparkFlinkHBase
StormHBase Hadoop Hive
FlinkHBase Flink
Hive StormHive Flink HadoopHBase
HiveHadoop Spark HBase StormHBase
Hadoop Hive FlinkHBase Flink Hive StormHive
Flink HadoopHBase Hive
Spark HBaseHive Flink
Storm Hadoop HBase SparkFlinkHBase
StormHBase Hadoop Hive
FlinkHBase Flink
Hive StormHive Flink HadoopHBase
HiveHadoop Spark HBase StormHBase
Hadoop Hive FlinkHBase Flink Hive StormHive
Flink HadoopHBase Hive
HiveHadoop Spark HBase StormHBase
Hadoop Hive FlinkHBase Flink Hive StormHive
Flink HadoopHBase Hive
Spark HBaseHive Flink
Storm Hadoop HBase SparkFlinkHBase
StormHBase Hadoop Hive
FlinkHBase Flink
Hive StormHive Flink HadoopHBase
HiveHadoop Spark HBase StormHBase
Hadoop Hive FlinkHBase Flink Hive StormHive
Flink HadoopHBase Hive
Spark HBaseHive Flink
Storm Hadoop HBase SparkFlinkHBase
StormHBase Hadoop Hive
HiveHadoop Spark HBase StormHBase
Hadoop Hive FlinkHBase Flink Hive StormHive
Flink HadoopHBase Hive
Spark HBaseHive Flink
Storm Hadoop HBase SparkFlinkHBase
StormHBase Hadoop Hive
FlinkHBase Flink
Hive StormHive Flink HadoopHBase
HiveHadoop Spark HBase StormHBase
Hadoop Hive FlinkHBase Flink Hive StormHive
Flink HadoopHBase Hive
Spark HBaseHive Flink
Storm Hadoop HBase SparkFlinkHBase
StormHBase Hadoop Hive
Spark HBaseHive Flink
Storm Hadoop HBase SparkFlinkHBase
StormHBase Hadoop Hive
FlinkHBase Flink
Hive StormHive Flink HadoopHBase
HiveHadoop Spark HBase StormHBase
Hadoop Hive FlinkHBase Flink Hive StormHive
Flink HadoopHBase Hive
Spark HBaseHive Flink
Storm Hadoop HBase SparkFlinkHBase
StormHBase Hadoop Hive
FlinkHBase Flink
Hive StormHive Flink HadoopHBase
HiveHadoop Spark HBase StormHBase
Hadoop Hive FlinkHBase Flink Hive StormHive
Flink HadoopHBase Hive
HiveHadoop Spark HBase StormHBase
Hadoop Hive FlinkHBase Flink Hive StormHive
Flink HadoopHBase Hive
Spark HBaseHive Flink
Storm Hadoop HBase SparkFlinkHBase
StormHBase Hadoop Hive
FlinkHBase Flink
Hive StormHive Flink HadoopHBase
HiveHadoop Spark HBase StormHBase
Hadoop Hive FlinkHBase Flink Hive StormHive
Flink HadoopHBase Hive
Spark HBaseHive Flink
Storm Hadoop HBase SparkFlinkHBase
StormHBase Hadoop Hive
Spark HBaseHive Flink
Storm Hadoop HBase SparkFlinkHBase
StormHBase Hadoop Hive
FlinkHBase Flink
Hive StormHive Flink HadoopHBase
HiveHadoop Spark HBase StormHBase
Hadoop Hive FlinkHBase Flink Hive StormHive
Flink HadoopHBase Hive
Spark HBaseHive Flink
Storm Hadoop HBase SparkFlinkHBase
StormHBase Hadoop Hive
FlinkHBase Flink
Hive StormHive Flink HadoopHBase
HiveHadoop Spark HBase StormHBase
Hadoop Hive FlinkHBase Flink Hive StormHive
Flink HadoopHBase Hive
HiveHadoop Spark HBase StormHBase
Hadoop Hive FlinkHBase Flink Hive StormHive
Flink HadoopHBase Hive
Spark HBaseHive Flink
Storm Hadoop HBase SparkFlinkHBase
StormHBase Hadoop Hive
FlinkHBase Flink
Hive StormHive Flink HadoopHBase
HiveHadoop Spark HBase StormHBase
Hadoop Hive FlinkHBase Flink Hive StormHive
Flink HadoopHBase Hive
Spark HBaseHive Flink
Storm Hadoop HBase SparkFlinkHBase
StormHBase Hadoop Hive
在 HDFS 上创建目录 /wordcount/input
hdfs dfs -mkdir -p /wordcount/input
查看 HDFS 目录结构
hdfs dfs -ls /
hdfs dfs -ls /wordcount
hdfs dfs -ls /wordcount/input
上传本地测试文件 wcdata.txt 到 HDFS 上 /wordcount/input
hdfs dfs -put wcdata.txt /wordcount/input
检查文件是否上传成功
hdfs dfs -ls /wordcount/input
hdfs dfs -cat /wordcount/input/wcdata.txt
计算 PI 的值
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.5.jar pi 10 10
单词统计
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.5.jar wordcount /wordcount/input/wcdata.txt /wordcount/result
hdfs dfs -ls /wordcount/result
hdfs dfs -cat /wordcount/result/part-r-00000