The /etc/hosts file records the hostname-to-IP mapping for the hosts on a LAN.
Within a LAN, hosts normally reach each other by private IP address (e.g. 192.168.1.22, 192.168.1.23) when establishing connections.
Besides connecting by IP, we can also connect by hostname: when a machine is installed it is given a name, and that name is its hostname.
Suppose HostA's hostname is centos1 and HostB's hostname is centos2. How can we connect not only by IP, but also by the hostname that maps to that IP? The answer is the /etc/hosts file: write each LAN host's IP address and hostname as one-to-one entries into this file.
Configure the hostname-to-IP mapping; the last line below is the one added.
[root@CentOS ~]# vi /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
# IP is your current machine's address; this entry effectively gives the machine an alias
`IP CentOS`
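For example, with the two hosts from the explanation above, each machine's /etc/hosts could additionally contain entries like these (the addresses are illustrative):

```
192.168.1.22 centos1
192.168.1.23 centos2
```

After that, `ping centos2` resolves through this file directly, with no DNS involved.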
In a distributed system, many services identify nodes by hostname, so configure the IP-to-hostname mapping. You can also check the following file.
The /etc/sysconfig/network file defines the hostname and whether networking is enabled; these are system-wide settings not tied to any particular network device.
Entries take the form: KEY=value
The available settings include:
NETWORKING   whether networking is enabled
GATEWAY      default gateway
GATEWAYDEV   interface used for the default gateway
HOSTNAME     hostname
DOMAIN       domain name
[root@CentOS ~]# cat /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=CentOS
Stop and disable the firewall
Distributed services may call each other across nodes, so to guarantee unimpeded communication the firewall is usually disabled (in a trusted lab environment).
[root@CentOS ~]# systemctl stop firewalld.service
[root@CentOS ~]# systemctl status firewalld.service
● firewalld.service - firewalld - dynamic firewall daemon
Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
Active: inactive (dead) since Wed 2020-03-18 22:26:51 EDT; 2min 52s ago
Docs: man:firewalld(1)
Process: 11429 ExecStart=/usr/sbin/firewalld --nofork --nopid $FIREWALLD_ARGS (code=exited, status=0/SUCCESS)
Main PID: 11429 (code=exited, status=0/SUCCESS)
Mar 18 22:21:16 CentOSC systemd[1]: Starting firewalld - dynamic firewal....
Mar 18 22:21:41 CentOSC systemd[1]: Started firewalld - dynamic firewall....
Mar 18 22:26:50 CentOSC systemd[1]: Stopping firewalld - dynamic firewal....
Mar 18 22:26:51 CentOSC systemd[1]: Stopped firewalld - dynamic firewall....
Hint: Some lines were ellipsized, use -l to show in full.
[root@CentOS ~]# systemctl disable firewalld.service
Removed symlink /etc/systemd/system/multi-user.target.wants/firewalld.service.
Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
Configure passwordless SSH authentication between hosts (key-based)
[root@CentOS ~]# ssh-keygen -t rsa
[root@CentOS ~]# ssh-copy-id IP
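After `ssh-copy-id`, logging in to the target should no longer prompt for a password. A quick check (hypothetical session; `IP` stands for the target address as above):

```
[root@CentOS ~]# ssh IP hostname
CentOS
```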
[root@CentOS ~]# tar -zxf hadoop-2.6.0_x64.tar.gz -C /usr/
[root@CentOS ~]# vi /root/.bashrc # append the following lines
HADOOP_HOME=/usr/hadoop-2.6.0
PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export HADOOP_HOME
[root@CentOS ~]# source /root/.bashrc
[root@CentOS ~]# echo $HADOOP_HOME
/usr/hadoop-2.6.0
Configure Hadoop
Edit /usr/hadoop-2.6.0/etc/hadoop/core-site.xml and add the following:
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://IP:9000</value>
</property>
<property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/hadoop-2.6.0/hadoop-${user.name}</value>
</property>
<property>
    <name>fs.trash.interval</name>
    <value>1</value>
</property>
Edit /usr/hadoop-2.6.0/etc/hadoop/hdfs-site.xml and add the following:
<property>
    <name>dfs.replication</name>
    <value>1</value>
</property>
Change /usr/hadoop-2.6.0/etc/hadoop/slaves to:
the IP of the current machine
Start HDFS
# Format HDFS
[root@CentOS ~]# hdfs namenode -format
# A log line like the following indicates success
"""
19/01/02 20:19:37 INFO common.Storage: Storage directory /usr/hadoop-2.6.0/hadoop-root/dfs/name has been successfully formatted.
"""
[root@CentOS hadoop]# tree /usr/hadoop-2.6.0/hadoop-root/
/usr/hadoop-2.6.0/hadoop-root/
└── dfs
└── name
└── current
├── fsimage_0000000000000000000
├── fsimage_0000000000000000000.md5
├── seen_txid
└── VERSION
3 directories, 4 files
# Start HDFS
[root@CentOS hadoop]# start-dfs.sh
# Output like the following indicates success
"""
Starting namenodes on [CentOS]
CentOS: starting namenode, logging to /usr/hadoop-2.6.0/logs/hadoop-root-namenode-CentOS.out
CentOS: starting datanode, logging to /usr/hadoop-2.6.0/logs/hadoop-root-datanode-CentOS.out
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
ECDSA key fingerprint is SHA256:ptOfP+xxYMRrBJLeNsNwUZIJ94bTeGiRqTbLjCfwMyo.
ECDSA key fingerprint is MD5:28:6e:4e:68:dd:c1:95:38:bc:76:cf:30:ef:30:0f:d2.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /usr/hadoop-2.6.0/logs/hadoop-root-secondarynamenode-CentOS.out
"""
# Stop HDFS (when needed)
[root@CentOS ~]# stop-dfs.sh
Open http://IP:50070 in a browser
# Test: upload a file
[root@CentOS ~]# hdfs dfs -put /root/jdk-8u171-linux-x64.rpm /
[root@CentOS ~]# hdfs dfs -ls /
Found 1 items
-rw-r--r-- 1 root supergroup 175262413 2020-03-19 03:54 /jdk-8u171-linux-x64.rpm
Set up the YARN environment
Edit /usr/hadoop-2.6.0/etc/hadoop/yarn-site.xml and add the following:
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>IP</value>
</property>
Edit /usr/hadoop-2.6.0/etc/hadoop/mapred-site.xml and add the following:
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
Start YARN
[root@CentOS ~]# start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /usr/hadoop-2.6.0/logs/yarn-root-resourcemanager-CentOS.out
CentOS: starting nodemanager, logging to /usr/hadoop-2.6.0/logs/yarn-root-nodemanager-CentOS.out
[root@CentOS ~]# jps
32160 NodeManager
27906 NameNode
32072 ResourceManager
27995 DataNode
28188 SecondaryNameNode
32477 Jps
Visit http://IP:8088/
High-availability (HA) cluster setup. The roles are distributed across the three machines as follows:

CentOSA | CentOSB | CentOSC |
---|---|---|
zookeeper | zookeeper | zookeeper |
zkfc | zkfc | |
nn1 | nn2 | |
journalnode | journalnode | journalnode |
datanode | datanode | datanode |
rm1 | rm2 | |
nodemanager | nodemanager | nodemanager |
# Synchronize the clocks on all machines
[root@CentOSX ~]# date -s '2020-03-19 16:38:15'
Thu Mar 19 16:38:15 EDT 2020
[root@CentOSX ~]# clock -w
Note: throughout the following, a CentOSA prompt means run the command on A only, CentOSB on B, CentOSC on C, and CentOSX on all machines.
Install ZooKeeper
[root@CentOSX ~]# tar -zxf zookeeper-3.4.6.tar.gz -C /usr/
[root@CentOSX ~]# vi /usr/zookeeper-3.4.6/conf/zoo.cfg
tickTime=2000
dataDir=/root/zkdata
clientPort=2181
initLimit=5
syncLimit=2
server.1=CentOSA:2887:3887
server.2=CentOSB:2887:3887
server.3=CentOSC:2887:3887
[root@CentOSX ~]# mkdir /root/zkdata
[root@CentOSA ~]# echo 1 >> zkdata/myid
[root@CentOSB ~]# echo 2 >> zkdata/myid
[root@CentOSC ~]# echo 3 >> zkdata/myid
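The number written to myid must match that host's `server.N` entry in zoo.cfg, otherwise the node joins the ensemble under the wrong identity. A minimal sketch that derives the expected id from a sample config (the temp file is a stand-in for the real zoo.cfg):

```shell
# Sample server section (stand-in for /usr/zookeeper-3.4.6/conf/zoo.cfg).
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
server.1=CentOSA:2887:3887
server.2=CentOSB:2887:3887
server.3=CentOSC:2887:3887
EOF

# Given a hostname, print the id that host should write to its myid file.
id_for() {
  awk -F'[.=:]' -v h="$1" '$1 == "server" && $3 == h {print $2}' "$cfg"
}

id_for CentOSB   # → 2
```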
[root@CentOSX ~]# /usr/zookeeper-3.4.6/bin/zkServer.sh start zoo.cfg
[root@CentOSX ~]# /usr/zookeeper-3.4.6/bin/zkServer.sh status zoo.cfg
Configure Hadoop
[root@CentOSC ~]# tar -zxf hadoop-2.6.0_x64.tar.gz -C /usr/
[root@CentOSC ~]# vi /root/.bashrc
HADOOP_HOME=/usr/hadoop-2.6.0
PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export HADOOP_HOME
[root@CentOSC ~]# source /root/.bashrc
[root@CentOSC ~]# echo $HADOOP_HOME
/usr/hadoop-2.6.0
Edit /usr/hadoop-2.6.0/etc/hadoop/core-site.xml and add the following:
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster</value>
</property>
<property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/hadoop-2.6.0/hadoop-${user.name}</value>
</property>
<property>
    <name>fs.trash.interval</name>
    <value>1</value>
</property>
<property>
    <name>net.topology.script.file.name</name>
    <value>/usr/hadoop-2.6.0/etc/hadoop/rack.sh</value>
</property>
Create the script /usr/hadoop-2.6.0/etc/hadoop/rack.sh:
#!/bin/bash
# Resolve each node name/IP passed in by Hadoop to the rack recorded in
# topology.data; unknown nodes fall back to /default-rack.
while [ $# -gt 0 ] ; do
  nodeArg=$1
  exec</usr/hadoop-2.6.0/etc/hadoop/topology.data
  result=""
  while read line ; do
    ar=( $line )
    if [ "${ar[0]}" = "$nodeArg" ] ; then
      result="${ar[1]}"
    fi
  done
  shift
  if [ -z "$result" ] ; then
    echo -n "/default-rack"
  else
    echo -n "$result "
  fi
done
# Make the script executable
[root@CentOSA ~]# chmod u+x /usr/hadoop-2.6.0/etc/hadoop/rack.sh
[root@CentOSA ~]# ll /usr/hadoop-2.6.0/etc/hadoop/rack.sh
-rwxr--r--. 1 root root 358 Mar 19 16:59 /usr/hadoop-2.6.0/etc/hadoop/rack.sh
Create the rack mapping file /usr/hadoop-2.6.0/etc/hadoop/topology.data:
192.168.49.146 /rack1
192.168.49.147 /rack1
192.168.49.148 /rack2
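The lookup loop in rack.sh can be exercised locally. This sketch reproduces the same matching logic against a sample mapping file (a temp file stands in for topology.data):

```shell
# Sample mapping file (stand-in for /usr/hadoop-2.6.0/etc/hadoop/topology.data).
data=$(mktemp)
cat > "$data" <<'EOF'
192.168.49.146 /rack1
192.168.49.148 /rack2
EOF

# Same matching logic as rack.sh: print the rack for a node, or /default-rack.
lookup() {
  result=""
  while read -r line ; do
    ar=( $line )
    if [ "${ar[0]}" = "$1" ] ; then
      result="${ar[1]}"
    fi
  done < "$data"
  if [ -z "$result" ] ; then
    echo "/default-rack"
  else
    echo "$result"
  fi
}

lookup 192.168.49.146   # → /rack1
lookup 10.0.0.9         # → /default-rack (unknown node)
```

The real script uses `echo -n` with a trailing space because Hadoop may pass several node arguments at once and expects the racks space-separated on one line.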
Edit /usr/hadoop-2.6.0/etc/hadoop/hdfs-site.xml and add the following:
<property>
    <name>dfs.replication</name>
    <value>3</value>
</property>
<property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
</property>
<property>
    <name>ha.zookeeper.quorum</name>
    <value>CentOSA:2181,CentOSB:2181,CentOSC:2181</value>
</property>
<property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
</property>
<property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2</value>
</property>
<property>
    <name>dfs.namenode.rpc-address.mycluster.nn1</name>
    <value>CentOSA:9000</value>
</property>
<property>
    <name>dfs.namenode.rpc-address.mycluster.nn2</name>
    <value>CentOSB:9000</value>
</property>
<property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://CentOSA:8485;CentOSB:8485;CentOSC:8485/mycluster</value>
</property>
<property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
</property>
<property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
</property>
Change /usr/hadoop-2.6.0/etc/hadoop/slaves to:
CentOSA
CentOSB
CentOSC
Start HDFS
[root@CentOSX ~]# hadoop-daemon.sh start journalnode # wait about 10 seconds before the next step
[root@CentOSA ~]# hdfs namenode -format
[root@CentOSA ~]# hadoop-daemon.sh start namenode
[root@CentOSB ~]# hdfs namenode -bootstrapStandby # pull the active NameNode's metadata
[root@CentOSB ~]# hadoop-daemon.sh start namenode
[root@CentOSA|B ~]# hdfs zkfc -formatZK # register the NameNode info in ZooKeeper; run on either CentOSA or CentOSB
[root@CentOSA ~]# hadoop-daemon.sh start zkfc # failover controller
[root@CentOSB ~]# hadoop-daemon.sh start zkfc # failover controller
[root@CentOSX ~]# hadoop-daemon.sh start datanode
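Once everything is up, each NameNode's HA role can be queried with the standard `hdfs haadmin` command, using the IDs from dfs.ha.namenodes.mycluster; one should report active and the other standby (which is which depends on the ZooKeeper election). A hypothetical session:

```
[root@CentOSA ~]# hdfs haadmin -getServiceState nn1
active
[root@CentOSA ~]# hdfs haadmin -getServiceState nn2
standby
```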
Set up the YARN environment
Edit /usr/hadoop-2.6.0/etc/hadoop/yarn-site.xml and add the following:
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
</property>
<property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>CentOSA:2181,CentOSB:2181,CentOSC:2181</value>
</property>
<property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>rmcluster01</value>
</property>
<property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
</property>
<property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>CentOSB</value>
</property>
<property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>CentOSC</value>
</property>
Edit /usr/hadoop-2.6.0/etc/hadoop/mapred-site.xml and add the following:
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
Start YARN
[root@CentOSB ~]# yarn-daemon.sh start resourcemanager
[root@CentOSC ~]# yarn-daemon.sh start resourcemanager
[root@CentOSX ~]# yarn-daemon.sh start nodemanager
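Analogously, ResourceManager HA state can be checked with the standard `yarn rmadmin` command, using the rm-ids configured above (hypothetical session; which RM is active depends on the election):

```
[root@CentOSB ~]# yarn rmadmin -getServiceState rm1
active
[root@CentOSB ~]# yarn rmadmin -getServiceState rm2
standby
```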