Use case
An earlier post covered deploying a single-node, pseudo-distributed Hadoop cluster. That setup is fine for personal study and testing, but it is not enough for real production use. For production workloads we need a fully distributed Hadoop cluster, which offers far better performance and meets production demands. The walkthrough below sets up an Apache Hadoop 2.6.0 environment.
Network configuration (covered in an earlier post)
# systemctl stop firewalld.service
# systemctl disable firewalld.service
# vim /etc/selinux/config #set SELINUX=disabled
# vim /etc/hostname #name the 3 nodes hadoop0, hadoop1, hadoop2 respectively
# vim /etc/hosts #add the IPs and matching hostnames of all 3 nodes
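For reference, a hypothetical /etc/hosts fragment (the addresses below are placeholders; substitute the real IPs of your three nodes):

```
192.168.1.10 hadoop0
192.168.1.11 hadoop1
192.168.1.12 hadoop2
```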
SSH passwordless trust between nodes (covered in an earlier post)
NTP time synchronization (covered in an earlier post)
JDK installation guide (covered in an earlier post)
MySQL installation guide (covered in an earlier post)
Hadoop 2.6.0 download link
8.1 Install the Hadoop cluster
Steps:
1. Upload the downloaded hadoop-2.6.0 tarball to the /opt directory on the master node
2. Extract it
3. Configure the environment variables
4. Create the required directories
# cd /opt
# tar -xzvf hadoop-2.6.0-x64.tar.gz
# mv hadoop-2.6.0 hadoop2.6.0 #extract the Hadoop tarball and rename the directory to hadoop2.6.0
# vim /etc/profile #edit the profile and append the Hadoop environment variables
export JAVA_HOME=/opt/jdk1.8
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
export HADOOP_HOME=/opt/hadoop2.6.0
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
# mkdir /opt/hadoop2.6.0/tmp #create the directories needed later in the setup
# mkdir /opt/hadoop2.6.0/var
# mkdir /opt/hadoop2.6.0/dfs
# mkdir /opt/hadoop2.6.0/dfs/name
# mkdir /opt/hadoop2.6.0/dfs/data
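The five mkdir calls above can be collapsed into one command with `mkdir -p`, which creates parent directories as needed. A minimal sketch (BASE is a temporary stand-in here; on the master node it would be /opt/hadoop2.6.0):

```shell
# Create a scratch base directory to stand in for /opt/hadoop2.6.0.
BASE=$(mktemp -d)
# mkdir -p creates all four leaf directories (and /dfs itself) in one go.
mkdir -p "$BASE"/tmp "$BASE"/var "$BASE"/dfs/name "$BASE"/dfs/data
```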
8.2 Edit the hadoop-env.sh file
# cd /opt/hadoop2.6.0/etc/hadoop/
# vim hadoop-env.sh
Change: export JAVA_HOME=${JAVA_HOME}
To: export JAVA_HOME=/opt/jdk1.8 #point this at the JDK directory
8.3 Edit the slaves file
# cd /opt/hadoop2.6.0/etc/hadoop/
# vim slaves
hadoop0
hadoop1
hadoop2
#With this layout, hadoop0 acts as the master (management) node, while hadoop0, hadoop1 and hadoop2 all serve as data nodes!
8.4 Edit the core-site.xml file
# cd /opt/hadoop2.6.0/etc/hadoop/
# vim core-site.xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/hadoop2.6.0/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://hadoop0:9000</value>
</property>
</configuration>
8.5 Edit the hdfs-site.xml file
# cd /opt/hadoop2.6.0/etc/hadoop/
# vim hdfs-site.xml
<configuration>
<property>
<name>dfs.name.dir</name>
<value>/opt/hadoop2.6.0/dfs/name</value>
<description>Path on the local filesystem where the NameNode stores the namespace and transactions logs persistently.</description>
</property>
<property>
<name>dfs.data.dir</name>
<value>/opt/hadoop2.6.0/dfs/data</value>
<description>Comma separated list of paths on the local filesystem of a DataNode where it should store its blocks.</description>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
<description>need not permissions</description>
</property>
</configuration>
8.6 Edit the mapred-site.xml file
# cd /opt/hadoop2.6.0/etc/hadoop/
# cp mapred-site.xml.template mapred-site.xml
# vim mapred-site.xml
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>hadoop0:49001</value>
</property>
<property>
<name>mapred.local.dir</name>
<value>/opt/hadoop2.6.0/var</value>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
8.7 Edit the yarn-site.xml file
# cd /opt/hadoop2.6.0/etc/hadoop/
# vim yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop0</value>
</property>
<property>
<description>The address of the applications manager interface in the RM.</description>
<name>yarn.resourcemanager.address</name>
<value>${yarn.resourcemanager.hostname}:8032</value>
</property>
<property>
<description>The address of the scheduler interface.</description>
<name>yarn.resourcemanager.scheduler.address</name>
<value>${yarn.resourcemanager.hostname}:8030</value>
</property>
<property>
<description>The http address of the RM web application.</description>
<name>yarn.resourcemanager.webapp.address</name>
<value>${yarn.resourcemanager.hostname}:8088</value>
</property>
<property>
<description>The https address of the RM web application.</description>
<name>yarn.resourcemanager.webapp.https.address</name>
<value>${yarn.resourcemanager.hostname}:8090</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>${yarn.resourcemanager.hostname}:8031</value>
</property>
<property>
<description>The address of the RM admin interface.</description>
<name>yarn.resourcemanager.admin.address</name>
<value>${yarn.resourcemanager.hostname}:8033</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>12288</value>
<description>Maximum memory that can be allocated to a single container request, in MB (default 8192 MB).</description>
</property>
</configuration>
**Note: after configuring the Hadoop package on the master node, copy it to the other two nodes. The configuration needs no changes; all three nodes use identical configs!
When copying, check the directory permissions on the target: chmod 777 -R /opt/hadoop2.6.0 [without the right permissions, the data nodes will fail to start]**
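One way to script the copy step is the loop below, shown as a dry run that only prints the scp commands it would execute; remove the echo to actually copy. This assumes passwordless root SSH between the nodes, as set up in the prerequisites:

```shell
# Print (rather than run) the scp commands that would sync the configured
# Hadoop tree from the master to each worker node.
for host in hadoop1 hadoop2; do
  echo scp -r /opt/hadoop2.6.0 root@"$host":/opt/
done
```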
9. Initialize and start the cluster on the management node
# cd /opt/hadoop2.6.0/bin
# ./hadoop namenode -format #initialize the Hadoop cluster (format the NameNode)
After a successful format, a new current directory appears under /opt/hadoop2.6.0/dfs/name/, containing 4 files.
# cd /opt/hadoop2.6.0/sbin
# ./start-all.sh #start the Hadoop cluster
10.1 Check the running processes on each of the 3 nodes with the jps command
10.2 Test the web UIs (NameNode at http://hadoop0:50070, ResourceManager at http://hadoop0:8088)
Copyright notice: for personal study; reposted from https://blog.csdn.net/bingoxubin/article/details/78503370; please notify the author in case of infringement.