Hadoop Study Notes, Day 2: Setting Up a Hadoop Cluster

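I. Download the Hadoop package. I use the tar.gz package here; you can download it from the official website.
II. Edit the Hadoop configuration files
Seven configuration files are involved, all located in $HADOOP_HOME/etc/hadoop/: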
- hadoop-env.sh
- yarn-env.sh
- slaves
- core-site.xml
- hdfs-site.xml
- mapred-site.xml.template
- yarn-site.xml
1. hadoop-env.sh: set JAVA_HOME to the absolute path of the JDK: export JAVA_HOME=/usr/local/jdk1.8.0_191
2. yarn-env.sh: set JAVA_HOME to the same value: export JAVA_HOME=/usr/local/jdk1.8.0_191
3. slaves: be sure to delete localhost and add the hostnames of the worker virtual machines, one per line
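4. core-site.xml: add the following snippet:

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:8020</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/var/log/hadoop/tmp</value>
    </property>
</configuration>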

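5. hdfs-site.xml: add the following snippet:

<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:///data/hadoop/hdfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:///data/hadoop/hdfs/data</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>master:50090</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
</configuration>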

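6. mapred-site.xml: Hadoop 2.6.0 ships only mapred-site.xml.template, so copy it to mapred-site.xml first, then add the following snippet:

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>master:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>master:19888</value>
    </property>
</configuration>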

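7. yarn-site.xml: add the following snippet:

<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>master</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>${yarn.resourcemanager.hostname}:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>${yarn.resourcemanager.hostname}:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>${yarn.resourcemanager.hostname}:8088</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.https.address</name>
        <value>${yarn.resourcemanager.hostname}:8090</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>${yarn.resourcemanager.hostname}:8031</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>${yarn.resourcemanager.hostname}:8033</value>
    </property>
    <property>
        <name>yarn.nodemanager.local-dirs</name>
        <value>/data/hadoop/yarn/local</value>
    </property>
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.nodemanager.remote-app-log-dir</name>
        <value>/data/tmp/logs</value>
    </property>
    <property>
        <name>yarn.log.server.url</name>
        <value>http://master:19888/jobhistory/logs/</value>
    </property>
    <property>
        <name>yarn.nodemanager.vmem-check-enabled</name>
        <value>false</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
</configuration>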

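Steps 1 to 3 and the template copy in step 6 are small enough to script. The following is only a rough sketch, not part of the original setup: the function name `configure_basics` and the worker hostnames in the usage line are assumptions. It appends the JAVA_HOME export to the end of both env scripts (the scripts are sourced top to bottom, so the last assignment wins), rewrites the slaves file, and creates mapred-site.xml from the template if it does not exist yet.

```shell
# Hypothetical helper: apply steps 1-3 and the template copy from step 6.
# Usage: configure_basics <conf_dir> <jdk_path> <worker> [<worker> ...]
configure_basics() {
    conf_dir="$1"
    jdk="$2"
    shift 2

    # Steps 1 and 2: the env scripts are sourced, so an export appended
    # at the end overrides any earlier JAVA_HOME setting in the file.
    for f in hadoop-env.sh yarn-env.sh; do
        echo "export JAVA_HOME=$jdk" >> "$conf_dir/$f"
    done

    # Step 3: overwrite slaves (this drops localhost), one hostname per line.
    printf '%s\n' "$@" > "$conf_dir/slaves"

    # Step 6 prerequisite: create mapred-site.xml from the shipped template.
    [ -f "$conf_dir/mapred-site.xml" ] \
        || cp "$conf_dir/mapred-site.xml.template" "$conf_dir/mapred-site.xml"
}

# Example (not run here; the hostnames are placeholders):
# configure_basics "$HADOOP_HOME/etc/hadoop" /usr/local/jdk1.8.0_191 slave1 slave2 slave3
```

The XML files still have to be edited by hand as described above; the sketch only covers the parts that are plain text.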
III. Copy the Hadoop directory to the worker virtual machines

scp -r /usr/local/hadoop-2.6.0/ slave1:/usr/local
scp -r /usr/local/hadoop-2.6.0/ slave2:/usr/local
scp -r /usr/local/hadoop-2.6.0/ slave3:/usr/local
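With more than a few workers, typing one scp per host gets tedious. As a minimal sketch (the function name `copy_to_workers` and the `DRY_RUN` switch are my own assumptions, not Hadoop features), the worker list can be read from the slaves file configured earlier:

```shell
# Hypothetical helper: copy a directory to every host listed in the slaves file.
# Set DRY_RUN=1 to print the scp commands instead of running them.
copy_to_workers() {
    src="$1"          # directory to copy, e.g. /usr/local/hadoop-2.6.0/
    dest="$2"         # parent directory on each worker, e.g. /usr/local
    slaves_file="$3"  # file with one worker hostname per line

    # Word splitting skips blank lines; hostnames contain no spaces.
    for host in $(cat "$slaves_file"); do
        ${DRY_RUN:+echo} scp -r "$src" "$host:$dest"
    done
}

# Example (not run here):
# copy_to_workers /usr/local/hadoop-2.6.0/ /usr/local /usr/local/hadoop-2.6.0/etc/hadoop/slaves
```

The same function should work for any other directory that has to exist on every worker, such as the JDK.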

IV. Format the NameNode

/usr/local/hadoop-2.6.0/bin/hdfs namenode -format

If the output contains "Storage directory … has been successfully formatted", the format succeeded.
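The format command prints a lot of log lines, so the success message is easy to miss. One way to check for it, sketched here with a made-up function name `format_succeeded` and a made-up log path, is to save the output and search it:

```shell
# Hypothetical helper: succeed if the saved format output contains
# the success message quoted above.
format_succeeded() {
    grep -q 'has been successfully formatted' "$1"
}

# Example (not run here; /tmp/namenode-format.log is an arbitrary path):
# /usr/local/hadoop-2.6.0/bin/hdfs namenode -format 2>&1 | tee /tmp/namenode-format.log
# format_succeeded /tmp/namenode-format.log && echo "NameNode formatted"
```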
V. Copy the JDK to every worker virtual machine

scp -r /usr/local/jdk1.8.0_191 slave1:/usr/local
scp -r /usr/local/jdk1.8.0_191 slave2:/usr/local
scp -r /usr/local/jdk1.8.0_191 slave3:/usr/local

VI. Start the Hadoop cluster

/usr/local/hadoop-2.6.0/sbin/start-dfs.sh    # start the HDFS services
/usr/local/hadoop-2.6.0/sbin/start-yarn.sh   # start the YARN services
/usr/local/hadoop-2.6.0/sbin/mr-jobhistory-daemon.sh start historyserver   # start the JobHistory server
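After the start scripts return, the JDK's `jps` tool lists the running Java processes, which is a quick way to see whether each daemon actually came up. The sketch below is an assumption-laden helper (the name `missing_daemons` is mine): it reads `jps` output on standard input and prints every expected daemon that is absent. With the configuration above I would expect NameNode, SecondaryNameNode, ResourceManager, and JobHistoryServer on master, and DataNode and NodeManager on each worker.

```shell
# Hypothetical helper: read `jps` output on stdin and print every expected
# daemon name (given as arguments) that does not appear in it.
missing_daemons() {
    jps_output=$(cat)
    for name in "$@"; do
        # jps prints "<pid> <MainClass>"; compare the second field exactly,
        # so NameNode does not match SecondaryNameNode.
        printf '%s\n' "$jps_output" \
            | awk -v n="$name" '$2 == n { found = 1 } END { exit !found }' \
            || echo "$name"
    done
}

# Example on master (not run here):
# jps | missing_daemons NameNode SecondaryNameNode ResourceManager JobHistoryServer
# Example on a worker (not run here):
# jps | missing_daemons DataNode NodeManager
```

Empty output means every expected daemon was found.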

VII. Stop the Hadoop cluster

/usr/local/hadoop-2.6.0/sbin/stop-dfs.sh    # stop the HDFS services
/usr/local/hadoop-2.6.0/sbin/stop-yarn.sh   # stop the YARN services
/usr/local/hadoop-2.6.0/sbin/mr-jobhistory-daemon.sh stop historyserver   # stop the JobHistory server

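VIII. Monitor the Hadoop services
Open a browser on the Windows machine and go to:
http://<master VM IP address>:50070    (NameNode web UI)
http://<master VM IP address>:8088     (ResourceManager web UI)
http://<master VM IP address>:19888    (MapReduce JobHistory Server web UI)
If these pages open and show service information, your Hadoop configuration is complete. Congratulations! Next, go set up your IDE.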
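If no browser is at hand, the same check can be done from a terminal. This is just a sketch under the assumption that `curl` is installed; the function names `ui_urls` and `check_uis` are hypothetical. It requests each of the three pages and prints the HTTP status code it gets back.

```shell
# Hypothetical helper: print the three web UI URLs for a given host.
ui_urls() {
    host="$1"
    for port in 50070 8088 19888; do
        echo "http://$host:$port/"
    done
}

# Hypothetical helper: report the HTTP status code returned by each UI.
check_uis() {
    for url in $(ui_urls "$1"); do
        # -L follows redirects, -o /dev/null discards the page body,
        # -w prints only the final status code.
        code=$(curl -s -L -o /dev/null --max-time 5 -w '%{http_code}' "$url") \
            || code="unreachable"
        echo "$url -> $code"
    done
}

# Example (not run here): check_uis master
```

A status of 200 for all three URLs suggests the services are up; anything else is a hint to look at the corresponding daemon's log.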
