Linux Refresher Notes: Hadoop 2.7.4 Environment Setup


Since I'll be using hadoop and related tools for upcoming work, I'll be spending a lot more time on Linux.

Previously I only set up J2EE runtime environments on Linux: configure a JDK, deploy Tomcat, and push releases automatically through Docker or Jenkins.

Beyond that it was checking processes, copying, pasting, deleting and other basics; a lot of this fades when you don't use it for a while, so I'm writing these notes to refresh my old Linux knowledge.

Follow-up posts will cover setting up and using hadoop and other mainstream big-data environments.

 

---------------------------------------------------------------------------------------------------------------------------------------------------------

This post covers the Hadoop 2.7.4 environment setup.

Three nodes are used this time; all operations are performed as root:

192.168.0.80 master
192.168.0.81 slave1
192.168.0.82 slave2

1. Configure the network and the JDK on all three VMs as described in the earlier post Linux巩固记录(1) J2EE开发环境搭建及网络配置 (J2EE development environment setup and network configuration) and make sure they can reach each other (turn off the firewall on all of them).

2. Change the hostname of the .80 VM to master, .81 to slave1, and .82 to slave2

 vi /etc/sysconfig/network

 Using .80 as an example: remove localhost and add HOSTNAME=master
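On CentOS 6 the file then looks roughly like this (a sketch, assuming the usual NETWORKING=yes line is already present; slave1 and slave2 get their own names):

# /etc/sysconfig/network on 192.168.0.80
NETWORKING=yes
HOSTNAME=master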

3. Modify /etc/hosts on all three VMs; the content is identical on each:

 vi /etc/hosts

192.168.0.80 master
192.168.0.81 slave1
192.168.0.82 slave2
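After saving, a quick sanity check that the names resolve on every node (a minimal sketch; -c 1 sends a single packet):

ping -c 1 master
ping -c 1 slave1
ping -c 1 slave2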

4. Modify the sshd configuration

vi /etc/ssh/sshd_config

# uncomment the following two lines
RSAAuthentication yes
PubkeyAuthentication yes
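The reboot in step 5 applies this change anyway; to apply it immediately instead, restart sshd (CentOS 6 style service command):

service sshd restart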

5. Reboot all three VMs:   shutdown -r now

 

--------------------------------------------------------------

6. SSH key configuration

cd ~/.ssh #(.ssh is a directory; if it does not exist yet, run $ ssh xxxxxx once to create it)

#master
ssh master
ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

scp -r root@slave1:~/.ssh/id_rsa.pub slave1.pub
scp -r root@slave2:~/.ssh/id_rsa.pub slave2.pub
cat ~/.ssh/slave2.pub >> ~/.ssh/authorized_keys 
cat ~/.ssh/slave1.pub >> ~/.ssh/authorized_keys

#slave1
ssh slave1
ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
scp -r root@master:~/.ssh/id_rsa.pub master.pub
scp -r root@slave2:~/.ssh/id_rsa.pub slave2.pub
cat ~/.ssh/slave2.pub >> ~/.ssh/authorized_keys 
cat ~/.ssh/master.pub >> ~/.ssh/authorized_keys

#slave2
ssh slave2
ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
scp -r root@master:~/.ssh/id_rsa.pub master.pub
scp -r root@slave1:~/.ssh/id_rsa.pub slave1.pub
cat ~/.ssh/slave1.pub >> ~/.ssh/authorized_keys 
cat ~/.ssh/master.pub >> ~/.ssh/authorized_keys

Once this is done you can log in without a password, e.g. from master to slave1:  ssh slave1

[root@master /]# ssh slave1
Last login: Wed Aug 30 21:34:51 2017 from slave2
[root@slave1 ~]# 
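To confirm that every node reaches every other node without a password prompt, a small loop can be run on each of the three machines (a sketch):

for h in master slave1 slave2; do ssh $h hostname; done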

 

The hadoop configuration only needs to be done on master; once finished, copy it to the slaves.

7. Download the hadoop 2.7.4 tarball to /home on master, extract it there (tar -xzvf xxxxxx -C /home), and rename the extracted directory to hadoop-2.7.4.

 Then set the hadoop environment variables:

 vi /etc/profile

export HADOOP_HOME=/home/hadoop-2.7.4
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
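Reload the profile and check that the hadoop command is picked up (hadoop version prints the build information if the paths are correct):

source /etc/profile
hadoop version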

 

8. Set JAVA_HOME in hadoop's environment scripts

vi /home/hadoop-2.7.4/etc/hadoop/hadoop-env.sh   # set JAVA_HOME
vi /home/hadoop-2.7.4/etc/hadoop/mapred-env.sh   # set JAVA_HOME
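In both files, set (or uncomment) the JAVA_HOME export so it points at the real JDK path; the path below is only an illustration and must match the JDK installed in step 1:

# hypothetical JDK location -- replace with the path of your own JDK install
export JAVA_HOME=/usr/java/jdk1.8.0_144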

 

9. Modify /home/hadoop-2.7.4/etc/hadoop/core-site.xml

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:9000</value>
        <description>NameNode host and port (changing the port is not recommended)</description>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
        <description>I/O buffer size</description>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/home/hadoop-2.7.4/tmp</value>
        <description>Directory for temporary files</description>
    </property>
    <property>
        <name>hadoop.security.authorization</name>
        <value>false</value>
    </property>
</configuration>

10. Modify /home/hadoop-2.7.4/etc/hadoop/hdfs-site.xml

<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/home/hadoop-2.7.4/hdfs/name</value>
        <description>Local filesystem path where the NameNode persistently stores the namespace and transaction logs</description>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/home/hadoop-2.7.4/hdfs/data</value>
        <description>Comma-separated list of local directories where the DataNode stores its blocks</description>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
        <description>Number of replicas kept for each HDFS file (the default is 3)</description>
    </property>
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.permissions</name>
        <value>false</value>
    </property>
</configuration>

11. Modify /home/hadoop-2.7.4/etc/hadoop/mapred-site.xml

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
        <final>true</final>
    </property>
    <property>
        <name>mapreduce.jobtracker.http.address</name>
        <value>master:50030</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>master:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>master:19888</value>
    </property>
    <property>
        <name>mapred.job.tracker</name>
        <value>http://master:9001</value>
    </property>
</configuration>

 

12. Modify /home/hadoop-2.7.4/etc/hadoop/yarn-site.xml

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>master:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master:8031</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>master:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master:8088</value>
    </property>
</configuration>

 

13. Create the corresponding directories, e.g. mkdir -p logs (you can also create them all before copying in step 14; extra directories do no harm). The commands are sketched after this list.

Create the data directory /home/hadoop-2.7.4/hdfs on every node to hold cluster data.
On the master node create /home/hadoop-2.7.4/hdfs/name to hold the filesystem metadata.
On each slave node create /home/hadoop-2.7.4/hdfs/data to hold the actual block data.
The log directory on all nodes is /home/hadoop-2.7.4/logs.
The temporary directory on all nodes is /home/hadoop-2.7.4/tmp.
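A sketch of the corresponding commands (mkdir -p silently skips directories that already exist, so creating all of them on every node is also fine):

# on every node
mkdir -p /home/hadoop-2.7.4/logs /home/hadoop-2.7.4/tmp /home/hadoop-2.7.4/hdfs
# on master
mkdir -p /home/hadoop-2.7.4/hdfs/name
# on slave1 and slave2
mkdir -p /home/hadoop-2.7.4/hdfs/data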

 

14. Copy the finished configuration (the whole directory) to the slave nodes

scp -r /home/hadoop-2.7.4 root@slave1:/home/hadoop-2.7.4
scp -r /home/hadoop-2.7.4 root@slave2:/home/hadoop-2.7.4
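A quick check that the copy landed where expected (a sketch):

ssh slave1 ls /home/hadoop-2.7.4/etc/hadoop
ssh slave2 ls /home/hadoop-2.7.4/etc/hadoop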

 

15. On the master node, edit hadoop's slaves file to register the slave nodes

vi /home/hadoop-2.7.4/etc/hadoop/slaves

Add:

slave1

slave2

 

16. Format the NameNode and start the cluster (run this on master only; nothing needs to be started by hand on the slaves, and the DataNodes initialize their storage directories on first start, so only the NameNode is formatted):

/home/hadoop-2.7.4/bin/hdfs namenode -format
/home/hadoop-2.7.4/sbin/start-all.sh

 

17. Use the jps command to check that everything started

[root@master ~]# ssh master 
Last login: Sat Sep  2 00:47:50 2017 from slave1
[root@master ~]# jps
9187 Jps
3221 ResourceManager
3062 SecondaryNameNode
2856 NameNode
[root@master ~]# ssh slave1
Last login: Sat Sep  2 00:25:55 2017 from master
[root@slave1 ~]# jps
6044 Jps
2685 NodeManager
2590 DataNode
[root@slave1 ~]# ssh slave2
Last login: Wed Aug 30 21:34:38 2017 from master
[root@slave2 ~]# jps
2679 NodeManager
5994 Jps
2590 DataNode
[root@slave2 ~]# 

If anything fails to start, read the logs under /home/hadoop-2.7.4/logs carefully and fix the configuration.
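As a final smoke test, the web UIs and the bundled example job can be used (the NameNode UI listens on port 50070 by default and the ResourceManager UI on the port 8088 configured above); a sketch, run on master:

# HDFS overview: http://master:50070    YARN overview: http://master:8088
hdfs dfsadmin -report
# run the bundled pi example (2 maps, 10 samples each)
hadoop jar /home/hadoop-2.7.4/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.4.jar pi 2 10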
