Setting Up a Hadoop Development Environment on Linux

Hadoop installation and configuration:
[1]. Download the Hadoop 2.7.5 package from the official site: hadoop-2.7.5/hadoop-2.7.5.tar.gz
[2]. Upload the package to /usr/local/hadoop with Xftp 5.
[3]. Log in to the Linux server with Xshell 5 and change into the directory: cd /usr/local/hadoop
[root@marklin hadoop]# cd /usr/local/hadoop
[root@marklin hadoop]#
Then unpack it with tar: tar -xvf hadoop-2.7.5.tar.gz
[root@marklin hadoop]# tar -xvf hadoop-2.7.5.tar.gz
[4]. Configure the Hadoop environment variables. Enter: vim /etc/profile
    #Setting HADOOP_HOME PATH
    export HADOOP_HOME=/usr/local/hadoop/hadoop-2.7.5
    export PATH=${PATH}:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:${HADOOP_HOME}/lib
    export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_HOME}/lib/native
    export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
    export HADOOP_MAPRED_HOME=${HADOOP_HOME}
    export HADOOP_COMMON_HOME=${HADOOP_HOME}
    export HADOOP_HDFS_HOME=${HADOOP_HOME}
    export YARN_HOME=${HADOOP_HOME}
    export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
    export HDFS_CONF_DIR=${HADOOP_HOME}/etc/hadoop
    export YARN_CONF_DIR=${HADOOP_HOME}/etc/hadoop
Save the file and apply it: source /etc/profile
[root@marklin ~]# source /etc/profile
[root@marklin ~]#
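Once the profile has been sourced, the new PATH entries can be sanity-checked. A minimal sketch that re-creates the two exports from the profile above and lists the Hadoop directories now on the PATH:

```shell
# Re-create the profile entries and confirm PATH picks up the Hadoop dirs.
export HADOOP_HOME=/usr/local/hadoop/hadoop-2.7.5
export PATH="${PATH}:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin"
echo "$PATH" | tr ':' '\n' | grep hadoop
```

On the real server, `hadoop version` is the definitive check that the binaries resolve.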
PS: The two most important points:
【1】Set the hostname: vim /etc/hostname
【2】Map the hostname to the IP address: vim /etc/hosts
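For example, assuming the server uses the IP address 192.168.3.4 (the address in the test URLs at the end of this guide), the two files would contain entries along these lines:

```
# /etc/hostname
marklin.com

# /etc/hosts
127.0.0.1    localhost
192.168.3.4  marklin.com
```

Without this mapping, the `marklin.com` addresses used throughout the configuration files below will not resolve.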
[5]. Edit the Hadoop configuration files:
core-site.xml: core Hadoop settings, including the tmp directory and the default filesystem address (port 9000 by default)
mapred-site.xml: MapReduce framework settings
yarn-site.xml: YARN resource-management settings
hdfs-site.xml: HDFS replication factor and data-directory settings
 
(1). Configure core-site.xml. In the Hadoop configuration directory [/usr/local/hadoop/hadoop-2.7.5/etc/hadoop], enter: vim core-site.xml
[root@marklin ~]# cd /usr/local/hadoop/hadoop-2.7.5/etc/hadoop
[root@marklin hadoop]#
Enter: vim core-site.xml
and configure:
    <configuration>
        <!-- Default filesystem URI -->
        <property>
            <name>fs.default.name</name>
            <value>hdfs://marklin.com:9000</value>
        </property>
        <!-- Base directory for Hadoop's temporary files -->
        <property>
            <name>hadoop.tmp.dir</name>
            <value>/usr/local/hadoop/repository/hdfs/tmp</value>
        </property>
        <!-- Read/write buffer size in bytes (128 KB) -->
        <property>
            <name>io.file.buffer.size</name>
            <value>131072</value>
        </property>
        <property>
            <name>hadoop.proxyuser.hadoop.hosts</name>
            <value>*</value>
        </property>
        <property>
            <name>hadoop.proxyuser.hadoop.groups</name>
            <value>*</value>
        </property>
    </configuration>
 
At the same time, create the tmp directory under /usr/local/hadoop/repository/hdfs: mkdir tmp
(2) Edit hdfs-site.xml: vim hdfs-site.xml
[root@marklin hadoop]# vim hdfs-site.xml
[root@marklin hadoop]#
and configure:
    <configuration>
        <!-- NameNode metadata directory -->
        <property>
            <name>dfs.namenode.name.dir</name>
            <value>/usr/local/hadoop/repository/hdfs/name</value>
            <final>true</final>
        </property>
        <!-- DataNode block-storage directory -->
        <property>
            <name>dfs.datanode.data.dir</name>
            <value>/usr/local/hadoop/repository/hdfs/data</value>
            <final>true</final>
        </property>
        <!-- Disable HDFS permission checking (development setup only) -->
        <property>
            <name>dfs.permissions</name>
            <value>false</value>
        </property>
        <!-- Number of block replicas; 1 is enough for a single node -->
        <property>
            <name>dfs.replication</name>
            <value>1</value>
        </property>
        <!-- NameNode web UI address -->
        <property>
            <name>dfs.namenode.http-address</name>
            <value>marklin.com:50070</value>
        </property>
        <!-- SecondaryNameNode web UI address -->
        <property>
            <name>dfs.namenode.secondary.http-address</name>
            <value>marklin.com:50090</value>
        </property>
        <property>
            <name>dfs.webhdfs.enabled</name>
            <value>true</value>
        </property>
    </configuration>
At the same time, create the name and data directories under /usr/local/hadoop/repository/hdfs: mkdir name and mkdir data
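The three working directories used by core-site.xml and hdfs-site.xml can also be created in one pass with mkdir -p. A sketch — BASE is a stand-in for /usr/local/hadoop/repository/hdfs so it can be tried anywhere:

```shell
# BASE stands in for /usr/local/hadoop/repository/hdfs; use the real path on the server.
BASE="$(mktemp -d)"
mkdir -p "$BASE/tmp" "$BASE/name" "$BASE/data"
ls "$BASE"
```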
(3) Create mapred-site.xml from the template: cp mapred-site.xml.template mapred-site.xml
[root@marklin hadoop]# cp mapred-site.xml.template mapred-site.xml
Edit mapred-site.xml and configure:
    <configuration>
        <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
        </property>
        <property>
            <name>mapred.job.tracker</name>
            <value>hdfs://marklin.com:8021/</value>
        </property>
        <property>
            <name>mapreduce.jobhistory.address</name>
            <value>marklin.com:10020</value>
        </property>
        <property>
            <name>mapreduce.jobhistory.webapp.address</name>
            <value>marklin.com:19888</value>
        </property>
        <property>
            <name>mapreduce.reduce.java.opts</name>
            <value>-Xms2000m -Xmx4600m</value>
        </property>
        <property>
            <name>mapreduce.map.memory.mb</name>
            <value>5120</value>
        </property>
        <property>
            <name>mapreduce.reduce.input.buffer.percent</name>
            <value>0.5</value>
        </property>
        <property>
            <name>mapreduce.reduce.memory.mb</name>
            <value>2048</value>
        </property>
        <property>
            <name>mapred.tasktracker.reduce.tasks.maximum</name>
            <value>2</value>
        </property>
        <property>
            <name>mapred.system.dir</name>
            <value>/usr/local/hadoop/repository/mapreduce/system</value>
            <final>true</final>
        </property>
        <property>
            <name>mapred.local.dir</name>
            <value>/usr/local/hadoop/repository/mapreduce/local</value>
            <final>true</final>
        </property>
    </configuration>
 
(4) Edit yarn-site.xml: vim yarn-site.xml
[root@marklin hadoop]# vim yarn-site.xml
[root@marklin hadoop]#
and configure:
    <configuration>
        <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
        </property>
        <property>
            <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
            <value>org.apache.hadoop.mapred.ShuffleHandler</value>
        </property>
        <property>
            <name>yarn.resourcemanager.hostname</name>
            <value>marklin.com</value>
        </property>
        <property>
            <name>yarn.resourcemanager.address</name>
            <value>${yarn.resourcemanager.hostname}:8032</value>
        </property>
        <property>
            <name>yarn.resourcemanager.scheduler.address</name>
            <value>${yarn.resourcemanager.hostname}:8030</value>
        </property>
        <property>
            <name>yarn.resourcemanager.resource-tracker.address</name>
            <value>${yarn.resourcemanager.hostname}:8031</value>
        </property>
        <property>
            <name>yarn.resourcemanager.admin.address</name>
            <value>${yarn.resourcemanager.hostname}:8033</value>
        </property>
        <property>
            <name>yarn.resourcemanager.webapp.address</name>
            <value>${yarn.resourcemanager.hostname}:8088</value>
        </property>
        <property>
            <name>yarn.nodemanager.resource.memory-mb</name>
            <value>1024</value>
        </property>
        <property>
            <name>yarn.app.mapreduce.am.staging-dir</name>
            <value>/usr/local/hadoop/repository/mapreduce/staging</value>
        </property>
        <property>
            <name>mapreduce.jobhistory.intermediate-done-dir</name>
            <value>${yarn.app.mapreduce.am.staging-dir}/history/done_intermediate</value>
        </property>
        <property>
            <name>mapreduce.jobhistory.done-dir</name>
            <value>${yarn.app.mapreduce.am.staging-dir}/history/done</value>
        </property>
    </configuration>
Note: yarn.nodemanager.resource.memory-mb (1024 MB) caps the total memory a NodeManager can hand out to containers, which is less than the 5120 MB requested by mapreduce.map.memory.mb in mapred-site.xml; on a machine with enough RAM, raise this value or map tasks will not be scheduled.
 
 
【6】In the Hadoop configuration directory [/usr/local/hadoop/hadoop-2.7.5/etc/hadoop],
set JAVA_HOME in hadoop-env.sh, mapred-env.sh, and yarn-env.sh: export JAVA_HOME=/usr/local/java/jdk1.8.0_162
Enter: vim hadoop-env.sh
[root@marklin hadoop]# vim hadoop-env.sh
[root@marklin hadoop]#
export JAVA_HOME=/usr/local/java/jdk1.8.0_162
Enter: vim mapred-env.sh
[root@marklin hadoop]# vim mapred-env.sh
[root@marklin hadoop]#
export JAVA_HOME=/usr/local/java/jdk1.8.0_162
Enter: vim yarn-env.sh
[root@marklin hadoop]# vim yarn-env.sh
[root@marklin hadoop]#
export JAVA_HOME=/usr/local/java/jdk1.8.0_162
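Since all three scripts receive the same line, the export can be appended in a loop instead of editing each file by hand. A sketch using a temporary directory in place of /usr/local/hadoop/hadoop-2.7.5/etc/hadoop:

```shell
# CONF stands in for the real Hadoop conf directory on the server.
CONF="$(mktemp -d)"
touch "$CONF/hadoop-env.sh" "$CONF/mapred-env.sh" "$CONF/yarn-env.sh"
for f in hadoop-env.sh mapred-env.sh yarn-env.sh; do
  echo 'export JAVA_HOME=/usr/local/java/jdk1.8.0_162' >> "$CONF/$f"
done
# Show how many JAVA_HOME lines each script now contains.
grep -c 'JAVA_HOME' "$CONF"/*.sh
```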
 
【7】Open port 50070:
(1) Start the firewall: systemctl start firewalld.service
[root@marklin ~]# systemctl start firewalld.service
[root@marklin ~]#
(2) Open the port: firewall-cmd --zone=public --add-port=50070/tcp --permanent
[root@marklin ~]# firewall-cmd --zone=public --add-port=50070/tcp --permanent
[root@marklin ~]#
(To reach the YARN web UI later, open port 8088 the same way.)
(3) Reload the firewall rules: firewall-cmd --reload
[root@marklin ~]# firewall-cmd --reload
[root@marklin ~]#
(4) Format the NameNode: hdfs namenode -format
[root@marklin ~]# hdfs namenode -format
[root@marklin ~]#
(5) Start Hadoop: start-all.sh
[root@marklin ~]# start-all.sh
[root@marklin ~]#
 
[root@marklin ~]# start-dfs.sh
Starting namenodes on [marklin.com]
marklin.com: starting namenode, logging to /usr/local/hadoop/hadoop-2.7.5/logs/hadoop-root-namenode-marklin.com.out
marklin.com: starting datanode, logging to /usr/local/hadoop/hadoop-2.7.5/logs/hadoop-root-datanode-marklin.com.out
Starting secondary namenodes [marklin.com]
marklin.com: starting secondarynamenode, logging to /usr/local/hadoop/hadoop-2.7.5/logs/hadoop-root-secondarynamenode-marklin.com.out
 
 
[root@marklin ~]# start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop/hadoop-2.7.5/logs/yarn-root-resourcemanager-marklin.com.out
marklin.com: starting nodemanager, logging to /usr/local/hadoop/hadoop-2.7.5/logs/yarn-root-nodemanager-marklin.com.out
 
 
[root@marklin ~]# jps
1122 QuorumPeerMain
6034 Jps
1043 QuorumPeerMain
5413 SecondaryNameNode
5580 ResourceManager
5085 NameNode
5709 NodeManager
5230 DataNode
1119 QuorumPeerMain
 
【8】Open the test URLs:
【1】HDFS web UI: http://192.168.3.4:50070/dfshealth.html#tab-overview
【2】YARN cluster UI: http://192.168.3.4:8088/cluster