Hadoop 3.1.2 Pseudo-Distributed Configuration

1 Configure the Java environment

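  • Check the installed Java version
rpm -qa | grep java
  • Find the Java installation directory
which java
  • Set JAVA_HOME in hadoop-env.sh: copy the result of the previous command, paste it into JAVA_HOME, and delete everything from /bin onward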
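
On many systems, which java prints a symlink such as /usr/bin/java rather than the real JDK path. The following is a minimal sketch of the step above, assuming GNU readlink -f is available; the function name java_home_from_binary is made up for illustration and is not part of Hadoop.

```shell
# Hypothetical helper: derive JAVA_HOME from the full path of the java binary
# by deleting the trailing "/bin/java", as the step above describes.
java_home_from_binary() {
    printf '%s\n' "${1%/bin/java}"
}

# Resolve symlinks first (assumes GNU readlink), then strip the suffix.
if command -v java >/dev/null 2>&1; then
    java_home_from_binary "$(readlink -f "$(command -v java)")"
fi
```

The printed path is the value to put after export JAVA_HOME= in etc/hadoop/hadoop-env.sh.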

2 Configure the four Hadoop modules: common, hdfs, yarn, mapreduce

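  • Configure common: core-site.xml

    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:8020</value>
    </property>
    <!-- configure the temp directory -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/module/hadoop-3.1.2/data/tmp</value>
    </property>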

  • Configure hdfs: hdfs-site.xml
    Set the replication factor
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
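
A quick way to confirm that the edited files contain the intended values is to read a property back. The sketch below is only an illustration: get_property is a hypothetical helper, and it assumes each <name> and <value> element sits on a single line with the name before the value, as in the snippets above. Hadoop's own hdfs getconf -confKey command is the authoritative check once the installation is in place.

```shell
# Hypothetical helper: print the value of one property from a Hadoop-style
# XML file. $1 = config file, $2 = property name.
# Assumes <name> precedes <value> and each element fits on one line.
get_property() {
    awk -v want="$2" '
        /<name>/  { n = $0; sub(/.*<name>/, "", n);  sub(/<\/name>.*/, "", n) }
        /<value>/ { v = $0; sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v)
                    if (n == want) { print v; exit } }
    ' "$1"
}

# Only runs when started from the Hadoop installation directory.
if [ -f etc/hadoop/core-site.xml ]; then
    get_property etc/hadoop/core-site.xml fs.defaultFS
    get_property etc/hadoop/hdfs-site.xml dfs.replication
fi
```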

3 Start the file system

  • Format the file system
bin/hdfs namenode -format
  • Start the namenode, datanode, and secondarynamenode
sbin/start-dfs.sh
jps # list the running processes
9809 SecondaryNameNode
3752 DataNode
4171 Jps
3647 NameNode

netstat -ntlp # show the ports each process is listening on

Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:9864            0.0.0.0:*               LISTEN      3752/java
tcp        0      0 0.0.0.0:9866            0.0.0.0:*               LISTEN      3752/java
tcp        0      0 0.0.0.0:9867            0.0.0.0:*               LISTEN      3752/java
tcp        0      0 0.0.0.0:9868            0.0.0.0:*               LISTEN      3922/java
tcp        0      0 0.0.0.0:9870            0.0.0.0:*               LISTEN      3647/java
tcp        0      0 0.0.0.0:111             0.0.0.0:*               LISTEN      -
tcp        0      0 192.168.1.8:8020        0.0.0.0:*               LISTEN      3647/java
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      -
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      -
tcp        0      0 127.0.0.1:33530         0.0.0.0:*               LISTEN      3752/java
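
The jps listing above can also be checked by a script instead of by eye. This is a sketch under stated assumptions: missing_daemons is a hypothetical helper, and it relies on jps printing one "<pid> <class name>" pair per line as shown above.

```shell
# Hypothetical helper: read jps output on stdin and print every expected
# daemon (given as arguments) that does not appear in it.
missing_daemons() {
    jps_output=$(cat)
    for daemon in "$@"; do
        # Require a space before the name so that "NameNode" does not
        # match the "SecondaryNameNode" line.
        if ! printf '%s\n' "$jps_output" | grep -q " ${daemon}\$"; then
            printf '%s\n' "$daemon"
        fi
    done
}

# Prints nothing when all three HDFS daemons are up.
if command -v jps >/dev/null 2>&1; then
    jps | missing_daemons NameNode DataNode SecondaryNameNode
fi
```

The same function can be called with ResourceManager and NodeManager after yarn has been started.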
  • Check the namenode status
    http://ip:9870/
    [Screenshot: namenode view]
  • Create the user directory (replace <username> with the name of the user running the job)
$ bin/hdfs dfs -mkdir /user
$ bin/hdfs dfs -mkdir /user/<username>
  • Create the input directory on the distributed file system
$ bin/hdfs dfs -put etc/hadoop input
  • Run the example
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.2.jar grep input output 'dfs[a-z.]+'

  • Fetch and view the output

$ bin/hdfs dfs -get output output
$ cat output/*
  • View in the browser: open Browse the file system in the namenode web UI, then download the output files
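
The grep example job counts every string in the input that matches the regular expression 'dfs[a-z.]+'. The sketch below imitates that on the local file system with standard tools, which helps when judging whether the job output looks plausible. It is not the MapReduce implementation, and count_matches is a name chosen here for illustration.

```shell
# Hypothetical helper: read text on stdin, extract every match of the same
# regular expression the example job uses, and print "count<TAB>match"
# sorted by descending count.
count_matches() {
    grep -oE 'dfs[a-z.]+' | sort | uniq -c | sort -rn | awk '{ print $1 "\t" $2 }'
}

# Only runs when started from the Hadoop installation directory.
if [ -d etc/hadoop ]; then
    cat etc/hadoop/*.xml | count_matches
fi
```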

4 Start yarn

  • Configure yarn: yarn-site.xml
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
    </property>
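  • Configure mapreduce: mapred-site.xml
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.application.classpath</name>
        <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
    </property>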
  • Start
sbin/start-yarn.sh
jps # list the running processes
3922 SecondaryNameNode
6276 Jps
3752 DataNode
6056 ResourceManager
3647 NameNode
6191 NodeManager
  • View the cluster management page
    http://192.168.1.8:8088/cluster

    [Screenshot: cluster management page]

  • Run the mapreduce example

  • Error
    Error:/bin/bash: /bin/java: No such file or directory
    /bin/java does not exist, so create it as a symbolic link to the JDK's java binary

sudo ln -s /opt/module/jdk1.8.0_162/bin/java /bin/java
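
Running ln -s a second time fails because the link already exists, so a guarded version can be handy in a setup script. This is a sketch only: ensure_java_link is a hypothetical helper, and the JDK path is the one used in this article.

```shell
# Hypothetical helper: create the symlink only when it is missing.
# $1 = real java binary, $2 = link path (this article uses /bin/java).
ensure_java_link() {
    if [ -e "$2" ] || [ -L "$2" ]; then
        echo "$2 already exists, leaving it alone"
    elif [ -x "$1" ]; then
        ln -s "$1" "$2"
    else
        echo "java binary not found at $1" >&2
        return 1
    fi
}

# Example call (needs root because it writes to /bin):
# ensure_java_link /opt/module/jdk1.8.0_162/bin/java /bin/java
```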
  • Shut down
sbin/stop-dfs.sh
sbin/stop-yarn.sh

5 Configuration

  • Specify the namenode host: core-site.xml
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop-yarn.cloudyhadoop.com:8020</value>
    </property>
  • Specify the datanode hosts: workers (etc/hadoop/workers)
  • Specify the secondarynamenode host: hdfs-site.xml

  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>0.0.0.0:9868</value>
    <description>
      The secondary namenode http server address and port.
    </description>
  </property>

  • Specify the resourcemanager hostname: yarn-site.xml
  <property>
    <description>The hostname of the RM.</description>
    <name>yarn.resourcemanager.hostname</name>
    <value>0.0.0.0</value>
  </property>
  • Specify the nodemanager hostname: yarn-site.xml
  <property>
    <description>The hostname of the NM.</description>
    <name>yarn.nodemanager.hostname</name>
    <value>0.0.0.0</value>
  </property>
  • Specify the historyserver host: mapred-site.xml
  <property>
    <name>mapreduce.jobhistory.admin.address</name>
    <value>0.0.0.0:10033</value>
    <description>The address of the History server admin interface.</description>
  </property>
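
When fs.defaultFS names a real host instead of localhost, that name has to resolve on the machine. The sketch below pulls the host part out of the URI so it can be passed to a resolver. host_from_uri is a hypothetical helper, and the getent call is left commented out because it assumes a Linux system with getent installed.

```shell
# Hypothetical helper: print the host part of a URI such as
# hdfs://hadoop-yarn.cloudyhadoop.com:8020
host_from_uri() {
    rest=${1#*://}      # drop the scheme, e.g. "hdfs://"
    rest=${rest%%/*}    # drop any path after the authority
    printf '%s\n' "${rest%%:*}"   # drop the ":port" suffix
}

host_from_uri hdfs://hadoop-yarn.cloudyhadoop.com:8020

# Possible follow-up check (assumes getent is available):
# getent hosts "$(host_from_uri hdfs://hadoop-yarn.cloudyhadoop.com:8020)"
```

If the name does not resolve, an entry in /etc/hosts mapping it to the machine's IP address is the usual fix on a single-node setup.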
