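The previous articles covered how to compile hadoop-2.2.0 on 64-bit Linux and how to deploy a ZooKeeper cluster. This one covers deploying the Hadoop cluster itself. Since plenty of material already exists on the topic, only the main steps are listed here: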

1: Host mapping

centos01    192.168.6.180 (master)
centos02    192.168.6.181 (slave)
centos03    192.168.6.182 (slave)
centos04    192.168.6.183 (slave)
centos05    192.168.6.184 (slave)
centos06    192.168.6.185 (slave)
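
The configuration files refer to centos01 by name and the slaves file lists centos02 to centos06, so every node has to resolve these hostnames. One common way, assuming no DNS is available, is to add the mapping to /etc/hosts on each machine. A minimal sketch that prints the entries (the function name print_cluster_hosts is illustrative, not part of Hadoop):

```shell
#!/bin/sh
# Print /etc/hosts entries for the six nodes in the host mapping.
# Relies on the IPs being consecutive: centos01 = 192.168.6.180 ... centos06 = 192.168.6.185.
print_cluster_hosts() {
    i=1
    while [ "$i" -le 6 ]; do
        printf '192.168.6.%d centos0%d\n' "$((179 + i))" "$i"
        i=$((i + 1))
    done
}

print_cluster_hosts
# To apply (as root, on every node): print_cluster_hosts >> /etc/hosts
```

Printing first and appending by hand keeps the step reviewable before /etc/hosts is modified.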

2: Main configuration files:

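  • core-site.xml

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.6.180:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/soft/hadoop/hadoop-2.2.0/tmp</value>
  </property>
</configuration>
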
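  • hdfs-site.xml

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/data/soft/hadoop/hadoop-2.2.0/nddir</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/data/soft/hadoop/hadoop-2.2.0/dddir</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>
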
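  • mapred-site.xml

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>centos01:8021</value>
    <final>true</final>
    <description>The host and port that the MapReduce JobTracker runs at.</description>
  </property>
  <property>
    <name>mapreduce.cluster.temp.dir</name>
    <value></value>
    <description>No description</description>
    <final>true</final>
  </property>
  <property>
    <name>mapreduce.cluster.local.dir</name>
    <value></value>
    <description>No description</description>
    <final>true</final>
  </property>
</configuration>
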
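  • yarn-site.xml

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <!-- Pre-2.2 service name, shown for reference only: Hadoop 2.2.0 accepts
       mapreduce_shuffle, not mapreduce.shuffle.
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce.shuffle</value>
  </property>
  -->
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>centos01:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>centos01:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>centos01:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>centos01:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>centos01:8088</value>
  </property>
</configuration>
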
  • slaves

centos02
centos03
centos04
centos05
centos06

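3: Preparation

  • Set up passwordless SSH from the master to every node (ssh-keygen -t rsa  /  ssh-copy-id centos0x)

  • Configure the Java environment variables

  • Edit the configuration files listed above

  • Distribute the installation to the slaves: scp -r xxx centos0x:/xxx

  • Format the namenode: hadoop namenode -format

  • Start the cluster: sbin/start-all.sh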
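
The SSH, distribution, format and start steps can be strung together roughly as follows. This is a sketch under stated assumptions, not a definitive procedure: HADOOP_HOME is taken from the paths in the configuration files, the parent directory is assumed to exist on every slave, and the helper names run and prepare_cluster are made up for illustration. By default it only prints the commands it would execute.

```shell
#!/bin/sh
# Sketch of the preparation steps, run on the master (centos01).
# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0 to execute.
HADOOP_HOME=/data/soft/hadoop/hadoop-2.2.0
SLAVES="centos02 centos03 centos04 centos05 centos06"
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

prepare_cluster() {
    # Generate a key pair only if the master does not have one yet.
    [ -f "$HOME/.ssh/id_rsa" ] || run ssh-keygen -t rsa

    # Install the public key on every node; the master is included
    # because the start scripts also ssh to the master itself.
    for host in centos01 $SLAVES; do
        run ssh-copy-id "$host"
    done

    # Copy the configured installation to the same path on each slave.
    for host in $SLAVES; do
        run scp -r "$HADOOP_HOME" "${host}:$(dirname "$HADOOP_HOME")/"
    done

    # Format the namenode once, then start HDFS and YARN.
    run "$HADOOP_HOME/bin/hadoop" namenode -format
    run "$HADOOP_HOME/sbin/start-all.sh"
}

prepare_cluster
```

Reviewing the dry-run output before setting DRY_RUN=0 is worthwhile, since formatting the namenode wipes any existing HDFS metadata.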


4: Wrapping up

  • Upload a file: hadoop fs -put /etc/profile /data

  • Run the test jar:

    hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount /data /out

  • View the result in the NameNode web UI: http://192.168.6.180:50070

  • View the cluster (ResourceManager) web UI: http://192.168.6.180:8088
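
The job output can also be inspected from the command line. The wordcount example writes one "word<TAB>count" pair per line into part-r-* files under the output directory, so a small filter is enough to list the most frequent words. A minimal sketch (top_words is a hypothetical helper name, not a Hadoop command):

```shell
#!/bin/sh
# Sort wordcount output (one "word<TAB>count" pair per line) by count,
# highest first, and keep the first N lines (default 10).
top_words() {
    sort -t "$(printf '\t')" -k2,2nr | head -n "${1:-10}"
}

# Usage on the cluster (needs the hadoop command on PATH):
#   hadoop fs -cat '/out/part-r-*' | top_words 5
```

The glob is quoted so that it is expanded by HDFS, not by the local shell.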