Big Data Tutorial Series: Hadoop Cluster Installation (with HA for HDFS and ResourceManager)

5 Hadoop cluster installation (with HA for HDFS and ResourceManager)

Note: only the Hadoop cluster configuration files are listed here. The Hadoop environment variables are not shown; configure them yourself in /etc/profile.
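As a rough sketch of what those /etc/profile entries might look like: JAVA_HOME below is the JDK path used in the hadoop-env file of section 5.2, while the HADOOP_HOME location is an assumption inferred from the data directories in section 5.3 and should be adjusted to the actual install path.

```shell
# JDK path taken from section 5.2 of this tutorial.
export JAVA_HOME=/home/hadoop/cluster/jdk1.7.0_67
# Assumed install directory (not stated in the tutorial); adjust as needed.
export HADOOP_HOME=/home/hadoop/hadoop230chd501
export HADOOP_CONF_DIR="$HADOOP_HOME/etc/hadoop"
# Put the hadoop/hdfs/yarn commands and the *-daemon.sh scripts on the PATH.
export PATH="$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
```

After editing /etc/profile, running `source /etc/profile` applies the variables to the current shell.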


5.1 core-site.xml configuration file

 

 

<configuration>

  <property>

    <name>fs.defaultFS</name>

    <value>hdfs://hadoopCluster</value>

  </property>

 

  <property>

    <name>hadoop.proxyuser.httpfs.hosts</name>

    <value>*</value>

  </property>

 

  <property>

    <name>hadoop.proxyuser.httpfs.groups</name>

    <value>*</value>

  </property>

</configuration>

 

5.2 hadoop-env.sh configuration file

export JAVA_HOME=/home/hadoop/cluster/jdk1.7.0_67

 

 

 

5.3 hdfs-site.xml configuration file

<configuration>

 

  <property>

    <name>dfs.nameservices</name>

    <value>hadoopCluster</value>

  </property>

 

  <property>

    <name>dfs.ha.namenodes.hadoopCluster</name>

    <value>master,masterHA</value>

  </property>

 

  <property>

    <name>dfs.namenode.rpc-address.hadoopCluster.master</name>

    <value>master:8020</value>

  </property>

 

  <property>

    <name>dfs.namenode.rpc-address.hadoopCluster.masterHA</name>

    <value>masterHA:8020</value>

  </property>

 

  <property>

    <name>dfs.namenode.http-address.hadoopCluster.master</name>

    <value>master:50070</value>

  </property>

 

  <property>

    <name>dfs.namenode.http-address.hadoopCluster.masterHA</name>

    <value>masterHA:50070</value>

  </property>

 

  <property>

    <name>dfs.namenode.shared.edits.dir</name>

    <value>qjournal://master:8485;masterHA:8485;node1:8485/hadoopCluster</value>

  </property>

 

  <property>

    <name>dfs.journalnode.edits.dir</name>

    <value>/mnt/hgfs/datadir/</value>

  </property>

 

  <property>

    <name>dfs.client.failover.proxy.provider.hadoopCluster</name>

    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>

  </property>

 

  <property>

    <name>dfs.ha.fencing.methods</name>

    <value>sshfence</value>

  </property>

 

  <property>

    <name>dfs.ha.fencing.ssh.private-key-files</name>

    <value>/home/hadoop/.ssh/id_rsa</value>

  </property>

 

  <property>

    <name>dfs.ha.automatic-failover.enabled</name>

    <value>true</value>

  </property>

 

  <property>

    <name>ha.zookeeper.quorum</name>

    <value>zookeeper1:2181,zookeeper2:2181,zookeeper3:2181</value>

 

  </property>

 

  <property>

     <name>dfs.namenode.name.dir</name>

     <value>/home/hadoop/hadoop230chd501/tmp</value>

  </property>

 

  <property>

    <name>dfs.datanode.data.dir</name>

    <value>/home/hadoop/hadoop230chd501/data</value>

  </property>

 

  <property>

    <name>dfs.webhdfs.enabled</name>

    <value>true</value>

  </property>

 

  <property>

    <name>dfs.permissions.superusergroup</name>

    <value>hadoop</value>

  </property>

 

</configuration>
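One way to confirm that Hadoop actually picks up these HA settings is to query the effective values with `hdfs getconf -confKey`. The helper below is only a sketch; the function name `show_ha_config` is made up for illustration, and the key list is a small selection from the files above.

```shell
# Print the effective value of a few HA-related keys as Hadoop resolves them.
show_ha_config() {
  for key in fs.defaultFS dfs.nameservices \
             dfs.ha.namenodes.hadoopCluster dfs.namenode.shared.edits.dir; do
    # `hdfs getconf -confKey <key>` prints the value read from the config files.
    echo "$key = $(hdfs getconf -confKey "$key")"
  done
}
```

Calling `show_ha_config` on any node with the configuration in place should list `hdfs://hadoopCluster` for fs.defaultFS and `master,masterHA` for the NameNode IDs.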

 

5.4 mapred-site.xml configuration file

<configuration>

 

  <property>

    <name>mapreduce.framework.name</name>

    <value>yarn</value>

  </property>

 

</configuration>

 

5.5 yarn-site.xml configuration file

<configuration>

 

<!-- Site specific YARN configuration properties -->

 

  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>

  <!--
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>
  -->

  <property>
    <name>yarn.resourcemanager.connect.retry-interval.ms</name>
    <value>2000</value>
  </property>

  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>

  <property>
    <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>

  <property>
    <name>yarn.resourcemanager.ha.automatic-failover.embedded</name>
    <value>true</value>
  </property>

  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yarn-cluster</value>
  </property>

  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>master,masterHA</value>
  </property>

  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
  </property>

  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>

  <property>
    <name>yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms</name>
    <value>5000</value>
  </property>

  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>

  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>zookeeper1:2181,zookeeper2:2181,zookeeper3:2181</value>
  </property>

  <property>
    <name>yarn.resourcemanager.zk.state-store.address</name>
    <value>zookeeper1:2181,zookeeper2:2181,zookeeper3:2181</value>
  </property>

  <property>
    <name>yarn.resourcemanager.address.master</name>
    <value>master:23140</value>
  </property>

  <property>
    <name>yarn.resourcemanager.address.masterHA</name>
    <value>masterHA:23140</value>
  </property>

  <property>
    <name>yarn.resourcemanager.scheduler.address.master</name>
    <value>master:23130</value>
  </property>

  <property>
    <name>yarn.resourcemanager.scheduler.address.masterHA</name>
    <value>masterHA:23130</value>
  </property>

  <property>
    <name>yarn.resourcemanager.admin.address.master</name>
    <value>master:23141</value>
  </property>

  <property>
    <name>yarn.resourcemanager.admin.address.masterHA</name>
    <value>masterHA:23141</value>
  </property>

  <property>
    <name>yarn.resourcemanager.resource-tracker.address.master</name>
    <value>master:23125</value>
  </property>

  <property>
    <name>yarn.resourcemanager.resource-tracker.address.masterHA</name>
    <value>masterHA:23125</value>
  </property>

  <property>
    <name>yarn.resourcemanager.webapp.address.master</name>
    <value>master:23188</value>
  </property>

  <property>
    <name>yarn.resourcemanager.webapp.address.masterHA</name>
    <value>masterHA:23188</value>
  </property>

  <property>
    <name>yarn.resourcemanager.webapp.https.address.master</name>
    <value>master:23189</value>
  </property>

  <property>
    <name>yarn.resourcemanager.webapp.https.address.masterHA</name>
    <value>masterHA:23189</value>
  </property>

 

  <property>

    <name>yarn.log.aggregation.enable</name>

    <value>true</value>

  </property>

 

  <property>

    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>

    <value>org.apache.hadoop.mapred.ShuffleHandler</value>

  </property>

 

 

</configuration>

5.6 slaves file

node1

node2

node3

5.7 Starting Hadoop

Initialization (ZooKeeper must already be running):

hdfs zkfc -formatZK

Start the JournalNodes (use an odd number of them):

hadoop-daemon.sh start journalnode

On the active master, switch to the hdfs user and format the NameNode:

hadoop namenode -format

Initialize the shared edits directory:

hdfs namenode -initializeSharedEdits

Start the NameNode:

hadoop-daemon.sh start namenode

On the standby master:

hdfs namenode -bootstrapStandby

hadoop-daemon.sh start namenode

Start the DataNodes:

hadoop-daemon.sh start datanode

Start ZKFC (on both masters):

hadoop-daemon.sh start zkfc

Start YARN:

Active: ./start-yarn.sh

Standby: ./yarn-daemon.sh start resourcemanager

 

To start the whole cluster:

./start-all.sh

Then, on the masterHA machine:

./yarn-daemon.sh start resourcemanager
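Once everything is running, it can be worth checking which master is active. The sketch below wraps `hdfs haadmin -getServiceState` and `yarn rmadmin -getServiceState`; the function name `print_ha_states` is hypothetical, and the IDs `master` and `masterHA` are the ones configured in dfs.ha.namenodes.hadoopCluster and yarn.resourcemanager.ha.rm-ids.

```shell
# Report the HA state of both NameNodes and both ResourceManagers.
print_ha_states() {
  for id in master masterHA; do
    # Each command prints "active" or "standby" for the given ID;
    # a non-zero exit status is reported here as "unreachable".
    nn_state=$(hdfs haadmin -getServiceState "$id" 2>/dev/null) || nn_state="unreachable"
    rm_state=$(yarn rmadmin -getServiceState "$id" 2>/dev/null) || rm_state="unreachable"
    echo "$id: namenode=$nn_state resourcemanager=$rm_state"
  done
}
```

With automatic failover enabled, exactly one NameNode and one ResourceManager would be expected to report `active`.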

 

5.8 Adding a DataNode

Add the hostname of the new DataNode to the slaves file.

On the new node, run ./hadoop-daemon.sh start datanode.

Finally, run the balancer on the cluster.
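The slaves-file update and the balancing step could be scripted roughly as follows. Both function names are hypothetical helpers, and the default threshold of 10 percent is simply the balancer's usual default, not a value from this tutorial.

```shell
# Append a hostname to the slaves file unless it is already listed.
add_slave() {
  host=$1
  slaves_file=$2
  grep -qxF "$host" "$slaves_file" 2>/dev/null || echo "$host" >> "$slaves_file"
}

# Run the HDFS balancer; the threshold is the allowed deviation (in percent)
# of each DataNode's disk usage from the cluster average.
rebalance() {
  hdfs balancer -threshold "${1:-10}"
}
```

For example, `add_slave node4 "$HADOOP_CONF_DIR/slaves"` followed by `rebalance` after the new DataNode has started, where `node4` is a placeholder hostname.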

 

5.9 Removing a DataNode

 

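The environment is Hadoop 2.2 with NameNode HA and three DataNodes.
The goal is to remove one DataNode from the cluster.
The following files were configured:
1- Add the following entry to hdfs-site.xml:
<property>
  <name>dfs.hosts.exclude</name>
  <value>/usr/local/hadoop-2.2.0/etc/hadoop/exclude</value>
</property>
2- Create the exclude file, containing the hostname of the node to remove:
hsrv03
3- scp hdfs-site.xml and the exclude file to the corresponding directory on every machine.
4- On the NameNode, run hdfs dfsadmin -refreshNodes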
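To follow the decommissioning progress, the cluster report can be filtered down to the relevant lines. This is a sketch only: the function name `decommission_status` is made up, and it assumes the report prints `Name:` and `Decommission Status :` lines per DataNode, as Hadoop 2.x does.

```shell
# Show each DataNode's name and decommission status from the cluster report.
decommission_status() {
  hdfs dfsadmin -report | grep -E '^(Name|Decommission Status)'
}
```

Once the status of the excluded host changes from `Decommission in progress` to `Decommissioned`, the DataNode process on that host can be stopped.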

 
