Hadoop 2.2.0 (YARN) Build and Deployment Guide

Created on 2014-3-30
URL : http://www.cnblogs.com/zhxfl/p/3633919.html
@author: zhxfl

 

 

Building Hadoop-2.2

The Hadoop-2.2 release ships only 32-bit native libraries, so on a 64-bit operating system you need to rebuild from source.

Installing Maven

Install Maven and find its installation directory:

sudo apt-get install maven

find /usr -name "*maven*"

Add the environment variables according to the installation directory:

export M2_HOME=/usr/share/maven

export PATH=$PATH:$M2_HOME/bin

export MAVEN_OPTS="-Xms256m -Xmx512m"

Installing Google protobuf

wget https://protobuf.googlecode.com/files/protobuf-2.5.0.tar.gz

tar -xzf protobuf-2.5.0.tar.gz && cd protobuf-2.5.0

./configure --prefix=/usr/local/protobuf

sudo make && sudo make install

sudo vim /etc/ld.so.conf   (add the line /usr/local/protobuf/lib)

sudo ldconfig
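After installation it is worth confirming that the expected protobuf version is the one on the PATH, since the Hadoop 2.2 build requires protoc 2.5.0 exactly. A minimal sketch; `check_protoc` and the `PROTOC` override are illustrative helpers, not part of Hadoop or protobuf:

```shell
# Hypothetical helper: verify that protoc reports the version Hadoop 2.2 expects.
# PROTOC lets you point at a specific binary; it defaults to the one on PATH.
check_protoc() {
  want="$1"
  got=$("${PROTOC:-protoc}" --version 2>/dev/null | awk '{print $2}')
  [ "$got" = "$want" ]
}
# check_protoc 2.5.0 && echo "protoc OK"
```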

Installing CMake

sudo apt-get install cmake

Installing dependency libraries

sudo apt-get install libglib2.0-dev libssl-dev

There is a bug in pom.xml; apply the patch below.

See https://issues.apache.org/jira/browse/HADOOP-10110

 

Index: hadoop-common-project/hadoop-auth/pom.xml
===================================================================
--- hadoop-common-project/hadoop-auth/pom.xml     (revision 1543124)
+++ hadoop-common-project/hadoop-auth/pom.xml   (working copy)
@@ -54,6 +54,11 @@
     </dependency>
     <dependency>
       <groupId>org.mortbay.jetty</groupId>
+      <artifactId>jetty-util</artifactId>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>org.mortbay.jetty</groupId>
       <artifactId>jetty</artifactId>
       <scope>test</scope>
     </dependency>

Start the build:

mvn package -Pdist,native -DskipTests -Dtar 

 

Common errors

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.6:run (make) on project hadoop-hdfs: An Ant BuildException has occured: exec returned: 1 -> [Help 1]

Fix: install libglib2.0-dev.

 

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.6:run (make) on project hadoop-pipes: An Ant BuildException has occured: exec returned: 1 -> [Help 1]

Fix: install libssl-dev.

 

[ERROR] /home/yarn/hadoop-2.2.0-src/hadoop-common-project/hadoop-auth/src/test/java/org/apache/hadoop/security/authentication/client/AuthenticatorTestCase.java:[86,13] cannot access org.mortbay.component.AbstractLifeCycle

 

Fix: see https://issues.apache.org/jira/browse/HADOOP-10110 (the pom.xml patch above).

 

When the build finishes, the hadoop-2.2.0 directory under ~/hadoop-2.2.0-src/hadoop-dist/target is the compiled distribution.
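To double-check that the native libraries really came out 64-bit, you can inspect them with `file`. A sketch; `arch_of` is a made-up helper, and the `FILE_CMD` override exists only so the command can be substituted:

```shell
# Hypothetical helper: print the word size (e.g. "64-bit") of a compiled native library.
arch_of() {
  "${FILE_CMD:-file}" -b "$1" | grep -o '[0-9]*-bit'
}
# Example, using the build output path from this guide:
# arch_of ~/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/lib/native/libhadoop.so.1.0.0
```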

 

Hadoop-2.2 environment configuration

Adding users

Add a yarn user on every node.

 

Add the user:

sudo adduser yarn

 

Add the user to the hadoop group (if there is no hadoop group yet, create it first):

sudo gpasswd -a yarn hadoop

 

Give the yarn user sudo privileges:

sudo vim /etc/sudoers (or, more safely, sudo visudo)

 

Add the following line:

yarn ALL=(ALL:ALL) ALL

 

SSH configuration

On the master

sudo apt-get install openssh-server

ssh-keygen (press Enter at every prompt)

This creates id_rsa (the private key) and id_rsa.pub (the public key) under ~/.ssh.

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

Run ssh localhost to confirm that no password is required.

 

Copy authorized_keys to the slave1~slave3 nodes:

 

scp authorized_keys yarn@slave1:~/.ssh/

scp authorized_keys yarn@slave2:~/.ssh/

scp authorized_keys yarn@slave3:~/.ssh/

 


On the slaves

Run ssh-keygen on every slave as well.

Finally, test from the master with ssh slave1 and so on.
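The per-slave copies above can be wrapped in a loop. A sketch; `push_keys` is a made-up helper, and the `SCP` variable exists only so the copy command can be substituted:

```shell
# Hypothetical helper: push the master's authorized_keys to each slave's ~/.ssh.
# Assumes the yarn user and its ~/.ssh directory already exist on every slave.
push_keys() {
  cmd="${SCP:-scp}"
  for host in "$@"; do
    $cmd ~/.ssh/authorized_keys "yarn@$host:~/.ssh/" || return 1
  done
}
# push_keys slave1 slave2 slave3
```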

 

Other

Every node needs the following host entries:

sudo vim /etc/hosts

219.219.216.48 master

219.219.216.47 slave1

219.219.216.45 slave2

219.219.216.46 slave3

Reference

http://dongxicheng.org/mapreduce-nextgen/hadoop-yarn-install/

 

Configuration files

hadoop-env.sh

Add the JAVA_HOME environment variable:

export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64/

core-site.xml

 

<configuration>

        <property>

                <name>fs.defaultFS</name>

                <value>hdfs://master:8020</value>

                <final>true</final>

        </property>

        <property>

                <name>dfs.replication</name>

                <value>2</value>

        </property>

        <property>

                <name>hadoop.tmp.dir</name>

                <value>/home/yarn/hadoop-files/tmp</value>

        </property>

</configuration>

hdfs-site.xml

<configuration>

        <property>

                <name>dfs.namenode.name.dir</name>

                <value>/home/yarn/hadoop-files/name</value>

        </property>

        <property>

                <name>dfs.datanode.data.dir</name>

                <value>/home/yarn/hadoop-files/dfs/data</value>

        </property>

        <property>

                <name>dfs.replication</name>

                <value>2</value>

        </property>

        <property>

                <name>hadoop.tmp.dir</name>

                <value>/home/yarn/hadoop-files/tmp/</value>

                <description>A base for other temporary directories.</description>

        </property>

</configuration>

Adjust the directory paths above to match your actual layout.
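The local paths referenced in the two files above must exist on each node before HDFS is formatted. A small sketch; `make_dirs` is a made-up helper, and the base path is the one used in this guide:

```shell
# Hypothetical helper: create the local directories used by the configs above
# under a given base path (this guide uses /home/yarn/hadoop-files).
make_dirs() {
  base="$1"
  mkdir -p "$base/tmp" "$base/name" "$base/dfs/data"
}
# make_dirs /home/yarn/hadoop-files
```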

 

mapred-site.xml

<configuration>

        <property>

                <name>mapreduce.framework.name</name>

                <value>yarn</value>

        </property>

        <property>

                <name>mapred.child.java.opts</name>

                <value>-Xmx1024m</value>

        </property>

</configuration>

yarn-site.xml

<configuration>

   <property>

     <name>yarn.nodemanager.aux-services</name>

     <value>mapreduce_shuffle</value>

  </property>

  <property>

     <name>yarn.resourcemanager.address</name>

     <value>master:8032</value>

  </property>

  <property>

      <name>yarn.resourcemanager.resource-tracker.address</name>

      <value>master:8031</value>

  </property>

  <property>

      <name>yarn.resourcemanager.admin.address</name>

      <value>master:8033</value>

  </property>

  <property>

      <name>yarn.resourcemanager.scheduler.address</name>

      <value>master:8030</value>

  </property>



  <property>

      <name>yarn.web-proxy.address</name>

      <value>master:8888</value>

  </property>

  <property>

     <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>

     <value>org.apache.hadoop.mapred.ShuffleHandler</value>

  </property>

  <property>

     <name>yarn.nodemanager.local-dirs</name>

     <value>/home/yarn/hadoop-2.2.0/hadoop-files/hadoop-local-dirs/</value>

     <final>true</final>

  </property>

</configuration>

 

 

Other environment variables you may need to set

export M2_HOME=/usr/share/maven

export PATH=$PATH:$M2_HOME/bin:~/hadoop-2.2.0/bin

export MAVEN_OPTS="-Xms2048m -Xmx2048m"

export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64/

export HADOOP_HOME="/home/yarn/hadoop-2.2.0"

export HADOOP_PREFIX="/home/yarn/hadoop-2.2.0"

export YARN_HOME=$HADOOP_HOME

export HADOOP_MAPRED_HOME=$HADOOP_HOME

export HADOOP_COMMON_HOME=$HADOOP_HOME

export HADOOP_HDFS_HOME=$HADOOP_HOME

export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop/

export YARN_CONF_DIR=$HADOOP_CONF_DIR

export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

export SCALA_HOME=/usr/share/scala/

export PATH=$SCALA_HOME/bin/:$PATH
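Before starting the daemons it is worth checking that the Hadoop-related variables actually got set in the current shell. A sketch; `check_env` is a made-up helper, not a Hadoop tool:

```shell
# Hypothetical helper: report which of the given environment variables are unset.
# Returns non-zero if any are missing.
check_env() {
  missing=0
  for name in "$@"; do
    eval "val=\${$name:-}"
    if [ -z "$val" ]; then
      echo "missing: $name"
      missing=1
    fi
  done
  return $missing
}
# check_env JAVA_HOME HADOOP_HOME HADOOP_CONF_DIR YARN_CONF_DIR
```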

 

Testing

Change the group ownership of files in HDFS:

./hdfs dfs -mkdir /yarn

./hdfs dfs -chgrp -R yarn /yarn

The /yarn directory now belongs to the yarn group.

For build and job-submission errors, see:

http://www.cnblogs.com/lucius/p/3435296.html

 

 
