Hadoop Setup

Hadoop Installation

  • Install SSH

    To install OpenSSH, both the client and the server need to be installed;

       the command is:

sudo apt-get install openssh-server

     However, an error about a missing client may appear during installation, such as:

 openssh-server : Depends: openssh-client (= 1:6.6p1-2ubuntu1)

     In that case, install the client with:

sudo apt-get install openssh-client=1:6.6p1-2ubuntu1

     Then install ssh:

sudo apt-get install ssh
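
     As an optional check, print the OpenSSH version and make sure the sshd daemon is running:

ssh -V
ps -e | grep sshd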

  • Configure passwordless login

    Go to the .ssh directory (/home/××/.ssh); if it does not exist, create it manually.

    Enter the following and press Enter at every prompt:

ssh-keygen -t rsa

    Append id_rsa.pub to the authorized_keys file:

cat id_rsa.pub >> authorized_keys

    Restart SSH so the configuration takes effect:

service ssh restart

    Verify that it works:

ssh localhost; exit

    The ssh to localhost in the previous step should not prompt for a password; if it still does, fix the permissions on .ssh:

chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
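
     As an alternative to appending the key and fixing permissions by hand, the ssh-copy-id tool that ships with OpenSSH does both in one step, for example:

ssh-copy-id -i ~/.ssh/id_rsa.pub localhost
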
  • Disable the firewall

    Check the firewall status, then disable it:

sudo ufw status
sudo ufw disable   # disables the firewall and keeps it disabled across reboots
  • Configure root login for SecureCRT
vi /etc/ssh/sshd_config   # edit this file
cat /etc/ssh/sshd_config | grep PermitRootLogin   # check whether root login is allowed
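
     If root login is not allowed, the following change usually enables it (a sketch; newer OpenSSH releases ship a more restrictive default value):

PermitRootLogin yes   # line inside /etc/ssh/sshd_config
service ssh restart   # then restart SSH to apply it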


  • Set the hosts file and hostname on each of the three machines
/etc/hosts and /etc/hostname
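
     For example, /etc/hosts on every machine might look like the sketch below (the IP addresses are placeholders; use the real addresses of your nodes, and put the matching name, e.g. master, slave01 or slave02, into each machine's /etc/hostname):

192.168.1.100   master
192.168.1.101   slave01
192.168.1.102   slave02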


  • Create the namenode and datanode directories:
mkdir -p ~/dfs/name
mkdir -p ~/dfs/data
mkdir -p ~/tmp   # used by hadoop.tmp.dir in core-site.xml


  • Extract the Hadoop archive
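     For example, assuming the archive is hadoop-2.2.0.tar.gz and sits in the home directory (adjust the path to wherever you downloaded it):

tar -zxvf ~/hadoop-2.2.0.tar.gz -C ~/
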
  • Configure environment variables in hadoop-2.2.0/etc/hadoop/hadoop-env.sh
Set export JAVA_HOME=/home/vmworker01/software/jdk1.7.0_79


  • Configure YARN environment variables in hadoop-2.2.0/etc/hadoop/yarn-env.sh
Set export JAVA_HOME=/home/vmworker01/software/jdk1.7.0_79


  • Configure hadoop-2.2.0/etc/hadoop/slaves (this file lists all the slave nodes)
slave01
slave02


  • Configure hadoop-2.2.0/etc/hadoop/core-site.xml
<configuration>
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://master:9000</value>
        </property>
        <property>
                <name>io.file.buffer.size</name>
                <value>131072</value>
        </property>
        <property>
                <name>hadoop.tmp.dir</name>
                <value>file:/home/vmworker01/tmp</value>
                <description>A base for other temporary directories.</description>
        </property>
        <property>
                <name>hadoop.proxyuser.vmworker01.hosts</name>
                <value>*</value>
        </property>
        <property>
                <name>hadoop.proxyuser.vmworker01.groups</name>
                <value>*</value>
        </property>
</configuration>


  • ~/hadoop-2.2.0/etc/hadoop/hdfs-site.xml
<configuration>
        <property>
                <name>dfs.namenode.secondary.http-address</name>
                <value>master:9001</value>
        </property>
        <property>
                <name>dfs.namenode.name.dir</name>
                <value>file:/home/vmworker01/dfs/name</value>
        </property>
        <property>
                <name>dfs.datanode.data.dir</name>
                <value>file:/home/vmworker01/dfs/data</value>
        </property>
        <property>
                <name>dfs.replication</name>
                <value>2</value>
        </property>
        <property>
                <name>dfs.webhdfs.enabled</name>
                <value>true</value>
        </property>
</configuration>


  • ~/hadoop-2.2.0/etc/hadoop/mapred-site.xml
<configuration>
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
        </property>
        <property>
                <name>mapreduce.jobhistory.address</name>
                <value>master:10020</value>
        </property>
        <property>
                <name>mapreduce.jobhistory.webapp.address</name>
                <value>master:19888</value>
        </property>
</configuration>


  • ~/hadoop-2.2.0/etc/hadoop/yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
        </property>
        <property>
                <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
                <value>org.apache.hadoop.mapred.ShuffleHandler</value>
        </property>
        <property>
                <name>yarn.resourcemanager.address</name>
                <value>master:8032</value>
        </property>
        <property>
                <name>yarn.resourcemanager.scheduler.address</name>
                <value>master:8030</value>
        </property>
        <property>
                <name>yarn.resourcemanager.resource-tracker.address</name>
                <value>master:8031</value>
        </property>
        <property>
                <name>yarn.resourcemanager.admin.address</name>
                <value>master:8033</value>
        </property>
        <property>
                <name>yarn.resourcemanager.webapp.address</name>
                <value>master:8088</value>
        </property>
</configuration>
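
     Every node needs the same Hadoop directory and configuration. Assuming identical paths and the vmworker01 user on all three machines, the configured directory can be copied to the slaves, for example:

scp -r ~/hadoop-2.2.0 vmworker01@slave01:~/
scp -r ~/hadoop-2.2.0 vmworker01@slave02:~/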


  • Start and verify

        Format HDFS:

./bin/hdfs namenode -format   # run from the Hadoop directory; skip this step if the command is not available

    From the Hadoop directory, go into the sbin directory and run start-all.sh:

./start-all.sh

    Check the Hadoop processes with jps. At this point the master node should show: NameNode, SecondaryNameNode, ResourceManager,

    and the slave nodes should show: DataNode, NodeManager.
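
     A rough sketch of what jps might print on the master (the process IDs here are made up; yours will differ):

3201 NameNode
3412 SecondaryNameNode
3598 ResourceManager
3876 Jps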
