Hadoop Cluster Configuration

Hadoop deployment:

Login user: suse
HOSTNAME: server0

4 machines:
192.168.2.10  server0(namenode)
192.168.2.11  server1(datanode)
192.168.2.12  server2(datanode)
192.168.2.13  server3(datanode)

1. First of all, make sure that any two machines in the cluster can reach each other by both IP address and hostname (a loop to check all hosts at once follows below):
  ping  IP
  ping  hostname
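 
  A quick way to check every host from the current machine; a minimal sketch, assuming the four hostnames from the table above:
 
    for h in server0 server1 server2 server3; do
        ping -c 1 "$h" > /dev/null && echo "$h reachable" || echo "$h FAILED"
    done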
 
2. Edit the /etc/hosts file on server0:
  vi /etc/hosts  and add the following below the localhost line (the exact position does not matter):
 
  127.0.0.1       localhost
  #hadoop master
  192.168.2.10    server0
  192.168.2.11    server1
  192.168.2.12    server2
  192.168.2.13    server3
 
  ** Also change this machine's hostname to server0 (it may default to localhost); see the sketch below.
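 
  One way to make the hostname change persistent on SUSE; a minimal sketch, assuming a classic SUSE layout where /etc/HOSTNAME stores the permanent name:
 
    echo server0 > /etc/HOSTNAME   # persists across reboots (run as root)
    hostname server0               # takes effect immediately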
 
3. Edit the /etc/hosts file on server1, server2, and server3.
  (server1, server2, and server3 should all look like the following; note that the slaves also need the server0 entry so they can resolve the master's hostname)

  127.0.0.1       localhost
  192.168.2.10    server0
  192.168.2.11    server1
  192.168.2.12    server2
  192.168.2.13    server3

  ** Also change each machine's hostname to server1/server2/server3 respectively (it may default to localhost); you can verify name resolution with the check below.
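
  A quick resolution check; getent queries the same resolver the daemons will use (standard on glibc systems such as SUSE):

    getent hosts server0 server1 server2 server3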

4. SSH setup
  1> Generate an RSA key pair on every node:
    ssh-keygen -t rsa
 
  2> Press Enter at every prompt; the default save path is /home/suse/.ssh/
  3> Append the master's generated public key id_rsa.pub to authorized_keys (only the public key goes into this file; the private key id_rsa must never be shared):
   
    cat id_rsa.pub >> authorized_keys
   
  4> Append every slave's id_rsa.pub to the master's authorized_keys as well, then distribute the master's authorized_keys to all slaves:
 
   scp /home/suse/.ssh/authorized_keys  server1:/home/suse/.ssh/
   scp /home/suse/.ssh/authorized_keys  server2:/home/suse/.ssh/
   scp /home/suse/.ssh/authorized_keys  server3:/home/suse/.ssh/
  
   Then verify passwordless login by running  ssh <hostname>  between the machines.
  
   Finally, set the file permissions on all machines with chmod:
   chmod 644 authorized_keys
  The first connection to each host asks you to confirm the host key (type yes) and may prompt for the password once; after that, no password is needed. The whole procedure is condensed in the sketch below.
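 
  A condensed version of steps 1> through 4>, run on server0 as user suse; a minimal sketch, assuming ~/.ssh exists on every host and the suse password is known for each slave:
 
    # 1> generate the master key pair non-interactively
    ssh-keygen -t rsa -f ~/.ssh/id_rsa -N ""
    # 3> seed authorized_keys with the master's own public key
    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    # 4> collect each slave's public key (prompts for the password once per host)
    for h in server1 server2 server3; do
        ssh suse@$h 'cat ~/.ssh/id_rsa.pub' >> ~/.ssh/authorized_keys
    done
    # distribute the merged file and fix permissions everywhere
    for h in server1 server2 server3; do
        scp ~/.ssh/authorized_keys suse@$h:~/.ssh/
        ssh suse@$h 'chmod 644 ~/.ssh/authorized_keys'
    done
    chmod 644 ~/.ssh/authorized_keys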
 
5. Configuration files

   core-site.xml
    <property>
          <name>fs.default.name</name>
          <value>hdfs://server0:9000</value>
          <description>The name of the default file system. A URI whose
                  scheme and authority determine the FileSystem implementation. The
                  uri's scheme determines the config property (fs.SCHEME.impl) naming
                  the FileSystem implementation class. The uri's authority is used to
                  determine the host, port, etc. for a filesystem.</description>
    </property>

    <property>
          <name>dfs.datanode.socket.write.timeout</name>
          <value>0</value>
          <description>Write timeout for DataNode sockets; 0 disables the timeout.</description>
    </property>

 
  fs.default.name: the URI of the Hadoop file system (always use the hostname in configuration files; if you use the IP, inter-node communication may fail to resolve correctly)
  dfs.datanode.socket.write.timeout: set to 0 to disable the write timeout and avoid socket exceptions
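
  Note that all of the <property> snippets in this step live inside a <configuration> root element. For reference, a minimal complete core-site.xml with the values above:

    <?xml version="1.0"?>
    <configuration>
        <property>
            <name>fs.default.name</name>
            <value>hdfs://server0:9000</value>
        </property>
        <property>
            <name>dfs.datanode.socket.write.timeout</name>
            <value>0</value>
        </property>
    </configuration>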
  ----------------------------------------------------------------------------------------------------
  hdfs-site.xml
 
    <property>
        <name>dfs.name.dir</name>
        <value>/server/bin/hadoop/name</value>
        <description>
                Directory where the NameNode stores the file system metadata.
        </description>
    </property>

  <property>
          <name>dfs.http.address</name>
          <value>192.168.2.10:50070</value>
          <description>
              Address and port of the NameNode HTTP status page.
          </description>
  </property>


  <property>
          <name>hadoop.tmp.dir</name>
          <value>/server/bin/hadoop/temp</value>
          <description>
              Hadoop temporary directory; a base for other temporary directories.
          </description>
  </property>

  <property>
          <name>dfs.data.dir</name>
          <value>/server/bin/hadoop/data</value>
          <description>
              Local directory where the DataNode stores HDFS blocks.
          </description>
  </property>


  <property>
          <name>dfs.replication</name>
          <value>2</value>
          <description>
              Number of block replicas.
          </description>
  </property>
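
  It is safest to create these directories on each node before starting the daemons; a minimal sketch (dfs.name.dir is only used on the NameNode, dfs.data.dir only on the DataNodes):

    mkdir -p /server/bin/hadoop/name   # dfs.name.dir  (NameNode metadata)
    mkdir -p /server/bin/hadoop/data   # dfs.data.dir  (DataNode blocks)
    mkdir -p /server/bin/hadoop/temp   # hadoop.tmp.dir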

-----------------------------------------------------------------------------------------
    mapred-site.xml
    <property>
              <name>mapred.job.tracker</name>
              <value>server0:9001</value>
              <description>The host and port that the MapReduce job tracker runs
                      at. If "local", then jobs are run in-process as a single map
                      and reduce task.
              </description>
      </property>

      <property>
              <name>mapred.map.tasks</name>
              <value>8</value>
              <description>The default number of map tasks per job.
                      Ignored when mapred.job.tracker is "local".
              </description>
      </property>

      <property>
              <name>mapred.reduce.tasks</name>
              <value>8</value>
              <description>The default number of reduce tasks per job.
                     Ignored when mapred.job.tracker is "local".
              </description>
      </property>

      <property>
              <name>mapred.local.dir</name>
              <value>/server/bin/hadoop/mapred/local</value>
              <description>Local directory used by the TaskTracker while running MapReduce tasks.</description>
      </property>

      <property>
              <name>mapred.system.dir</name>
              <value>/tmp/hadoop/mapred/system</value>
              <description>Shared directory where the MapReduce framework stores its control files.</description>
      </property>
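
      Like the HDFS directories, mapred.local.dir must exist and be writable on every TaskTracker node; a minimal sketch, assuming the path above:

        mkdir -p /server/bin/hadoop/mapred/local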
     
  -----------------------------------------------------------------------------------------------------------------
  masters file configuration (conf/masters; lists the host(s) that run the SecondaryNameNode)
 
   localhost
   server0
  
  ------------------------------------------------------------------------------------------------------------------
  slaves file configuration (conf/slaves; lists the hosts that run a DataNode and TaskTracker)
 
  localhost
  server1
  server2
  server3
 
  ----------------------------------------------------------------------------------------------------------------------
 
6. Copy
  Distribute the configured Hadoop directory to all slaves (see the sketch below).
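 
  A minimal sketch, assuming Hadoop lives under /server/bin/hadoop on every node and passwordless SSH from step 4 is working:
 
    for h in server1 server2 server3; do
        scp -r /server/bin/hadoop suse@$h:/server/bin/
    done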
 
7. Start
  Before the first start, format the HDFS file system:
  bin/hadoop namenode -format
 
  bin/start-all.sh   (start all daemons)
  bin/stop-all.sh    (stop all daemons)
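 
  Once the cluster is up, you can also confirm from the command line that all three DataNodes have registered; dfsadmin -report prints the capacity and the list of live nodes:
 
    bin/hadoop dfsadmin -report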
 
8. Verify that the startup succeeded

  The jps command should display output like the following:
  suse@localhost:/server/bin/hadoop/logs> jps

  12490 TaskTracker
  11854 NameNode
  12343 JobTracker
  12706 Jps
  3832 SecondaryNameNode
  11992 DataNode

http://localhost:50070 (NameNode web UI: check the number of live nodes)
http://localhost:50030 (JobTracker web UI: check job execution status)
  
   
               
