hadoop HDFS

Single-node pseudo-distributed Hadoop setup on Linux

Lab environment: RHEL 6.3, with iptables and SELinux disabled     JDK: jdk-6u26-linux-x64.bin

                 Hadoop version: hadoop-1.2.1.tar.gz

Download && install the JDK

http://www.oracle.com/technetwork/java/javaee/downloads/java-ee-sdk-6u3-jdk-6u29-downloads-523388.html

#sh jdk-6u26-linux-x64.bin
#mv jdk1.6.0_26/ /usr/local/jdk

Download the Hadoop release tarball

http://hadoop.apache.org/

Extract to the target directory && shorten the directory name

#tar zxf hadoop-1.2.1.tar.gz -C /usr/local
#cd /usr/local
#mv hadoop-1.2.1/ hadoop

Configure the JAVA environment variable

#cd /usr/local/hadoop/
#vim conf/hadoop-env.sh
      export JAVA_HOME=/usr/local/jdk
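A quick sanity check can catch a mistyped JAVA_HOME before the daemons fail to start. This is only a sketch; it assumes the JDK was moved to /usr/local/jdk as in the install step above:

```shell
# Sanity check: does JAVA_HOME point at a working JDK?
# /usr/local/jdk is the install path used in the steps above.
JAVA_HOME=${JAVA_HOME:-/usr/local/jdk}
if [ -x "$JAVA_HOME/bin/java" ]; then
    msg="JAVA_HOME ok: $JAVA_HOME"
else
    msg="JAVA_HOME invalid: $JAVA_HOME"
fi
echo "$msg"
```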

Edit the configuration files

http://hadoop.apache.org/docs/r1.2.1/single_node_setup.html

#vim conf/core-site.xml

<configuration>
        <property>
                <name>fs.default.name</name>
                <value>hdfs://localhost:9000</value>
        </property>
</configuration>

#vim conf/hdfs-site.xml

<configuration>
        <property>
                <name>dfs.replication</name>
                <value>1</value>
        </property>
</configuration>

#vim conf/mapred-site.xml

<configuration>
        <property>
                <name>mapred.job.tracker</name>
                <value>localhost:9001</value>
        </property>
</configuration>
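A malformed XML file here makes the daemons die silently at startup, so it is worth checking that the edited files parse. A minimal sketch, writing a copy of core-site.xml to a hypothetical /tmp path and validating it with python3's stdlib parser (assumed to be installed):

```shell
# Well-formedness check for one of the edited config files (sketch).
cat > /tmp/core-site.xml <<'EOF'
<configuration>
        <property>
                <name>fs.default.name</name>
                <value>hdfs://localhost:9000</value>
        </property>
</configuration>
EOF
if python3 -c 'import sys, xml.dom.minidom; xml.dom.minidom.parse(sys.argv[1])' /tmp/core-site.xml; then
    result="core-site.xml is well-formed"
else
    result="core-site.xml is broken XML"
fi
echo "$result"
```

The same one-liner can be pointed at conf/hdfs-site.xml and conf/mapred-site.xml.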

Check that you can ssh to localhost without a passphrase

#ssh-keygen
#ssh-copy-id localhost
#ssh localhost
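start-all.sh uses ssh to launch the daemons, so passwordless login must actually work before continuing. A sketch that checks the key setup without opening a connection, assuming the default id_rsa key path produced by ssh-keygen above:

```shell
# Is the public key present and authorized for localhost login?
PUB="$HOME/.ssh/id_rsa.pub"
if [ -f "$PUB" ] && grep -qFf "$PUB" "$HOME/.ssh/authorized_keys" 2>/dev/null; then
    sshmsg="passwordless ssh to localhost should work"
else
    sshmsg="key missing or not authorized yet - rerun ssh-copy-id localhost"
fi
echo "$sshmsg"
```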

Format the filesystem && start all services

#cd /usr/local/hadoop/bin/
#./hadoop namenode -format
#./start-all.sh

List all service processes and their PIDs

#/usr/local/jdk/bin/jps
5147 Jps
2460 TaskTracker
2176 DataNode
2276 SecondaryNameNode
2077 NameNode
2350 JobTracker

Verification

Upload /usr/local/hadoop/conf/ to input/

#cd /usr/local/hadoop
#bin/hadoop fs -put conf input
#bin/hadoop fs -ls
     drwxr-xr-x   - root supergroup          0 2014-03-08 03:22 /user/root/input

Run the example job; results land in the output/ directory

#bin/hadoop jar hadoop-examples-1.2.1.jar grep input output 'dfs[a-z.]+'
#bin/hadoop fs -ls
#bin/hadoop fs -cat output/*         #view the output directory
1    dfs.replication
1    dfs.server.namenode.
1    dfsadmin
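The example job simply runs the regex `dfs[a-z.]+` over every file in input/ and counts the matches. A local illustration with plain grep (the /tmp paths are hypothetical, not part of the setup):

```shell
# Mimic the grep example job locally: extract every match of
# dfs[a-z.]+ from the input files, then count duplicate matches.
mkdir -p /tmp/grep-demo
cat > /tmp/grep-demo/hdfs-site.xml <<'EOF'
<name>dfs.replication</name>
EOF
matches=$(grep -ohE 'dfs[a-z.]+' /tmp/grep-demo/* | sort | uniq -c)
echo "$matches"
```

This prints each distinct match with its count, which is exactly the shape of the job output above.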

Important Hadoop ports:
1. JobTracker web UI: 50030
2. HDFS web UI: 50070
3. HDFS RPC port: 9000
4. MapReduce (JobTracker) RPC port: 9001

1. HDFS web UI
        http://localhost:50070
2. MapReduce web UI
        http://localhost:50030
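A quick way to see which of these ports are actually listening, without a browser. This is a sketch using bash's /dev/tcp pseudo-device (run it with bash; ports simply report closed if the daemons are not up):

```shell
# Probe the web UI and RPC ports listed above.
portreport=""
for port in 50070 50030 9000 9001; do
    if (exec 3<>"/dev/tcp/localhost/$port") 2>/dev/null; then
        portreport="$portreport port $port open;"
    else
        portreport="$portreport port $port closed;"
    fi
done
echo "$portreport"
```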


HDFS:

      NameNode: management node; holds the filesystem metadata

      DataNode: data node; stores the actual data blocks

      SecondaryNameNode: periodically merges and backs up the NameNode's metadata

MapReduce:

       JobTracker: job management node

       TaskTracker: task execution node

