Installing Hadoop 1.2.1 on Mac OS X 10.9.5 and running wordcount


1. In a terminal, run ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/homebrew/go/install)" (this installs Homebrew; skip this step if it is already installed).

2. brew install homebrew/versions/hadoop121    // the versions tap lets you choose which Hadoop release to install

3. After the installation finishes, set the Hadoop path and environment variables:

    export HADOOP_HOME="/usr/local/Cellar/hadoop121/1.2.1/libexec"

    export HADOOP_VERSION="1.2.1"

    PATH=/usr/local/Cellar/hadoop121/1.2.1/libexec/bin:$PATH

    export PATH
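To make these settings survive new terminal sessions, they can be appended to the shell profile. A minimal sketch, assuming bash and ~/.bash_profile as the profile file (adjust the Cellar path if your installed version differs):

```shell
# Append the Hadoop environment settings from step 3 to ~/.bash_profile
# (assumed profile location; adjust for your shell). The quoted 'EOF'
# keeps $HADOOP_HOME and $PATH literal in the file so they expand only
# when the profile is sourced.
cat >> "$HOME/.bash_profile" <<'EOF'
export HADOOP_HOME="/usr/local/Cellar/hadoop121/1.2.1/libexec"
export HADOOP_VERSION="1.2.1"
export PATH="$HADOOP_HOME/bin:$PATH"
EOF

# Load the settings into the current session
source "$HOME/.bash_profile"
```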

4  $ ssh-keygen -t rsa -P ""

     (press Enter at each prompt)

    $ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys

    $ ssh localhost


5. Configure the pseudo-distributed environment. The configuration files are located in

 /usr/local/Cellar/hadoop121/1.2.1/libexec/conf

    The following four files need to be edited:

    hadoop-env.sh

    core-site.xml

    hdfs-site.xml

    mapred-site.xml


   hadoop-env.sh

   Add:  export HADOOP_OPTS="-Djava.security.krb5.realm= -Djava.security.krb5.kdc="
   (this works around the Kerberos "Unable to load realm info from SCDynamicStore" error on OS X)


    core-site.xml

    <configuration>
      <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
      </property>

      <property>
        <name>hadoop.tmp.dir</name>
        <value>/tmp/hadoop-${user.name}</value>
        <description>A base for other temporary directories.</description>
      </property>
    </configuration>

  

   hdfs-site.xml

   <configuration>
     <property>
       <name>dfs.replication</name>
       <value>1</value>
     </property>
   </configuration>


   mapred-site.xml

   <configuration>
     <property>
       <name>mapred.job.tracker</name>
       <value>localhost:9001</value>
     </property>

     <property>
       <name>mapred.tasktracker.map.tasks.maximum</name>
       <value>4</value>
     </property>

     <property>
       <name>mapred.tasktracker.reduce.tasks.maximum</name>
       <value>2</value>
     </property>
   </configuration>

6. Next, format the namenode:

    $ hadoop namenode -format


7. Start Hadoop

    $ /usr/local/Cellar/hadoop121/1.2.1/libexec/bin/start-all.sh

    If you set the environment variables as in step 3, you can simply type start-all.sh.


8. Check that Hadoop is running:

    $ jps
    49770 TaskTracker
    49678 JobTracker
    49430 NameNode
    49522 DataNode
    49615 SecondaryNameNode
    49823 Jps

To verify that Hadoop started successfully, open the web interfaces:

 

  • NameNode - http://localhost:50070/
  • JobTracker - http://localhost:50030/
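Besides opening the two pages in a browser, the web UIs can be probed from the command line. A small sketch using curl (curl exits nonzero when the connection is refused, i.e. the daemon is not running):

```shell
# Print "up" if a Hadoop web UI answers at the given URL, "down" otherwise.
ui_status() {
  if curl -s --max-time 3 -o /dev/null "$1"; then
    echo up
  else
    echo down
  fi
}

ui_status http://localhost:50070/   # NameNode web UI
ui_status http://localhost:50030/   # JobTracker web UI
```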



Running wordcount

cd /usr/local/Cellar/hadoop121/1.2.1/libexec/bin


hadoop dfs -mkdir /input


hadoop dfs -put ./*.sh /input/


cd ..


ls

bin                         hadoop-minicluster-1.2.1.jar
conf                        hadoop-test-1.2.1.jar
contrib                     hadoop-tools-1.2.1.jar
hadoop-ant-1.2.1.jar        input
hadoop-client-1.2.1.jar     lib
hadoop-core-1.2.1.jar       logs
hadoop-examples-1.2.1.jar   webapps


hadoop jar hadoop-examples-1.2.1.jar wordcount /input /output


Warning: $HADOOP_HOME is deprecated.


15/05/07 16:36:46 INFO input.FileInputFormat: Total input paths to process : 14

15/05/07 16:36:47 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

15/05/07 16:36:47 WARN snappy.LoadSnappy: Snappy native library not loaded

15/05/07 16:36:47 INFO mapred.JobClient: Running job: job_201505071548_0001

15/05/07 16:36:48 INFO mapred.JobClient:  map 0% reduce 0%

15/05/07 16:37:04 INFO mapred.JobClient:  map 7% reduce 0%

15/05/07 16:37:05 INFO mapred.JobClient:  map 14% reduce 0%

15/05/07 16:37:06 INFO mapred.JobClient:  map 21% reduce 0%

15/05/07 16:37:10 INFO mapred.JobClient:  map 28% reduce 0%

15/05/07 16:37:20 INFO mapred.JobClient:  map 35% reduce 0%

15/05/07 16:37:22 INFO mapred.JobClient:  map 50% reduce 0%

15/05/07 16:37:24 INFO mapred.JobClient:  map 57% reduce 0%

15/05/07 16:37:29 INFO mapred.JobClient:  map 64% reduce 0%

15/05/07 16:37:31 INFO mapred.JobClient:  map 64% reduce 19%

15/05/07 16:37:35 INFO mapred.JobClient:  map 78% reduce 19%

15/05/07 16:37:37 INFO mapred.JobClient:  map 78% reduce 26%

15/05/07 16:37:40 INFO mapred.JobClient:  map 85% reduce 26%

15/05/07 16:37:43 INFO mapred.JobClient:  map 92% reduce 26%

15/05/07 16:37:46 INFO mapred.JobClient:  map 100% reduce 26%

15/05/07 16:37:47 INFO mapred.JobClient:  map 100% reduce 28%

15/05/07 16:37:50 INFO mapred.JobClient:  map 100% reduce 100%

15/05/07 16:37:54 INFO mapred.JobClient: Job complete: job_201505071548_0001

15/05/07 16:37:54 INFO mapred.JobClient: Counters: 26

15/05/07 16:37:54 INFO mapred.JobClient:   Map-Reduce Framework

15/05/07 16:37:54 INFO mapred.JobClient:     Spilled Records=4122

15/05/07 16:37:54 INFO mapred.JobClient:     Map output materialized bytes=29624

15/05/07 16:37:54 INFO mapred.JobClient:     Reduce input records=2061

15/05/07 16:37:54 INFO mapred.JobClient:     Map input records=712

15/05/07 16:37:54 INFO mapred.JobClient:     SPLIT_RAW_BYTES=1517

15/05/07 16:37:54 INFO mapred.JobClient:     Map output bytes=35429

15/05/07 16:37:54 INFO mapred.JobClient:     Reduce shuffle bytes=29624

15/05/07 16:37:54 INFO mapred.JobClient:     Reduce input groups=548

15/05/07 16:37:54 INFO mapred.JobClient:     Combine output records=2061

15/05/07 16:37:54 INFO mapred.JobClient:     Reduce output records=548

15/05/07 16:37:54 INFO mapred.JobClient:     Map output records=3253

15/05/07 16:37:54 INFO mapred.JobClient:     Combine input records=3253

15/05/07 16:37:54 INFO mapred.JobClient:     Total committed heap usage (bytes)=2295857152

15/05/07 16:37:54 INFO mapred.JobClient:   File Input Format Counters 

15/05/07 16:37:54 INFO mapred.JobClient:     Bytes Read=23246

15/05/07 16:37:54 INFO mapred.JobClient:   FileSystemCounters

15/05/07 16:37:54 INFO mapred.JobClient:     HDFS_BYTES_READ=24763

15/05/07 16:37:54 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=913307

15/05/07 16:37:54 INFO mapred.JobClient:     FILE_BYTES_READ=29546

15/05/07 16:37:54 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=6803

15/05/07 16:37:54 INFO mapred.JobClient:   Job Counters 

15/05/07 16:37:54 INFO mapred.JobClient:     Launched map tasks=14

15/05/07 16:37:54 INFO mapred.JobClient:     Launched reduce tasks=1

15/05/07 16:37:54 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=45013

15/05/07 16:37:54 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0

15/05/07 16:37:54 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=196044

15/05/07 16:37:54 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0

15/05/07 16:37:54 INFO mapred.JobClient:     Data-local map tasks=14

15/05/07 16:37:54 INFO mapred.JobClient:   File Output Format Counters 

15/05/07 16:37:54 INFO mapred.JobClient:     Bytes Written=6803


  1. Note: both /input and /output are HDFS directories. /input holds the files to be analyzed; /output is where the results are written. /output is created automatically by the job and must not be created by hand (the job fails if it already exists). After the job finishes, the results can be viewed with hadoop dfs -cat /output/part-*.

promote:libexec yangfeng$ hadoop dfs -ls /input

Warning: $HADOOP_HOME is deprecated.


Found 14 items

-rw-r--r--   1 yangfeng supergroup       2643 2015-05-07 16:35 /input/hadoop-config.sh

-rw-r--r--   1 yangfeng supergroup       5064 2015-05-07 16:35 /input/hadoop-daemon.sh

-rw-r--r--   1 yangfeng supergroup       1329 2015-05-07 16:35 /input/hadoop-daemons.sh

-rw-r--r--   1 yangfeng supergroup       2050 2015-05-07 16:35 /input/slaves.sh

-rw-r--r--   1 yangfeng supergroup       1166 2015-05-07 16:35 /input/start-all.sh

-rw-r--r--   1 yangfeng supergroup       1065 2015-05-07 16:35 /input/start-balancer.sh

-rw-r--r--   1 yangfeng supergroup       1745 2015-05-07 16:35 /input/start-dfs.sh

-rw-r--r--   1 yangfeng supergroup       1145 2015-05-07 16:35 /input/start-jobhistoryserver.sh

-rw-r--r--   1 yangfeng supergroup       1259 2015-05-07 16:35 /input/start-mapred.sh

-rw-r--r--   1 yangfeng supergroup       1119 2015-05-07 16:35 /input/stop-all.sh

-rw-r--r--   1 yangfeng supergroup       1116 2015-05-07 16:35 /input/stop-balancer.sh

-rw-r--r--   1 yangfeng supergroup       1246 2015-05-07 16:35 /input/stop-dfs.sh

-rw-r--r--   1 yangfeng supergroup       1131 2015-05-07 16:35 /input/stop-jobhistoryserver.sh

-rw-r--r--   1 yangfeng supergroup       1168 2015-05-07 16:35 /input/stop-mapred.sh
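The wordcount job itself does nothing exotic: the mappers split each input file into words and emit (word, 1) pairs, and the combiner/reducer sums the counts per word. What it computes can be sketched locally with an ordinary shell pipeline (a toy equivalent on sample text, not the actual Java code inside hadoop-examples-1.2.1.jar):

```shell
# Local equivalent of what the wordcount job computes:
# split text into one word per line, count occurrences of each
# distinct word, and sort by count (descending).
printf 'hello hadoop\nhello world\n' \
  | tr -s '[:space:]' '\n' \
  | sort \
  | uniq -c \
  | sort -rn
# "hello" is counted twice; "hadoop" and "world" once each
```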

