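Once the Hadoop 2.2 environment is set up, you can run the bundled wordcount example to count how many times each word appears in a file. Without further ado, here are the steps:
First, create a directory named myfile under /usr/local/hadoop/ to hold the test file. Then, inside myfile, create a file named wordcount.txt with the following contents: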
hello hadoop
hello java
hello world
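The test file can also be created from the shell. This is a minimal sketch, assuming a POSIX shell; BASE_DIR is a hypothetical variable that is not part of the original steps. It defaults to a temporary directory so the snippet can be tried anywhere, and would be set to /usr/local/hadoop to match the walkthrough.

```shell
#!/bin/sh
# BASE_DIR is a hypothetical variable; the walkthrough uses /usr/local/hadoop.
# It defaults to a temporary directory so the sketch can be tried without
# write access to /usr/local.
BASE_DIR="${BASE_DIR:-$(mktemp -d)}"

# Create the directory that holds the test file.
mkdir -p "$BASE_DIR/myfile"

# Write the three sample lines, one per line.
printf '%s\n' 'hello hadoop' 'hello java' 'hello world' \
  > "$BASE_DIR/myfile/wordcount.txt"

# Show the file contents and its size in bytes.
cat "$BASE_DIR/myfile/wordcount.txt"
wc -c < "$BASE_DIR/myfile/wordcount.txt"
```

With exactly one trailing newline per line the file is 36 bytes, which is the size reported for /input/wordcount.txt in the HDFS listing later in this post.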
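Run hadoop fs -mkdir /input to create an input directory in HDFS;
Run hadoop fs -put /usr/local/hadoop/myfile/wordcount.txt /input/ to upload wordcount.txt from the local file system to the input directory in HDFS;
Make sure the input directory in HDFS does not already contain an out directory, because the job fails with an error if its output directory exists. Then change into /usr/local/hadoop/share/hadoop/mapreduce/ and run the following command to count the words: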
hadoop jar hadoop-mapreduce-examples-2.2.0.jar wordcount /input/wordcount.txt /input/out
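If the job has been run before, /input/out will already exist and the command above will be rejected. The check can be scripted; the following is only a sketch, and the function name clean_output_dir and the dry-run fallback are assumptions added for illustration. It relies on hadoop fs -test -d and hadoop fs -rm -r, and it deletes data, so it should only be pointed at a directory that is safe to remove.

```shell
#!/bin/sh
# Sketch: remove a stale output directory before resubmitting the job.
# clean_output_dir is a hypothetical helper, not part of Hadoop.

# When the hadoop CLI is not on PATH, fall back to printing the commands
# instead of running them, so the logic can be read without a cluster.
if command -v hadoop >/dev/null 2>&1; then
  HADOOP="hadoop"
else
  HADOOP="echo hadoop"
fi

clean_output_dir() {
  out_dir="$1"
  # `hadoop fs -test -d` exits 0 when the path exists and is a directory.
  if $HADOOP fs -test -d "$out_dir" >/dev/null 2>&1; then
    # Recursively delete the old output so the next run can recreate it.
    $HADOOP fs -rm -r "$out_dir"
  fi
}

clean_output_dir /input/out
```

On a machine without Hadoop installed this only prints the removal command; on a cluster node it removes /input/out if that directory is present.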
Here is the output printed when the job finishes:
[root@master mapreduce]# hadoop jar hadoop-mapreduce-examples-2.2.0.jar wordcount /input/wordcount.txt /input/out
14/03/09 19:32:19 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
14/03/09 19:32:22 INFO input.FileInputFormat: Total input paths to process : 1
14/03/09 19:32:22 INFO mapreduce.JobSubmitter: number of splits:1
14/03/09 19:32:22 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
14/03/09 19:32:22 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
14/03/09 19:32:22 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
14/03/09 19:32:22 INFO Configuration.deprecation: mapreduce.combine.class is deprecated. Instead, use mapreduce.job.combine.class
14/03/09 19:32:22 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
14/03/09 19:32:22 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
14/03/09 19:32:22 INFO Configuration.deprecation: mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class
14/03/09 19:32:22 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
14/03/09 19:32:22 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
14/03/09 19:32:22 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
14/03/09 19:32:22 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
14/03/09 19:32:22 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
14/03/09 19:32:23 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1394289329220_0002
14/03/09 19:32:25 INFO impl.YarnClientImpl: Submitted application application_1394289329220_0002 to ResourceManager at /0.0.0.0:8032
14/03/09 19:32:25 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1394289329220_0002/
14/03/09 19:32:25 INFO mapreduce.Job: Running job: job_1394289329220_0002
14/03/09 19:32:55 INFO mapreduce.Job: Job job_1394289329220_0002 running in uber mode : false
14/03/09 19:32:55 INFO mapreduce.Job: map 0% reduce 0%
14/03/09 19:33:33 INFO mapreduce.Job: map 100% reduce 0%
14/03/09 19:33:45 INFO mapreduce.Job: map 100% reduce 100%
14/03/09 19:33:46 INFO mapreduce.Job: Job job_1394289329220_0002 completed successfully
14/03/09 19:33:47 INFO mapreduce.Job: Counters: 43
    File System Counters
        FILE: Number of bytes read=54
        FILE: Number of bytes written=158345
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=139
        HDFS: Number of bytes written=32
        HDFS: Number of read operations=6
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=36121
        Total time spent by all reduces in occupied slots (ms)=7030
    Map-Reduce Framework
        Map input records=3
        Map output records=6
        Map output bytes=60
        Map output materialized bytes=54
        Input split bytes=103
        Combine input records=6
        Combine output records=4
        Reduce input groups=4
        Reduce shuffle bytes=54
        Reduce input records=4
        Reduce output records=4
        Spilled Records=8
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=588
        CPU time spent (ms)=14810
        Physical memory (bytes) snapshot=213233664
        Virtual memory (bytes) snapshot=720707584
        Total committed heap usage (bytes)=136908800
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=36
    File Output Format Counters
        Bytes Written=32
[root@master mapreduce]#
Check the result:
[root@master mapreduce]# hadoop fs -lsr /input
lsr: DEPRECATED: Please use 'ls -R' instead.
drwxr-xr-x   - root supergroup          0 2014-03-09 19:33 /input/out
-rw-r--r--   3 root supergroup          0 2014-03-09 19:33 /input/out/_SUCCESS
-rw-r--r--   3 root supergroup         32 2014-03-09 19:33 /input/out/part-r-00000
-rw-r--r--   3 root supergroup         36 2014-03-09 19:26 /input/wordcount.txt
[root@master mapreduce]# hadoop fs -cat /input/out/part-r-00000
hadoop 1
hello 3
java 1
world 1
[root@master mapreduce]#
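The counts can be cross-checked locally with standard Unix tools. This is a rough imitation of the map and reduce steps, not the Hadoop implementation; the variable names INPUT, WORDS and COUNTS are chosen for illustration, and the tab between word and count is an assumption based on Hadoop's default text output format.

```shell
#!/bin/sh
# Local imitation of the word count; no Hadoop involved.
INPUT=$(printf '%s\n' 'hello hadoop' 'hello java' 'hello world')

# "Map": split each line into one word per line.
WORDS=$(printf '%s\n' "$INPUT" | tr -s ' ' '\n')

# "Reduce": sort so equal words are adjacent, count each group,
# then print the word first and the count second, separated by a tab.
COUNTS=$(printf '%s\n' "$WORDS" | LC_ALL=C sort | uniq -c | awk '{print $2 "\t" $1}')

# Three figures that can be compared with the job counters.
echo "input lines:    $(printf '%s\n' "$INPUT" | wc -l | tr -d ' ')"
echo "words emitted:  $(printf '%s\n' "$WORDS" | wc -l | tr -d ' ')"
echo "distinct words: $(printf '%s\n' "$COUNTS" | wc -l | tr -d ' ')"

# The per-word counts themselves.
printf '%s\n' "$COUNTS"
```

The three figures should line up with Map input records=3, Map output records=6 and Reduce output records=4 in the job counters, and the per-word counts with the contents of part-r-00000.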
Success!
When reposting, please credit the source: http://kevin12.iteye.com/blog/2028776