Hadoop WordCount: the second step is done ^_^

  At first I did not understand hadoop fs -put input1.txt in,
so I created an in folder under /data, and as a result the test runs all failed.
HDFS is a file system of its own, with its own management and storage layout. You will not find a matching directory in the local file system; if you could see one, it would just be the Linux file system, and then it could hardly be called the Hadoop file system.
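That confusion comes down to relative versus absolute HDFS paths: a path with no leading slash (like "in") resolves under the user's HDFS home directory, /user/<username>, not under the local working directory. A minimal Python sketch of that resolution rule (resolve_hdfs_path is my own illustrative helper, not a Hadoop API):

```python
def resolve_hdfs_path(path, user="hadoop"):
    """Mimic how the hadoop fs shell resolves paths: a relative path
    lives under the user's HDFS home directory, /user/<user>."""
    return path if path.startswith("/") else "/user/%s/%s" % (user, path)

# "in" and "/in" are two different HDFS locations:
print(resolve_hdfs_path("in"))   # /user/hadoop/in
print(resolve_hdfs_path("/in"))  # /in
```

This is why hadoop fs -put input1.txt in and hadoop fs -put input1.txt /in below land the file in two different places.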

[hadoop@localhost data]$ touch input1.txt

[hadoop@localhost data]$ vi input1.txt
Contents of input1.txt:
hello   world
hello   ray
hello   Hadoop
[hadoop@localhost data]$ hadoop fs -mkdir /in


[hadoop@localhost data]$ hadoop fs -put input1.txt in

[hadoop@localhost data]$ hadoop fs -put input1.txt /in

[hadoop@localhost mapreduce]$ hadoop jar hadoop-mapreduce-examples-2.2.0.jar wordcount /in /output2
14/07/27 18:31:18 INFO client.RMProxy: Connecting to ResourceManager at localhost/127.0.0.1:5271
14/07/27 18:31:19 INFO input.FileInputFormat: Total input paths to process : 1
14/07/27 18:31:19 INFO mapreduce.JobSubmitter: number of splits:1
14/07/27 18:31:19 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
14/07/27 18:31:19 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
14/07/27 18:31:19 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
14/07/27 18:31:19 INFO Configuration.deprecation: mapreduce.combine.class is deprecated. Instead, use mapreduce.job.combine.class
14/07/27 18:31:19 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
14/07/27 18:31:19 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
14/07/27 18:31:19 INFO Configuration.deprecation: mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class
14/07/27 18:31:19 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
14/07/27 18:31:19 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
14/07/27 18:31:19 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
14/07/27 18:31:19 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
14/07/27 18:31:19 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
14/07/27 18:31:19 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1406456184794_0006
14/07/27 18:31:19 INFO impl.YarnClientImpl: Submitted application application_1406456184794_0006 to ResourceManager at localhost/127.0.0.1:5271
14/07/27 18:31:19 INFO mapreduce.Job: The url to track the job: http://localhost:8088/proxy/application_1406456184794_0006/
14/07/27 18:31:19 INFO mapreduce.Job: Running job: job_1406456184794_0006
14/07/27 18:31:28 INFO mapreduce.Job: Job job_1406456184794_0006 running in uber mode : false
14/07/27 18:31:28 INFO mapreduce.Job:  map 0% reduce 0%
14/07/27 18:31:34 INFO mapreduce.Job:  map 100% reduce 0%
14/07/27 18:31:41 INFO mapreduce.Job:  map 100% reduce 100%
14/07/27 18:31:41 INFO mapreduce.Job: Job job_1406456184794_0006 completed successfully
14/07/27 18:31:41 INFO mapreduce.Job: Counters: 43
File System Counters
    FILE: Number of bytes read=53
    FILE: Number of bytes written=157577
    FILE: Number of read operations=0
    FILE: Number of large read operations=0
    FILE: Number of write operations=0
    HDFS: Number of bytes read=144
    HDFS: Number of bytes written=31
    HDFS: Number of read operations=6
    HDFS: Number of large read operations=0
    HDFS: Number of write operations=2
Job Counters
    Launched map tasks=1
    Launched reduce tasks=1
    Data-local map tasks=1
    Total time spent by all maps in occupied slots (ms)=4245
    Total time spent by all reduces in occupied slots (ms)=4384
Map-Reduce Framework
    Map input records=4
    Map output records=6
    Map output bytes=59
    Map output materialized bytes=53
    Input split bytes=100
    Combine input records=6
    Combine output records=4
    Reduce input groups=4
    Reduce shuffle bytes=53
    Reduce input records=4
    Reduce output records=4
    Spilled Records=8
    Shuffled Maps =1
    Failed Shuffles=0
    Merged Map outputs=1
    GC time elapsed (ms)=67
    CPU time spent (ms)=1230
    Physical memory (bytes) snapshot=384798720
    Virtual memory (bytes) snapshot=1753681920
    Total committed heap usage (bytes)=273678336
Shuffle Errors
    BAD_ID=0
    CONNECTION=0
    IO_ERROR=0
    WRONG_LENGTH=0
    WRONG_MAP=0
    WRONG_REDUCE=0
File Input Format Counters
    Bytes Read=44
File Output Format Counters
    Bytes Written=31
[hadoop@localhost mapreduce]$ hadoop fs -ls /output2
Found 2 items
-rw-r--r--   1 hadoop supergroup          0 2014-07-27 18:31 /output2/_SUCCESS
-rw-r--r--   1 hadoop supergroup         31 2014-07-27 18:31 /output2/part-r-00000
[hadoop@localhost mapreduce]$ hadoop fs -cat /output2/_SUCCESS
[hadoop@localhost mapreduce]$ hadoop fs -cat /output2/part-r-00000
Hadoop 1
hello 3
ray 1
world 1
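To sanity-check the output, here is a small Python simulation of the map, combine, and reduce steps that WordCount performs (a sketch of the logic only, not actual Hadoop code; the job's "Map input records=4" versus the three visible lines is likely a trailing empty line in the file):

```python
from collections import Counter

lines = ["hello   world", "hello   ray", "hello   Hadoop"]

# Map: emit (word, 1) per token; 6 pairs, matching "Map output records=6"
mapped = [(word, 1) for line in lines for word in line.split()]

# Combine: pre-aggregate on the map side; 4 keys, matching "Combine output records=4"
combined = Counter()
for word, one in mapped:
    combined[word] += one

# Shuffle/sort + reduce: keys come out in sorted order (uppercase before lowercase)
result = dict(sorted(combined.items()))
for word, count in result.items():
    print(word, count)
# Prints:
# Hadoop 1
# hello 3
# ray 1
# world 1
```

The four reduce output records match the part-r-00000 file above exactly.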
The second step is done ^_^
* File operations
* List files in a directory
* $ hadoop fs -ls /user/cl
*
* Create a directory
* $ hadoop fs -mkdir /user/cl/temp
*
* Delete a file
* $ hadoop fs -rm /user/cl/temp/a.txt
*
* Delete a directory and all files under it
* $ hadoop fs -rmr /user/cl/temp
*
* Upload a file
* Upload the local file /home/cl/local.txt into the HDFS directory /user/cl/temp
* $ hadoop fs -put /home/cl/local.txt /user/cl/temp
*
* Download a file
* Download the file hdfs.txt from the HDFS directory /user/cl/temp to the local directory /home/cl
* $ hadoop fs -get /user/cl/temp/hdfs.txt /home/cl
*
* View a file (note: -cat takes an HDFS path, not a local one)
* $ hadoop fs -cat /user/cl/temp/hdfs.txt
*
* Job operations
* Submit a MapReduce job; every Hadoop MapReduce job is packaged as a jar

* $ hadoop jar <local-jar-file> <java-class> <hdfs-input-file> <hdfs-output-dir>

* $ hadoop jar sandbox-mapred-0.0.20.jar sandbox.mapred.WordCountJob /user/cl/input.dat /user/cl/outputdir
*
* Kill a running job
* Suppose the Job ID is job_201207121738_0001
* $ hadoop job -kill job_201207121738_0001
