My Hadoop learning journey, part 2 -- running WordCount

First, start the Hadoop daemons:

sh /usr/local/hadoop/sbin/start-all.sh
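Once the script finishes, you can sanity-check that the daemons actually came up with jps. (In Hadoop 2.x, start-all.sh just delegates to start-dfs.sh and start-yarn.sh; the daemon list below assumes a typical single-node setup, and the PIDs will differ.)

jps
# Expected on a single-node setup (PIDs will vary):
# 21741 NameNode
# 21839 DataNode
# 21956 SecondaryNameNode
# 22117 ResourceManager
# 22214 NodeManager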

Load the data into HDFS (working from the Hadoop installation root):

1. Create a directory in HDFS to hold the data

./bin/hadoop dfs -mkdir -p /user/guoyakui/hadoopfile

That is: ./bin/hadoop dfs -mkdir -p /user/<username>/<directory-of-your-choice>
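Note: in Hadoop 2.x the hadoop dfs form is deprecated (it still runs, but prints a deprecation warning); the current equivalent is hdfs dfs, and the same substitution applies to every dfs command below:

./bin/hdfs dfs -mkdir -p /user/guoyakui/hadoopfile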

2. Copy the data into that directory

./bin/hadoop dfs -copyFromLocal  /Users/guoyakui/Desktop/hadoop/data  /user/guoyakui/hadoopfile

That is: ./bin/hadoop dfs -copyFromLocal <local-data-path> <HDFS-destination> (the directory created above)
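-copyFromLocal is simply -put restricted to local sources, so the following is equivalent here:

./bin/hadoop dfs -put /Users/guoyakui/Desktop/hadoop/data /user/guoyakui/hadoopfile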

3. Once the copy finishes, verify it

./bin/hadoop dfs -ls /user/guoyakui/hadoopfile

That is: ./bin/hadoop dfs -ls /user/<username>/<your-directory>
Output:

┌─[guoyakui@guoyakuideMBP] - [/usr/local/hadoop] - [  5 23, 15:47]
└─[$] <> ./bin/hadoop dfs -ls /user/guoyakui/hadoopfile/data
-rw-r--r--   1 guoyakui supergroup    1580879 2017-05-23 14:56 /user/guoyakui/hadoopfile/data/4300-0.txt
-rw-r--r--   1 guoyakui supergroup    1428841 2017-05-23 14:56 /user/guoyakui/hadoopfile/data/5000-8.txt
-rw-r--r--   1 guoyakui supergroup     674570 2017-05-23 14:56 /user/guoyakui/hadoopfile/data/pg20417.txt
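You can also confirm the sizes with -du -h (human-readable; the figures should match the byte counts in the listing above):

./bin/hadoop dfs -du -h /user/guoyakui/hadoopfile/data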

4. Run the WordCount example

./bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.0.jar wordcount /user/guoyakui/hadoopfile/data /user/guoyakui/hadoopfile/data-output

That is: ./bin/hadoop jar <examples-jar> <example-name> <input-dir> <output-dir>. Use the compiled examples jar; the hadoop-mapreduce-examples-2.8.0-sources.jar under share/hadoop/mapreduce/sources/ contains only .java source files, not classes.
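One pitfall: the output directory must not exist before the job runs, otherwise the job aborts with a FileAlreadyExistsException. To rerun WordCount, remove the old output first:

./bin/hadoop dfs -rm -r /user/guoyakui/hadoopfile/data-output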
Output (long; only an excerpt shown):
17/05/23 15:50:16 INFO mapred.LocalJobRunner: reduce > reduce
17/05/23 15:50:16 INFO mapred.Task: Task 'attempt_local1414386995_0001_r_000000_0' done.
17/05/23 15:50:16 INFO mapred.LocalJobRunner: Finishing task: attempt_local1414386995_0001_r_000000_0
17/05/23 15:50:16 INFO mapred.LocalJobRunner: reduce task executor complete.
17/05/23 15:50:17 INFO mapreduce.Job:  map 100% reduce 100%
17/05/23 15:50:17 INFO mapreduce.Job: Job job_local1414386995_0001 completed successfully
17/05/23 15:50:17 INFO mapreduce.Job: Counters: 35
    File System Counters
        FILE: Number of bytes read=4121088
        FILE: Number of bytes written=8782066
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=11959179
        HDFS: Number of bytes written=879197
        HDFS: Number of read operations=33
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=6
    Map-Reduce Framework
        Map input records=78096
        Map output records=629882
        Map output bytes=6091113
        Map output materialized bytes=1454541
        Input split bytes=403
        Combine input records=629882
        Combine output records=100609
        Reduce input groups=81942
        Reduce shuffle bytes=1454541
        Reduce input records=100609
        Reduce output records=81942
        Spilled Records=201218
        Shuffled Maps =3
        Failed Shuffles=0
        Merged Map outputs=3
        GC time elapsed (ms)=14
        Total committed heap usage (bytes)=1789919232
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=3684290
    File Output Format Counters
        Bytes Written=879197

5. Inspect the results

a. List the output files
./bin/hadoop dfs -ls /user/guoyakui/hadoopfile/data-output
┌─[guoyakui@guoyakuideMBP] - [/usr/local/hadoop] - [  5 23, 15:50]
└─[$] <> ./bin/hadoop dfs -ls /user/guoyakui/hadoopfile/data-output
Found 2 items
-rw-r--r--   1 guoyakui supergroup          0 2017-05-23 15:21 /user/guoyakui/hadoopfile/data-output/_SUCCESS
-rw-r--r--   1 guoyakui supergroup     879197 2017-05-23 15:21 /user/guoyakui/hadoopfile/data-output/part-r-00000
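_SUCCESS is an empty marker file written when the job completes successfully, and part-r-00000 is the output of reducer 0 (one part-r-* file per reducer). A script can key off the marker, for example:

./bin/hadoop dfs -test -e /user/guoyakui/hadoopfile/data-output/_SUCCESS && echo "output is complete"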
b. View the file contents
./bin/hadoop dfs -cat /user/guoyakui/hadoopfile/data-output/part-r-00000
Output (long; a small excerpt):
—A  40
—About  2
—Adiutorium 1
—Afraid 2
—After  7
—After, 1
—Afterwits, 1
—Again, 1
—Agonising  1
—Ah 3
—Ah,    10
—Aha!   1
—Aha... 1
—Ahem!  1
—Alas,  1
—All    8
—Am 2
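To work with the counts outside HDFS, copy them to the local filesystem (the local file name wordcount-result.txt here is just an example). -getmerge concatenates every part-* file in the directory into one local file, which is handy when a job runs more than one reducer:

./bin/hadoop dfs -get /user/guoyakui/hadoopfile/data-output/part-r-00000 ./wordcount-result.txt
# or, merging all reducer outputs into a single local file:
./bin/hadoop dfs -getmerge /user/guoyakui/hadoopfile/data-output ./wordcount-result.txt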
