MapReduce exercises:
In the /usr/hdp/2.4.3.0-227/hadoop-mapreduce/ directory on a cluster node there is an example JAR package, hadoop-mapreduce-examples.jar. Run the pi program from this JAR to compute an approximation of π, using 5 map tasks with 5 samples (throws) per map task.
[root@master ~]# su hdfs
[hdfs@master ~]$ cd /usr/hdp/2.6.1.0-129/hadoop-mapreduce/
[hdfs@master hadoop-mapreduce]$ hadoop jar hadoop-mapreduce-examples.jar pi 5 5
Number of Maps = 5
Samples per Map = 5
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Starting Job
19/05/03 16:08:42 INFO client.RMProxy: Connecting to ResourceManager at slaver1.hadoop/10.0.0.104:8050
19/05/03 16:08:42 INFO client.AHSProxy: Connecting to Application History server at slaver1.hadoop/10.0.0.104:10200
19/05/03 16:08:42 INFO input.FileInputFormat: Total input paths to process : 5
19/05/03 16:08:42 INFO mapreduce.JobSubmitter: number of splits:5
19/05/03 16:08:43 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1556738815524_0004
19/05/03 16:08:43 INFO impl.YarnClientImpl: Submitted application application_1556738815524_0004
19/05/03 16:08:43 INFO mapreduce.Job: The url to track the job: http://slaver1.hadoop:8088/proxy/application_1556738815524_0004/
19/05/03 16:08:43 INFO mapreduce.Job: Running job: job_1556738815524_0004
19/05/03 16:08:50 INFO mapreduce.Job: Job job_1556738815524_0004 running in uber mode : false
19/05/03 16:08:50 INFO mapreduce.Job: map 0% reduce 0%
19/05/03 16:08:57 INFO mapreduce.Job: map 20% reduce 0%
19/05/03 16:08:58 INFO mapreduce.Job: map 40% reduce 0%
19/05/03 16:09:01 INFO mapreduce.Job: map 60% reduce 0%
19/05/03 16:09:04 INFO mapreduce.Job: map 80% reduce 0%
19/05/03 16:09:05 INFO mapreduce.Job: map 100% reduce 0%
19/05/03 16:09:09 INFO mapreduce.Job: map 100% reduce 100%
19/05/03 16:09:10 INFO mapreduce.Job: Job job_1556738815524_0004 completed successfully
19/05/03 16:09:10 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=116
FILE: Number of bytes written=886989
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=1340
HDFS: Number of bytes written=215
HDFS: Number of read operations=23
HDFS: Number of large read operations=0
HDFS: Number of write operations=3
Job Counters
Launched map tasks=5
Launched reduce tasks=1
Data-local map tasks=5
Total time spent by all maps in occupied slots (ms)=44726
Total time spent by all reduces in occupied slots (ms)=7838
Total time spent by all map tasks (ms)=22363
Total time spent by all reduce tasks (ms)=3919
Total vcore-milliseconds taken by all map tasks=22363
Total vcore-milliseconds taken by all reduce tasks=3919
Total megabyte-milliseconds taken by all map tasks=34349568
Total megabyte-milliseconds taken by all reduce tasks=8026112
Map-Reduce Framework
Map input records=5
Map output records=10
Map output bytes=90
Map output materialized bytes=140
Input split bytes=750
Combine input records=0
Combine output records=0
Reduce input groups=2
Reduce shuffle bytes=140
Reduce input records=10
Reduce output records=0
Spilled Records=20
Shuffled Maps =5
Failed Shuffles=0
Merged Map outputs=5
GC time elapsed (ms)=400
CPU time spent (ms)=5840
Physical memory (bytes) snapshot=5756882944
Virtual memory (bytes) snapshot=19876769792
Total committed heap usage (bytes)=5479333888
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=590
File Output Format Counters
Bytes Written=97
Job Finished in 28.341 seconds
Estimated value of Pi is 3.68000000000000000000
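The estimate (3.68) is far from π because only 5 × 5 = 25 sample points were used. A minimal Python sketch of the same dart-throwing idea illustrates the method (this uses plain pseudo-random sampling for simplicity; the examples JAR uses a quasi-random Halton sequence, so its numbers differ):

```python
import random

def estimate_pi(num_samples, seed=42):
    """Estimate pi by sampling points in the unit square and counting
    the fraction that land inside the quarter circle of radius 1."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(num_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / num_samples

# With only 25 samples (as in the job above) the estimate is coarse;
# with many more samples it converges toward 3.14159...
print(estimate_pi(25))
print(estimate_pi(200_000))
```

Increasing the number of maps or throws in the `pi` arguments tightens the estimate in exactly this way.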
In the /usr/hdp/2.4.3.0-227/hadoop-mapreduce/ directory on a cluster node there is an example JAR package, hadoop-mapreduce-examples.jar. Run the wordcount program from this JAR to count the words in the file /1daoyun/file/BigDataSkills.txt, write the result to the /1daoyun/output directory, and then use the appropriate command to query the word-count result.
[root@master ~]# su hdfs
[hdfs@master ~]$ cd /usr/hdp/2.6.1.0-129/hadoop-mapreduce/
[hdfs@master hadoop-mapreduce]$ hadoop jar hadoop-mapreduce-examples.jar wordcount /1daoyun/file/BigDataSkills.txt /1daoyun/output
19/05/03 16:13:07 INFO client.RMProxy: Connecting to ResourceManager at slaver1.hadoop/10.0.0.104:8050
19/05/03 16:13:07 INFO client.AHSProxy: Connecting to Application History server at slaver1.hadoop/10.0.0.104:10200
19/05/03 16:13:08 INFO input.FileInputFormat: Total input paths to process : 1
19/05/03 16:13:08 INFO mapreduce.JobSubmitter: number of splits:1
19/05/03 16:13:08 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1556738815524_0005
19/05/03 16:13:09 INFO impl.YarnClientImpl: Submitted application application_1556738815524_0005
19/05/03 16:13:09 INFO mapreduce.Job: The url to track the job: http://slaver1.hadoop:8088/proxy/application_1556738815524_0005/
19/05/03 16:13:09 INFO mapreduce.Job: Running job: job_1556738815524_0005
19/05/03 16:13:17 INFO mapreduce.Job: Job job_1556738815524_0005 running in uber mode : false
19/05/03 16:13:17 INFO mapreduce.Job: map 0% reduce 0%
19/05/03 16:13:23 INFO mapreduce.Job: map 100% reduce 0%
19/05/03 16:13:30 INFO mapreduce.Job: map 100% reduce 100%
19/05/03 16:13:31 INFO mapreduce.Job: Job job_1556738815524_0005 completed successfully
19/05/03 16:13:31 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=158
FILE: Number of bytes written=295257
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=265
HDFS: Number of bytes written=104
HDFS: Number of read operations=6
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=1
Launched reduce tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=7322
Total time spent by all reduces in occupied slots (ms)=10228
Total time spent by all map tasks (ms)=3661
Total time spent by all reduce tasks (ms)=5114
Total vcore-milliseconds taken by all map tasks=3661
Total vcore-milliseconds taken by all reduce tasks=5114
Total megabyte-milliseconds taken by all map tasks=5623296
Total megabyte-milliseconds taken by all reduce tasks=10473472
Map-Reduce Framework
Map input records=11
Map output records=22
Map output bytes=230
Map output materialized bytes=158
Input split bytes=121
Combine input records=22
Combine output records=12
Reduce input groups=12
Reduce shuffle bytes=158
Reduce input records=12
Reduce output records=12
Spilled Records=24
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=116
CPU time spent (ms)=2220
Physical memory (bytes) snapshot=1325559808
Virtual memory (bytes) snapshot=6946390016
Total committed heap usage (bytes)=1269301248
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=144
File Output Format Counters
Bytes Written=104
[hdfs@master hadoop-mapreduce]$ hadoop fs -cat /1daoyun/output/part-r-00000
docker 1
elasticsearch 1
flume 1
hadoop 5
hbase 1
hive 3
kafka 1
redis 1
solr 1
spark 5
sqoop 1
storm 1
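Conceptually, the wordcount job's mapper emits a (word, 1) pair per token and the reducer sums counts per word; the map-side combiner (visible above as Combine input records=22 / output records=12) pre-aggregates before the shuffle. A minimal local Python sketch of this map/shuffle/reduce flow (an illustration only, not the actual Java implementation in the JAR):

```python
from collections import defaultdict

def map_phase(lines):
    """Mapper: emit (word, 1) for every whitespace-separated token."""
    for line in lines:
        for word in line.split():
            yield word, 1

def reduce_phase(pairs):
    """Reducer: sum the counts per word (the shuffle groups by key)."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

text = ["hadoop spark hive", "hadoop hive"]
print(reduce_phase(map_phase(text)))  # {'hadoop': 2, 'spark': 1, 'hive': 2}
```

The combiner runs the same summing logic on each mapper's local output, which is why Reduce input records (12) is smaller than Map output records (22).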
In the /usr/hdp/2.4.3.0-227/hadoop-mapreduce/ directory on a cluster node there is an example JAR package, hadoop-mapreduce-examples.jar. Run the sudoku program from this JAR to solve the Sudoku puzzle shown in the table below.
[root@master ~]# su hdfs
[hdfs@master ~]$ cd /usr/hdp/2.6.1.0-129/hadoop-mapreduce/
[hdfs@master hadoop-mapreduce]$ hadoop jar hadoop-mapreduce-examples.jar sudoku /opt/puzzle1.dta
Solving /opt/puzzle1.dta
8 1 2 7 5 3 6 4 9
9 4 3 6 8 2 1 7 5
6 7 5 4 9 1 2 8 3
1 5 4 2 3 7 8 9 6
3 6 9 8 4 5 7 2 1
2 8 7 1 6 9 5 3 4
5 2 1 9 7 4 3 6 8
4 3 8 5 2 6 9 1 7
7 9 6 3 1 8 4 5 2
Found 1 solutions
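The printed grid can be sanity-checked locally: a valid Sudoku solution contains the digits 1–9 exactly once in every row, every column, and every 3×3 box. A small Python check, with the grid copied from the output above:

```python
# Solution grid as printed by the sudoku example job.
GRID = [
    [8, 1, 2, 7, 5, 3, 6, 4, 9],
    [9, 4, 3, 6, 8, 2, 1, 7, 5],
    [6, 7, 5, 4, 9, 1, 2, 8, 3],
    [1, 5, 4, 2, 3, 7, 8, 9, 6],
    [3, 6, 9, 8, 4, 5, 7, 2, 1],
    [2, 8, 7, 1, 6, 9, 5, 3, 4],
    [5, 2, 1, 9, 7, 4, 3, 6, 8],
    [4, 3, 8, 5, 2, 6, 9, 1, 7],
    [7, 9, 6, 3, 1, 8, 4, 5, 2],
]

def is_valid_solution(grid):
    """Check that every row, column, and 3x3 box holds digits 1-9."""
    digits = set(range(1, 10))
    rows = grid
    cols = [[grid[r][c] for r in range(9)] for c in range(9)]
    boxes = [[grid[3 * br + r][3 * bc + c]
              for r in range(3) for c in range(3)]
             for br in range(3) for bc in range(3)]
    return all(set(unit) == digits for unit in rows + cols + boxes)

print(is_valid_solution(GRID))  # True
```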
In the /usr/hdp/2.4.3.0-227/hadoop-mapreduce/ directory on a cluster node there is an example JAR package, hadoop-mapreduce-examples.jar. Run the grep program from this JAR to count the number of occurrences of "Hadoop" in the file /1daoyun/file/BigDataSkills.txt, and query the result once the job completes.
[root@master ~]# su hdfs
[hdfs@master ~]$ cd /usr/hdp/2.6.1.0-129/hadoop-mapreduce/
[hdfs@master hadoop-mapreduce]$ hadoop jar hadoop-mapreduce-examples.jar grep /1daoyun/file/BigDataSkills.txt /1daoyun/output hadoop
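The grep example treats its last argument as a regular expression, and matching is case-sensitive by default, so the lowercase pattern `hadoop` above will not match `Hadoop`; the pattern's case must match the file's contents. What the job does, conceptually, is count each regex match and then sort the results by count, which can be sketched locally in Python (an illustration, not the Java implementation):

```python
import re

def grep_count(lines, pattern):
    """Count occurrences of each regex match, then sort by
    descending count, mirroring the grep example's two jobs."""
    counts = {}
    regex = re.compile(pattern)
    for line in lines:
        for match in regex.findall(line):
            counts[match] = counts.get(match, 0) + 1
    return sorted(counts.items(), key=lambda kv: -kv[1])

lines = ["hadoop spark hadoop", "hive hadoop"]
print(grep_count(lines, "hadoop"))  # [('hadoop', 3)]
```

After the job finishes, the count can be read back the same way as in the wordcount exercise, with `hadoop fs -cat /1daoyun/output/part-r-00000`.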