[hadoop] Running the official hadoop-mapreduce-examples wordcount demo on Hadoop 3.0.0+

Contents

Preface

1. Put the data on HDFS

2. Run wordcount

3. View the results

4. Other things worth learning


Preface

Previous post: [hadoop] Running the official hadoop-mapreduce-examples pi demo on Hadoop 3.0.0+ (a pitfall diary)

This time it's the wordcount demo.

Going from environment setup to a finished run to this write-up took about two hours; the demo itself should be quick.

1. Put the data on HDFS

Q1: How do I upload files to my HDFS?

This uploads one file to the HDFS root; the trailing / is the destination directory:

hadoop fs -put /Users/bjhl/Documents/工作记录/hadoop/data.txt /

Then list what is there:

$ hadoop fs -ls /

2021-01-26 20:50:05,491 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Found 3 items

-rw-r--r--   1 bjhl supergroup      20943 2021-01-26 20:49 /data.txt

drwx------   - bjhl supergroup          0 2021-01-26 20:01 /tmp

drwxr-xr-x   - bjhl supergroup          0 2021-01-26 20:01 /user

You can view the actual file contents at http://localhost:9870/explorer.html#/ , and you can also upload through that web UI instead.

(Screenshot: the HDFS web explorer)
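As a quick sanity check around the upload, the local byte count should match the size column that -ls prints (20943 for data.txt above). A minimal sketch, using a hypothetical sample file rather than the real data.txt:

```shell
# Hypothetical stand-in for data.txt.
printf 'hello hadoop\nhello hdfs\n' > sample.txt

# Local byte count; for the real data.txt this prints 20943,
# the same number shown in the `hadoop fs -ls /` size column.
wc -c < sample.txt
```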

2. Run wordcount

hadoop jar hadoop-mapreduce-examples-3.3.0.jar wordcount /data.txt /wordcount

Note that the output directory /wordcount must not exist yet; to rerun the job, remove it first with hadoop fs -rm -r /wordcount, otherwise the job fails with an "output directory already exists" error.



2021-01-26 21:13:41,841 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2021-01-26 21:13:42,311 INFO client.RMProxy: Connecting to ResourceManager at localhost/127.0.0.1:18040
2021-01-26 21:13:42,896 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/bjhl/.staging/job_1611662749925_0002
2021-01-26 21:13:43,046 INFO input.FileInputFormat: Total input files to process : 1
2021-01-26 21:13:43,088 INFO mapreduce.JobSubmitter: number of splits:1
2021-01-26 21:13:43,180 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1611662749925_0002
2021-01-26 21:13:43,182 INFO mapreduce.JobSubmitter: Executing with tokens: []
2021-01-26 21:13:43,313 INFO conf.Configuration: resource-types.xml not found
2021-01-26 21:13:43,313 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2021-01-26 21:13:43,361 INFO impl.YarnClientImpl: Submitted application application_1611662749925_0002
2021-01-26 21:13:43,397 INFO mapreduce.Job: The url to track the job: http://localhost:18088/proxy/application_1611662749925_0002/
2021-01-26 21:13:43,397 INFO mapreduce.Job: Running job: job_1611662749925_0002
2021-01-26 21:13:48,460 INFO mapreduce.Job: Job job_1611662749925_0002 running in uber mode : false
2021-01-26 21:13:48,461 INFO mapreduce.Job:  map 0% reduce 0%
2021-01-26 21:13:52,523 INFO mapreduce.Job:  map 100% reduce 0%
2021-01-26 21:13:57,568 INFO mapreduce.Job:  map 100% reduce 100%
2021-01-26 21:13:57,582 INFO mapreduce.Job: Job job_1611662749925_0002 completed successfully
2021-01-26 21:13:57,670 INFO mapreduce.Job: Counters: 50
	File System Counters
		FILE: Number of bytes read=20617
		FILE: Number of bytes written=511487
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=21038
		HDFS: Number of bytes written=14452
		HDFS: Number of read operations=8
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=2
		HDFS: Number of bytes read erasure-coded=0
	Job Counters 
		Launched map tasks=1
		Launched reduce tasks=1
		Data-local map tasks=1
		Total time spent by all maps in occupied slots (ms)=1848
		Total time spent by all reduces in occupied slots (ms)=1904
		Total time spent by all map tasks (ms)=1848
		Total time spent by all reduce tasks (ms)=1904
		Total vcore-milliseconds taken by all map tasks=1848
		Total vcore-milliseconds taken by all reduce tasks=1904
		Total megabyte-milliseconds taken by all map tasks=1892352
		Total megabyte-milliseconds taken by all reduce tasks=1949696
	Map-Reduce Framework
		Map input records=177
		Map output records=3752
		Map output bytes=35769
		Map output materialized bytes=20617
		Input split bytes=95
		Combine input records=3752
		Combine output records=1552
		Reduce input groups=1552
		Reduce shuffle bytes=20617
		Reduce input records=1552
		Reduce output records=1552
		Spilled Records=3104
		Shuffled Maps =1
		Failed Shuffles=0
		Merged Map outputs=1
		GC time elapsed (ms)=66
		CPU time spent (ms)=0
		Physical memory (bytes) snapshot=0
		Virtual memory (bytes) snapshot=0
		Total committed heap usage (bytes)=530055168
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
	File Input Format Counters 
		Bytes Read=20943
	File Output Format Counters 
		Bytes Written=14452
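A few of the counters above can be cross-checked against each other: Map output records (3752) equals Combine input records, Combine output records (1552) equals Reduce input records since only the combined records cross the shuffle, and one common reading of Spilled Records (3104) is the map-side spill plus the reduce-side merge spill. A quick arithmetic check, with the numbers copied from the log:

```shell
map_output=3752       # Map output records
combine_output=1552   # Combine output records
reduce_input=1552     # Reduce input records
spilled=3104          # Spilled Records

# Only combined records cross the shuffle to the reducer.
[ "$combine_output" -eq "$reduce_input" ] && echo "shuffle carries combined records"

# Map-side spill plus reduce-side spill accounts for Spilled Records here.
[ $((combine_output + reduce_input)) -eq "$spilled" ] && echo "spill count adds up"
```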

3. View the results

$ hadoop fs -cat /wordcount/part-r-00000
2021-01-26 21:15:02,281 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
"About	1
"Am	1
"And	1
"Are	2
"Coming	1
"Constantine	1
"Dear,	1
"Did	1
"Do	1
"Either	1
"H'm!	1
"Hardly	1
"He	4
"I	13
"I'd	1
"I've	1
"It	4
"It's	2
"Lieutenant	2
"Lieutenant50	1
"Most	1
"Mother	1
"Necia!	1
"No	1
"No.	4
"Now	1
"Of	3
"Oh!	2
"Oh,	2
"Poleon	1
"Shakespeare	1
"So	1
"Some	1
"That	2
"The	3
"Then	1
"They	1
"This	1
"Ugh!	1
"Well!	1
"Well,	1
"What	1
"When	1
"Where	1
"Which	1
"Why,	1
"Yes.	1
"You	4
"and	2
"breeds,"	1
"divils	1
"up-river,"	1
'New	2
'breeds'	1
'savvy53'	1
A	2
ARE	1
Age	1
American	1
And	1
Arctic	1
At	5
Barge56."	1
Barnum	3
Beaded	1
Being	1
Beyond	1
Both	1
Broken	1
Burrell	6
Burrell,	1
Burrells	3
Canadian	1
Chandelar	1
Cheechakos.	1
Coming	1
Constantine's	1
Country'!"	1
Country,'	1
Creek10"	1
Dawson	1
Dawson,	1
Dear	1
Department,	1
Doret	1
Doret—who	1
Down-stream,	1
Each	1
Egyptian	1
Even	1
Every	1
Father	3
Flambeau,	1
Flambeau.	1
For	1
Forty	1
Francisco;	1
Frankfort	1
From	1
Gale	2
Gale's	1
Gale,	2
Gale.	1
Gale8's,	1
George."	1
Good-bye!"	1
He	18
He's	1
Her	4
His	3
How	2
I	41
I'll	1
I'm	1
I've	1
If	2
In	1
Indian	4
Instead,	1
It	5
I—I—"	1
John."	1
Kentuckian;	1
Kentucky	1
Kentucky."	1
Koyukuk	1
Koyukuk,	1
Lake	1
Le	1
Lee,	2
Lieutenant	1
Lieutenant's	1
Lower	1
Man	2
Many	1
Maybe	1
Meade	2
Meades	1
Mile,	1
Miss	1
Mission	1
Mission.	1
Molly	1
Moreover,	2
Mounted	1
Necia	5
Necia!"	1
Necia,	1
Necia.	4
New	1
Nor	1
North	1
North.	2
Not	1
Now	1
Oh,	1
Old	2
On	1
Perhaps	1
Poleon	6
Poleon!"	1
Poleon—he	1
Police	1
QUITE	1
Ramparts,	1
Reason,	1
Resting	1
San	1
Seattle,	1
Shakespeare	1
She	8
Siwashes,	1
Some	3
South,	1
Squaws	1
Stars	1
States,	2
Stripes	1
That	2
The	19
Their	1
There	1
There's	1
Therefore,	1
They	5
This	1
Those	1
To	1
Unconsciously	1
Washington,	1
Washington.	1
We	4
What	2
Where	1
Who	1
Why?"	1
Yankee	1
Yankee,"	1
York."	1
You	2
Yukon	1
a	78
abashed76	1
able	1
about	5
about—like	1
absorbing	1
accepting,	1
account,	1
added	2
added,	1
adjust	1
admiring	1
admitted.	1
affairs	1
afire.	1
after	1
afternoon	2
afternoon.	1
again	3
again,	4
against	2
ago,"	1
ahead	1
ain't	2
air,	1
alder38,	1
all	11
all!"	1
all,	2
along	1
aloud,	1
aloud.	1
already	1
already.	1
also,	1
altered	1
although	2
always?"	1
am	5
am,	1
am,"	1
among	3
an	12
and	127
and,	3
and—"	1
angry	1
angry."	1
ankle,	1
announced,	2
another	1
answer	1
answered	1
answered.	1
any	8
anybody	1
anything	1
anywhere	1
appear	1
approaching	2
approaching,	1
approved	1
are	20
are!	1
are,	1
arose.	1
around	3
around?"	1
arrangement	1
arrested,	1
as	31
asked	1
at	33
ate	1
attempt	1
autumn	1
away	3
away,	2
awkwardly,	1
back	5
back,	1
back.	1
back;	1
bacon,	1
bad—and	1
bank	1
banks	1
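Notice how punctuation stays glued to the words ("About, Necia!", and Gale, vs Gale.): the WordCount example tokenizes on whitespace only, so "He and He count as different words. For a small file you can reproduce the same counting logic locally with standard tools; this is a sketch on hypothetical sample text, not the MapReduce job itself:

```shell
# Hypothetical sample text standing in for data.txt.
printf '"He said hi. He left.\n' > sample.txt

# Whitespace-only tokenization, like the example's mapper:
# '"He' and 'He' end up as distinct keys.
tr -s '[:space:]' '\n' < sample.txt | sort | uniq -c
```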

4. Other things worth learning

How object storage works, e.g. Amazon S3 and OSS.

The NameNode's single-active design, how file reads on HDFS actually proceed, and what the meta data is.

Next up: writing wordcount by hand.