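1. Prepare the code first. The full source is listed at the end of this post.
My Hadoop installation is at /Users/chenxun/software/hadoop-2.8.1, so I created a directory named myclass there and put the source files in it, as shown below: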
[chenxun@chen.local 17:21 ~/software/hadoop-2.8.1/myclass]$ll
total 64
-rw-r--r-- 1 chenxun staff 1017 10 15 15:36 MaxTemperature.java
-rw-r--r-- 1 chenxun staff 977 10 15 15:39 MaxTemperatureMapper.java
-rw-r--r-- 1 chenxun staff 579 10 15 15:39 MaxTemperatureReducer.java
2. Configure the classpath for compilation
Set up the Java environment and add the Hadoop jars needed for compilation to the classpath:
vim ~/.bash_profile
JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_144.jdk/Contents/Home
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export HADOOP_HOME=/Users/chenxun/software/hadoop-2.8.1
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
for f in $HADOOP_HOME/share/hadoop/common/hadoop-*.jar;do
export CLASSPATH=$CLASSPATH:$f
done
for f in $HADOOP_HOME/share/hadoop/hdfs/hadoop-*.jar;do
export CLASSPATH=$CLASSPATH:$f
done
for f in $HADOOP_HOME/share/hadoop/mapreduce/hadoop-*.jar;do
export CLASSPATH=$CLASSPATH:$f
done
for f in $HADOOP_HOME/share/hadoop/yarn/hadoop-*.jar;do
export CLASSPATH=$CLASSPATH:$f
done
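# The lib directories live under share/hadoop; the /* wildcard lets the JVM pick up the jars inside them
export CLASSPATH="$CLASSPATH:$HADOOP_HOME/share/hadoop/common/lib/*:$HADOOP_HOME/share/hadoop/hdfs/lib/*:$HADOOP_HOME/share/hadoop/mapreduce/lib/*:$HADOOP_HOME/share/hadoop/tools/lib/*:$HADOOP_HOME/share/hadoop/yarn/lib/*"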
source ~/.bash_profile
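After sourcing the profile, it can help to check that every classpath entry actually exists on disk. The following is only a minimal sketch: the class name ClasspathCheck and its helper splitEntries are hypothetical and not part of the original setup. It reads the java.class.path system property, which the JVM fills from CLASSPATH when no -cp option is given.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper (not part of the original post): lists classpath
// entries and reports whether each one exists on disk.
class ClasspathCheck {

    // Splits a classpath string on the separator and drops empty entries.
    static List<String> splitEntries(String classpath, String separator) {
        List<String> entries = new ArrayList<>();
        for (String entry : classpath.split(Pattern.quote(separator))) {
            if (!entry.isEmpty()) {
                entries.add(entry);
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        String classpath = System.getProperty("java.class.path", "");
        for (String entry : splitEntries(classpath, File.pathSeparator)) {
            String status = new File(entry).exists() ? "ok     " : "MISSING";
            System.out.println(status + " " + entry);
        }
    }
}
```

Compile it with javac ClasspathCheck.java and run java ClasspathCheck from the same directory; any line marked MISSING points at a path worth re-checking in ~/.bash_profile.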
3. Compile the code and package it into a jar
javac *.java
jar -cvf MaxTemperature.jar .
[chenxun@chen.local 17:21 ~/software/hadoop-2.8.1/myclass]$ll
total 64
-rw-r--r-- 1 chenxun staff 1413 10 15 15:40 MaxTemperature.class
-rw-r--r-- 1 chenxun staff 6333 10 15 16:18 MaxTemperature.jar
-rw-r--r-- 1 chenxun staff 1017 10 15 15:36 MaxTemperature.java
-rw-r--r-- 1 chenxun staff 1876 10 15 15:40 MaxTemperatureMapper.class
-rw-r--r-- 1 chenxun staff 977 10 15 15:39 MaxTemperatureMapper.java
-rw-r--r-- 1 chenxun staff 1687 10 15 15:40 MaxTemperatureReducer.class
-rw-r--r-- 1 chenxun staff 579 10 15 15:39 MaxTemperatureReducer.java
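To confirm that the compiled classes really ended up in the jar, the archive can be inspected with the standard java.util.jar API. This is a rough sketch only: the class name JarContents is hypothetical, and the default file name is assumed to be the MaxTemperature.jar built above.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper (not part of the original post): lists the .class
// entries stored in a jar file.
class JarContents {

    // Returns the names of all entries ending in ".class".
    static List<String> classEntries(File jar) throws IOException {
        List<String> names = new ArrayList<>();
        try (JarFile jarFile = new JarFile(jar)) {
            Enumeration<JarEntry> entries = jarFile.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.endsWith(".class")) {
                    names.add(name);
                }
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the jar sits in the current directory unless a path is given.
        String path = args.length > 0 ? args[0] : "MaxTemperature.jar";
        for (String name : classEntries(new File(path))) {
            System.out.println(name);
        }
    }
}
```

If the driver, mapper and reducer classes are all listed, the jar is ready to be passed to hadoop jar.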
4. Prepare the data
Download the weather data used in the Hadoop example from: ftp://ftp.ncdc.noaa.gov/pub/data/noaa/2010/
I put the weather data into file.txt. The data looks like this:
0029227070999991901122820004+62167+030650FM-12+010299999V0200501N003119999999N0000001N9-01561+99999100061ADDGF108991999999999999999999
0029227070999991901122906004+62167+030650FM-12+010299999V0200901N003119999999N0000001N9-01501+99999100181ADDGF108991999999999999999999
0029227070999991901122913004+62167+030650FM-12+010299999V0200701N002119999999N0000001N9-01561+99999100271ADDGF104991999999999999999999
0029227070999991901122920004+62167+030650FM-12+010299999V0200701N002119999999N0000001N9-02001+99999100501ADDGF107991999999999999999999
0029227070999991901123006004+62167+030650FM-12+010299999V0200701N003119999999N0000001N9-01501+99999100791ADDGF108991999999999999999999
0029227070999991901123013004+62167+030650FM-12+010299999V0200901N003119999999N0000001N9-01331+99999100901ADDGF108991999999999999999999
0029227070999991901123020004+62167+030650FM-12+010299999V0200701N002119999999N0000001N9-01221+99999100831ADDGF108991999999999999999999
0029227070999991901123106004+62167+030650FM-12+010299999V0200701N004119999999N0000001N9-01391+99999100521ADDGF108991999999999999999999
0029227070999991901123113004+62167+030650FM-12+010299999V0200701N003119999999N0000001N9-01391+99999100321ADDGF108991999999999999999999
0029227070999991901123120004+62167+030650FM-12+010299999V0200701N004119999999N0000001N9-01391+99999100281ADDGF108991999999999999999999
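Each record is a fixed-width line, and the mapper later in this post reads the year from columns 15-18, the signed air temperature from columns 87-91 and the quality code from column 92. The sketch below applies those same offsets to the first sample record in plain Java, without any Hadoop dependency. The class name NcdcRecordDemo and its method names are made up for illustration.

```java
// Hypothetical stand-alone demo (not part of the original post): extracts
// the fields the mapper uses from one fixed-width NCDC record.
class NcdcRecordDemo {

    // Year occupies characters 15..18.
    static String year(String line) {
        return line.substring(15, 19);
    }

    // Signed temperature occupies characters 87..91; a leading '+' is
    // skipped in the same way the mapper does.
    static int airTemperature(String line) {
        int start = line.charAt(87) == '+' ? 88 : 87;
        return Integer.parseInt(line.substring(start, 92));
    }

    // Quality code is the single character at index 92.
    static String quality(String line) {
        return line.substring(92, 93);
    }

    public static void main(String[] args) {
        // First sample record from file.txt
        String line = "0029227070999991901122820004+62167+030650FM-12+010299999"
                + "V0200501N003119999999N0000001N9-01561+99999100061ADDGF108991999999999999999999";
        // prints "1901 -156 1"
        System.out.println(year(line) + " " + airTemperature(line) + " " + quality(line));
    }
}
```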
Create the HDFS input directory:
[chenxun@chen.local 16:42 ~/software/hadoop-2.8.1/myclass]$hadoop fs -mkdir -p /user/chenxun/data
[chenxun@chen.local 16:42 ~/software/hadoop-2.8.1/myclass]$hadoop fs -ls /user/chenxun
Found 3 items
drwxr-xr-x - chenxun supergroup 0 2017-10-15 16:42 /user/chenxun/data
drwxr-xr-x - chenxun supergroup 0 2017-10-14 01:54 /user/chenxun/input
drwxr-xr-x - chenxun supergroup 0 2017-10-14 01:55 /user/chenxun/output
Upload the weather data to the input directory:
[chenxun@chen.local 16:47 ~/software/hadoop-2.8.1/myclass]$hadoop fs -put ./data/file.txt /user/chenxun/data
[chenxun@chen.local 16:47 ~/software/hadoop-2.8.1/myclass]$hadoop fs -ls /user/chenxun/data
Found 1 items
-rw-r--r-- 1 chenxun supergroup 9855 2017-10-15 16:47 /user/chenxun/data/file.txt
Run the job:
[chenxun@chen.local 17:10 ~/software/hadoop-2.8.1/myclass]$hadoop jar MaxTemperature.jar MaxTemperature /user/chenxun/data/file.txt /user/chenxun/dataoutput
...
[chenxun@chen.local 17:11 ~/software/hadoop-2.8.1/myclass]$hadoop fs -ls /user/chenxun/dataoutput
Found 2 items
-rw-r--r-- 1 chenxun supergroup 0 2017-10-15 17:11 /user/chenxun/dataoutput/_SUCCESS
-rw-r--r-- 1 chenxun supergroup 9 2017-10-15 17:11 /user/chenxun/dataoutput/part-r-00000
[chenxun@chen.local 17:12 ~/software/hadoop-2.8.1/myclass]$hadoop fs -cat /user/chenxun/dataoutput/part-r-00000
1901 -56
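Each output line holds the year followed by the maximum value the job found. Assuming the usual NCDC convention that air temperature is stored in tenths of a degree Celsius, the value can be converted to degrees as in the sketch below. The class name OutputLineDemo is hypothetical, and the fields are assumed to be separated by whitespace (a tab in the default text output).

```java
// Hypothetical helper (not part of the original post): converts one line
// of the job output into degrees Celsius.
class OutputLineDemo {

    // Assumption: the line is "<year><whitespace><temperature in tenths of a degree>".
    static double toCelsius(String outputLine) {
        String[] fields = outputLine.trim().split("\\s+");
        int tenths = Integer.parseInt(fields[1]);
        return tenths / 10.0;
    }

    public static void main(String[] args) {
        String line = "1901\t-56";
        String year = line.trim().split("\\s+")[0];
        System.out.println(year + ": " + toCelsius(line) + " degrees Celsius");
    }
}
```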
Code:
MaxTemperature.java
import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MaxTemperature {
    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: MaxTemperature <input path> <output path>");
            System.exit(-1);
        }
        // Job.getInstance() replaces the deprecated new Job() constructor
        Job job = Job.getInstance();
        job.setJarByClass(MaxTemperature.class);
        job.setJobName("Max temperature");
        // args[0] is the HDFS input path, args[1] the output directory
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.setMapperClass(MaxTemperatureMapper.class);
        job.setReducerClass(MaxTemperatureReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
MaxTemperatureMapper.java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MaxTemperatureMapper
        extends Mapper<LongWritable, Text, Text, IntWritable> {

    // Value used in the records when no temperature reading is available
    private static final int MISSING = 9999;

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        String year = line.substring(15, 19);
        int airTemperature;
        if (line.charAt(87) == '+') { // parseInt doesn't like leading plus signs
            airTemperature = Integer.parseInt(line.substring(88, 92));
        } else {
            airTemperature = Integer.parseInt(line.substring(87, 92));
        }
        String quality = line.substring(92, 93);
        // Emit (year, temperature) only for present readings with an accepted quality code
        if (airTemperature != MISSING && quality.matches("[01459]")) {
            context.write(new Text(year), new IntWritable(airTemperature));
        }
    }
}
MaxTemperatureReducer.java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class MaxTemperatureReducer
        extends Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    public void reduce(Text key, Iterable<IntWritable> values,
            Context context)
            throws IOException, InterruptedException {
        // Keep the largest temperature seen for this year
        int maxValue = Integer.MIN_VALUE;
        for (IntWritable value : values) {
            maxValue = Math.max(maxValue, value.get());
        }
        context.write(key, new IntWritable(maxValue));
    }
}
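To see what the mapper and reducer above compute without starting a cluster, the same logic can be imitated in plain Java. This is only a sketch under simplifying assumptions: the class LocalMaxTemperature is hypothetical, a TreeMap stands in for Hadoop's grouping by key, and just three of the sample records from file.txt are used.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical local simulation (not part of the original post): applies
// the mapper's parsing and the reducer's max to an in-memory list of lines.
class LocalMaxTemperature {

    static final int MISSING = 9999;

    static Map<String, Integer> maxByYear(List<String> lines) {
        Map<String, Integer> max = new TreeMap<>();
        for (String line : lines) {
            // "Map" step: same offsets and filter as MaxTemperatureMapper
            String year = line.substring(15, 19);
            int start = line.charAt(87) == '+' ? 88 : 87;
            int temperature = Integer.parseInt(line.substring(start, 92));
            String quality = line.substring(92, 93);
            if (temperature != MISSING && quality.matches("[01459]")) {
                // "Reduce" step: keep the maximum per year
                max.merge(year, temperature, Integer::max);
            }
        }
        return max;
    }

    public static void main(String[] args) {
        String prefix = "+62167+030650FM-12+010299999";
        List<String> lines = Arrays.asList(
            "0029227070999991901122820004" + prefix
                + "V0200501N003119999999N0000001N9-01561+99999100061ADDGF108991999999999999999999",
            "0029227070999991901122920004" + prefix
                + "V0200701N002119999999N0000001N9-02001+99999100501ADDGF107991999999999999999999",
            "0029227070999991901123020004" + prefix
                + "V0200701N002119999999N0000001N9-01221+99999100831ADDGF108991999999999999999999");
        // prints "{1901=-122}" for these three records
        System.out.println(maxByYear(lines));
    }
}
```

The three temperatures are -156, -200 and -122, so the maximum for 1901 in this small subset is -122. The real job processes the whole file, which is why its result differs.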