Environment: Hadoop 1.0.4
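When writing Hadoop MapReduce jobs, it is tempting to wish that Configuration could carry an instance of a custom class. The Configuration API shows that its set methods mostly take basic types such as int, String, or double. Is there a method that stores a custom object instead, something like setClass? As it happens, a method with that name does exist.
From the API documentation: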
setClass
public void setClass(String name,
                     Class<?> theClass,
                     Class<?> xface)
Set the value of the name property to the name of a theClass implementing the given interface xface. An exception is thrown if theClass does not implement the interface xface.
Parameters:
name - property name.
theClass - property value.
xface - the interface implemented by the named class.
However, this is not quite what I was hoping for. After some searching, it turns out that setClass stores a type (a Class), not an instance. It is used like this:
/**
 * Look up a type by property name
 */
public static void testSetClass() {
    Configuration conf = new Configuration();
    conf.setClass("mapout", LongWritable.class, Writable.class);
    Class<?> b = conf.getClass("mapout", Writable.class);
    System.out.println(b);
    conf.setClass("myClass", User.class, Person.class);
    b = conf.getClass("myClass", Person.class);
    System.out.println(b);
}
This prints (User is a subclass of Person):
class org.apache.hadoop.io.LongWritable
class org.fz.testconfig.User
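To make the "type, not instance" point concrete, here is a minimal sketch that uses only the JDK. ClassPropertySketch and getRaw are hypothetical names: the class is a stand-in that imitates the contract quoted from the API documentation above (store the class name, reject a class that does not implement xface). It is not Hadoop's implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for Configuration. It only imitates the documented
// contract of setClass/getClass and is NOT Hadoop's code.
class ClassPropertySketch {
    private final Map<String, String> props = new HashMap<String, String>();

    // Stores the class NAME as a string, after checking that theClass
    // implements or extends xface.
    void setClass(String name, Class<?> theClass, Class<?> xface) {
        if (!xface.isAssignableFrom(theClass)) {
            throw new RuntimeException(theClass + " is not a " + xface.getName());
        }
        props.put(name, theClass.getName());
    }

    // Resolves the stored name back to a Class, or returns the default if unset.
    Class<?> getClass(String name, Class<?> defaultValue) {
        String value = props.get(name);
        if (value == null) {
            return defaultValue;
        }
        try {
            return Class.forName(value);
        } catch (ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
    }

    // Exposes the raw string that was actually stored (hypothetical helper).
    String getRaw(String name) {
        return props.get(name);
    }

    public static void main(String[] args) {
        ClassPropertySketch conf = new ClassPropertySketch();
        conf.setClass("listImpl", ArrayList.class, List.class);
        System.out.println(conf.getRaw("listImpl"));               // java.util.ArrayList
        System.out.println(conf.getClass("listImpl", List.class)); // class java.util.ArrayList
    }
}
```

Only a class name travels through the property, so nothing about a particular object's fields can survive the trip.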
So setClass really only records a type, which did not seem very useful here. If the goal is just to obtain a type inside the setup method of a Mapper or Reducer, the following also works:
/**
 * Look up a type by its fully qualified name
 * @throws ClassNotFoundException
 */
public static void testGetClassByName() throws ClassNotFoundException {
    Configuration conf = new Configuration();
    Class<?> a = conf.getClassByName("org.apache.hadoop.io.LongWritable");
    System.out.println(a);
    a = conf.getClassByName("org.fz.testconfig.User");
    System.out.println(a);
}
This prints:
class org.apache.hadoop.io.LongWritable
class org.fz.testconfig.User
getClassByName also returns the class, but it cannot return an instance of that class with its field values. So how can an instance be passed along?
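To see why a class lookup alone is not enough, consider this small JDK-only sketch. ClassVersusInstance and newFromName are hypothetical names, and java.util.ArrayList stands in for a custom class: an object built from a class name starts from its default state and knows nothing about values set elsewhere.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: a class found by name yields a fresh object,
// not the object that was configured somewhere else.
class ClassVersusInstance {
    // Builds a new object from a class name via its no-arg constructor.
    static Object newFromName(String className) {
        try {
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        List<String> configured = new ArrayList<String>();
        configured.add("user1"); // state set up on the driver side

        List<?> rebuilt = (List<?>) newFromName("java.util.ArrayList");
        System.out.println(configured.size()); // 1
        System.out.println(rebuilt.size());    // 0: same class, none of the state
    }
}
```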
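An object can be converted to a JSON string, a string can be stored with Configuration's set method, and it can be read back with get. So a small helper class can do the conversion in both directions: object to string, and string back to object. The JSON library used here is Alibaba's fastjson, which can be downloaded from https://github.com/alibaba/fastjson.
The conversion class is as follows: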
package org.fz.testconfig;

import org.apache.hadoop.conf.Configuration;

public class ConfigurationUtil {
    /**
     * Store a custom object in the Configuration as a JSON string
     * @param key property name
     * @param conf Configuration
     * @param userDefineObject the custom object to store
     */
    public static void setClass(String key, Configuration conf, Object userDefineObject) {
        String userStr = com.alibaba.fastjson.JSON.toJSON(userDefineObject).toString();
        conf.set(key, userStr);
    }

    /**
     * Read a custom object back from the Configuration
     * @param key property name
     * @param conf Configuration
     * @param classType type of the object to return
     * @return the object parsed from the stored JSON string
     */
    public static Object getClass(String key, Configuration conf, Class<?> classType) {
        String str = conf.get(key);
        Object object = com.alibaba.fastjson.JSON.parseObject(str, classType);
        return object;
    }
}
A quick round-trip test:
/**
 * Store a custom object and read it back
 */
public static void testSetClassReal() {
    Configuration conf = new Configuration();
    User u = new User("test", 11);
    /* Inline equivalent of ConfigurationUtil.setClass:
    String userStr = com.alibaba.fastjson.JSON.toJSON(u).toString();
    conf.set("USER", userStr); */
    ConfigurationUtil.setClass("USER", conf, u);
    /* Inline equivalent of ConfigurationUtil.getClass:
    String returnStr = conf.get("USER");
    User user = com.alibaba.fastjson.JSON.parseObject(returnStr, User.class); */
    User user = (User) ConfigurationUtil.getClass("USER", conf, User.class);
    System.out.println(user.getAge());
    System.out.println(user.getPersons());
}
This prints:
11
[org.fz.testconfig.Person@10358032]
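The same "object to string and back" idea does not strictly need a JSON library. As a rough sketch of a swapped-in technique (not the approach used in this post), the object can be written with JDK serialization and Base64-encoded so that it fits in a string-valued property. The assumptions here: Java 8 or later for java.util.Base64, a plain Map standing in for Configuration's set/get, and a hypothetical Point class instead of User.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

class SerializedStringRoundTrip {
    // Hypothetical value class for the demo; not the User class from this post.
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final String label;
        final int x;
        Point(String label, int x) { this.label = label; this.x = x; }
    }

    // Object -> bytes -> Base64 text that fits in a string-valued property.
    static String encode(Serializable obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new IllegalStateException("serialization failed", e);
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    // Base64 text -> bytes -> object of the expected type.
    static <T> T decode(String text, Class<T> type) {
        byte[] raw = Base64.getDecoder().decode(text);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return type.cast(in.readObject());
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("deserialization failed", e);
        }
    }

    public static void main(String[] args) {
        // The map is a stand-in for conf.set(key, value) / conf.get(key).
        Map<String, String> props = new HashMap<String, String>();
        props.put("POINT", encode(new Point("p1", 7)));

        Point restored = decode(props.get("POINT"), Point.class);
        System.out.println(restored.label + " " + restored.x); // p1 7
    }
}
```

The trade-off is that the stored value is no longer human-readable, and every class involved has to implement Serializable, whereas the JSON form can be inspected directly in the job configuration.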
For completeness, here is the code for User and Person:
package org.fz.testconfig;

import java.util.ArrayList;
import java.util.List;

public class User extends Person {
    private String name;
    private int age;
    private int[] arr;
    // Typed list so the JSON parser can rebuild the elements as Person objects
    private List<Person> persons = new ArrayList<Person>();

    // No-arg constructor, used when the object is rebuilt from JSON
    public User() {}

    public User(String name, int age) {
        this.age = age;
        this.name = name;
        this.persons.add(new Person("p1"));
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public int getAge() {
        return age;
    }

    public void setAge(int age) {
        this.age = age;
    }

    public int[] getArr() {
        return arr;
    }

    public void setArr(int[] arr) {
        this.arr = arr;
    }

    public List<Person> getPersons() {
        return persons;
    }

    public void setPersons(List<Person> persons) {
        this.persons = persons;
    }
}
package org.fz.testconfig;

public class Person {
    private String name;

    public Person() {}

    public Person(String name) {
        this.name = name;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }
}
Next, write the following test class:
package org.fz.testconfig;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class LoadDriverTest {
    private static Logger logger = LoggerFactory.getLogger(LoadDriverTest.class);

    public static void main(String[] args) throws IOException, InterruptedException, ClassNotFoundException {
        if (args.length != 2) {
            System.err.println("Usage: LoadDriverTest <input> <output>");
            return;
        }
        Configuration conf = new Configuration();
        conf.set("mapred.job.tracker", "master:9001");
        conf.set("fs.default.name", "master:9000");
        // Put the custom object into the Configuration before the Job is created
        User user = new User("user1", 22);
        ConfigurationUtil.setClass("USER", conf, user);
        Job job = new Job(conf, "text2vectorWritable with input:" + args[0]);
        job.setMapperClass(LoadMapper.class);
        job.setMapOutputKeyClass(LongWritable.class);
        job.setMapOutputValueClass(Text.class);
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);
        job.setNumReduceTasks(0); // map-only job
        job.setJarByClass(LoadDriverTest.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        if (!job.waitForCompletion(true)) {
            throw new InterruptedException("Text to VectorWritable Job failed processing " + args[0]);
        }
    }

    public static class LoadMapper extends Mapper<LongWritable, Text, LongWritable, Text> {
        private User user = null;

        @Override
        public void setup(Context cxt) {
            // Rebuild the User object from the JSON string stored by the driver
            user = (User) ConfigurationUtil.getClass("USER", cxt.getConfiguration(), User.class);
            logger.info("user:" + user.getAge() + "," + user.getName());
        }

        @Override
        public void map(LongWritable key, Text value, Context cxt) throws IOException, InterruptedException {
            String v = user.getName() + "," + user.getAge() + "," + user.getPersons().toString();
            cxt.write(key, new Text(v));
        }
    }
}
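Package the classes into a jar and put it in the lib directory of the Hadoop cluster, together with the fastjson jar downloaded earlier. Run the job, then check the job's log and its output file.
First, the log:
It shows that the user information was indeed read in the mapper. At this point there is little need to check the output file as well, but here it is anyway:
It shows the same values.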
When reposting, please credit the blog: http://blog.csdn.net/fansy1990