Writing MapReduce output to a database

To write MapReduce output to a database, you need to define an entity class for the target table, set the JDBC driver class and connection parameters in the job Configuration, and set the reduce output key class to that entity class.

1. The table entity class

The entity class must implement both the Writable and DBWritable interfaces. For database output the key method is DBWritable's write(PreparedStatement), which binds the entity's fields to the INSERT statement's parameters; the other required methods (Writable's write(DataOutput)/readFields(DataInput) and DBWritable's readFields(ResultSet)) can be left as no-ops in an output-only job.

public void write(PreparedStatement statement) throws SQLException {
    // Bind the SQL parameters in column order (JDBC indices start at 1)
    int index = 1;
    statement.setLong(index++, this.getAppId());
    statement.setString(index++, this.getVersion());
    statement.setLong(index++, this.getUserId());
}
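The binding order above must match the column list passed to DBOutputFormat.setOutput later. One way to see exactly which JDBC calls this pattern produces, without a real database, is to stub PreparedStatement with a java.lang.reflect.Proxy that records each set* call. The Model class and its field values below are hypothetical stand-ins for the entity class in this article:

```java
import java.lang.reflect.Proxy;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class WriteOrderDemo {

    // Hypothetical stand-in for the article's entity class.
    static class Model {
        long appId = 42L;
        String version = "1.0";
        long userId = 7L;

        // Same binding pattern as DBWritable.write(PreparedStatement).
        void write(PreparedStatement statement) throws SQLException {
            int index = 1;
            statement.setLong(index++, appId);
            statement.setString(index++, version);
            statement.setLong(index++, userId);
        }
    }

    // Records every set* call made against a stubbed PreparedStatement.
    static List<String> recordBindings() throws Exception {
        List<String> calls = new ArrayList<>();
        PreparedStatement stub = (PreparedStatement) Proxy.newProxyInstance(
                WriteOrderDemo.class.getClassLoader(),
                new Class<?>[]{PreparedStatement.class},
                (proxy, method, args) -> {
                    if (method.getName().startsWith("set")) {
                        calls.add(method.getName() + "(" + args[0] + ", " + args[1] + ")");
                    }
                    return null; // the set* methods all return void
                });
        new Model().write(stub);
        return calls;
    }

    public static void main(String[] args) throws Exception {
        recordBindings().forEach(System.out::println);
    }
}
```

Running this prints one line per bound parameter, confirming the 1-based, column-ordered binding that DBOutputFormat's INSERT statement expects.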

2. Set the configuration options in the driver class

Configuration conf = new Configuration();
conf.set(DBConfiguration.DRIVER_CLASS_PROPERTY, ApplicationConfig.JdbcDriver);   // JDBC driver class
conf.set(DBConfiguration.URL_PROPERTY, ApplicationConfig.BigDataDbUrl);          // connection URL
conf.set(DBConfiguration.USERNAME_PROPERTY, ApplicationConfig.BigDataUserName);  // user name
conf.set(DBConfiguration.PASSWORD_PROPERTY, ApplicationConfig.BigDataPassword);  // password
Job job = Job.getInstance(conf, "actionLog"); // new Job(conf, ...) is deprecated
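The same four properties can also be set in a single call with the DBConfiguration.configureDB helper (the ApplicationConfig constants are the same ones used above):

```java
// Equivalent one-call setup (org.apache.hadoop.mapreduce.lib.db.DBConfiguration)
DBConfiguration.configureDB(conf,
        ApplicationConfig.JdbcDriver,       // JDBC driver class
        ApplicationConfig.BigDataDbUrl,     // connection URL
        ApplicationConfig.BigDataUserName,  // user name
        ApplicationConfig.BigDataPassword); // password
```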

3. Set the output format to DBOutputFormat. Note that DBOutputFormat can only perform INSERTs; if you need a custom SQL statement, you have to write your own OutputFormat.

job.setOutputFormatClass(DBOutputFormat.class);
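One common way to go beyond plain INSERTs is to subclass DBOutputFormat and override its public constructQuery(table, fieldNames) method, which builds the SQL that the record writer prepares. The query-building logic can be sketched as a plain function; the class and method names below are hypothetical, and the upsert syntax assumes MySQL:

```java
public class UpsertQueryDemo {

    // Builds "INSERT ... VALUES (?, ...) ON DUPLICATE KEY UPDATE ..." (MySQL syntax).
    // A DBOutputFormat subclass would return this string from an overridden constructQuery.
    static String upsertQuery(String table, String[] fields) {
        StringBuilder q = new StringBuilder("INSERT INTO ").append(table).append(" (");
        for (int i = 0; i < fields.length; i++) {
            q.append(i > 0 ? ", " : "").append(fields[i]);
        }
        q.append(") VALUES (");
        for (int i = 0; i < fields.length; i++) {
            q.append(i > 0 ? ", " : "").append("?");
        }
        q.append(") ON DUPLICATE KEY UPDATE ");
        for (int i = 0; i < fields.length; i++) {
            q.append(i > 0 ? ", " : "")
             .append(fields[i]).append(" = VALUES(").append(fields[i]).append(")");
        }
        return q.toString();
    }

    public static void main(String[] args) {
        System.out.println(upsertQuery("event_count", new String[]{"app_id", "version"}));
    }
}
```

The parameter placeholders still follow the column order, so the entity's write(PreparedStatement) method works unchanged.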

4. Configure the output table and its columns

DBOutputFormat.setOutput(job, "table_name", "field1", "field2" /* ... */);

5. Set the reduce output key class to the entity class and the value class to NullWritable

job.setOutputKeyClass(EventCountBreadModel.class);
job.setOutputValueClass(NullWritable.class);

6. Emit the entity from the reducer

@Override
protected void reduce(Text key, Iterable<Text> values, Context context)
        throws IOException, InterruptedException {
    // Build the entity; its fields would be populated from the aggregated values
    EventCountBreadModel model = new EventCountBreadModel();
    context.write(model, NullWritable.get());
}
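Putting the steps together, a driver could look like the sketch below. ActionLogDriver, ActionLogMapper, and ActionLogReducer are hypothetical names, and the map output types are assumed to be Text:

```java
// End-to-end driver sketch, assuming the classes named in this article exist
Configuration conf = new Configuration();
conf.set(DBConfiguration.DRIVER_CLASS_PROPERTY, ApplicationConfig.JdbcDriver);
conf.set(DBConfiguration.URL_PROPERTY, ApplicationConfig.BigDataDbUrl);
conf.set(DBConfiguration.USERNAME_PROPERTY, ApplicationConfig.BigDataUserName);
conf.set(DBConfiguration.PASSWORD_PROPERTY, ApplicationConfig.BigDataPassword);

Job job = Job.getInstance(conf, "actionLog");
job.setJarByClass(ActionLogDriver.class);
job.setMapperClass(ActionLogMapper.class);       // hypothetical mapper
job.setReducerClass(ActionLogReducer.class);     // hypothetical reducer
job.setMapOutputKeyClass(Text.class);            // assumed map output types
job.setMapOutputValueClass(Text.class);
job.setOutputFormatClass(DBOutputFormat.class);
DBOutputFormat.setOutput(job, "table_name", "field1", "field2" /* ... */);
job.setOutputKeyClass(EventCountBreadModel.class);
job.setOutputValueClass(NullWritable.class);
System.exit(job.waitForCompletion(true) ? 0 : 1);
```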
