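Spring Batch partitioning example: reading a database in ranges

In Spring Batch, partitioning means using multiple threads, each of which processes one range of the data. Suppose a database table holds 100 records with a primary key id running from 1 to 100, and you want to process all 100 records. Normally a single thread starts at 1 and works through to 100, and the whole run takes about 10 minutes.

With partitioning, we can start 10 threads, each processing 10 records (split by id range). In the ideal case, the run then finishes in about 1 minute.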
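
The idea can be illustrated without Spring Batch at all. The following is a minimal plain-Java sketch under stated assumptions, not the framework's implementation: a fixed pool of threads, each handed its own id range. The class name `PartitionIdea` and the method `processRange` are hypothetical stand-ins for real row processing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PartitionIdea {

    // Hypothetical stand-in for real work: "processes" ids fromId..toId
    // and returns how many records it handled.
    static int processRange(int fromId, int toId) {
        int handled = 0;
        for (int id = fromId; id <= toId; id++) {
            handled++; // real code would read and process the row with this id
        }
        return handled;
    }

    // Splits ids 1..(threads * range) into equal ranges and processes
    // each range on its own thread. Returns the total number handled.
    static int processAll(int threads, int range) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                final int fromId = i * range + 1;
                final int toId = (i + 1) * range;
                Callable<Integer> task = () -> processRange(fromId, toId);
                futures.add(pool.submit(task));
            }
            int total = 0;
            for (Future<Integer> f : futures) {
                total += f.get(); // wait for each partition to finish
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("partition failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // 10 threads x 10 records each
        System.out.println("Records processed: " + processAll(10, 10));
    }
}
```

Running `main` prints `Records processed: 100`. The one-minute figure in the text assumes the ranges really do run in parallel and that the database is not the bottleneck.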

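To implement partitioning, you must understand the structure of the input data so that you can plan the ranges properly.

This article shows how to create a partitioned job with 10 threads, each reading records from the database according to the id range it is given.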

create table users(
id int(6) not null auto_increment,                  
user_login varchar(50),
user_pass varchar(50),
age int(6),
primary key(id)
)default charset=utf8;
First, create a class that implements the Partitioner interface and puts the partitioning range into the ExecutionContext. Later, the same fromId and toId keys are referenced in the batch job XML file.
package com.mkyong.partition;

import java.util.HashMap;
import java.util.Map;

import org.springframework.batch.core.partition.support.Partitioner;
import org.springframework.batch.item.ExecutionContext;

public class RangePartitioner implements Partitioner {

	@Override
	public Map<String, ExecutionContext> partition(int gridSize) {

		Map<String, ExecutionContext> result = new HashMap<String, ExecutionContext>();

		int range = 10;
		int fromId = 1;
		int toId = range;

		for (int i = 1; i <= gridSize; i++) {
			ExecutionContext value = new ExecutionContext();

			System.out.println("\nStarting : Thread" + i);
			System.out.println("fromId : " + fromId);
			System.out.println("toId : " + toId);

			value.putInt("fromId", fromId);
			value.putInt("toId", toId);

			// give each thread a name
			value.putString("name", "Thread" + i);

			result.put("partition" + i, value);

			fromId = toId + 1;
			toId += range;

		}

		return result;
	}

}
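
The partitioner above hard-codes `range = 10`, which only fits the 100-row, 10-thread case. As a hedged sketch of one way to generalize it, the range can be derived from the smallest and largest id and the grid size. The class `RangeCalculator` is hypothetical and deliberately free of Spring types; in a real Partitioner the minimum and maximum id would have to come from somewhere else, for example a query against the table, which is an assumption and not part of the original article.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class RangeCalculator {

    // Returns partition name -> {fromId, toId}, covering minId..maxId
    // with at most gridSize ranges. Assumes gridSize > 0.
    static Map<String, int[]> ranges(int minId, int maxId, int gridSize) {
        Map<String, int[]> result = new LinkedHashMap<>();
        int total = maxId - minId + 1;
        // Ceiling division, so the trailing ids are not dropped when
        // total is not evenly divisible by gridSize.
        int range = (total + gridSize - 1) / gridSize;
        int fromId = minId;
        for (int i = 1; i <= gridSize && fromId <= maxId; i++) {
            int toId = Math.min(fromId + range - 1, maxId);
            result.put("partition" + i, new int[] { fromId, toId });
            fromId = toId + 1;
        }
        return result;
    }

    public static void main(String[] args) {
        // 95 ids over 10 partitions: nine ranges of 10 and a last range of 5
        ranges(1, 95, 10).forEach((name, r) ->
                System.out.println(name + " : " + r[0] + " - " + r[1]));
    }
}
```

Note that ranges computed this way are only balanced when the ids are dense; if many ids are missing, some partitions will hold far fewer rows than others.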
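Review the batch job XML file; most of it is self-explanatory. A few points are worth noting:
1. For the partitioner, grid-size is the number of threads.
2. For the pagingItemReader bean (a JDBC reader example), the values of #{stepExecutionContext[fromId]} and #{stepExecutionContext[toId]} are injected from the ExecutionContext that rangePartitioner creates.
3. For the itemProcessor bean, the value of #{stepExecutionContext[name]} is likewise injected from the ExecutionContext that rangePartitioner creates.
4. For the writers, each thread writes its records to a different CSV file, named in the format users.processed[fromId]-[toId].csv.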
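
To make point 4 concrete, here is a small sketch of how a per-partition file name follows from the values stored in that partition's context. A plain `Map` stands in for the step's ExecutionContext, and the class `OutputFileNames` is hypothetical; in the actual job the substitution is done by the `#{stepExecutionContext[...]}` expressions, not by code like this.

```java
import java.util.HashMap;
import java.util.Map;

class OutputFileNames {

    // Builds the output file name from the fromId and toId values
    // that a partition's context holds.
    static String fileNameFor(Map<String, Object> stepContext) {
        return "users.processed" + stepContext.get("fromId")
                + "-" + stepContext.get("toId") + ".csv";
    }

    public static void main(String[] args) {
        // Values as the partitioner would store them for the second partition
        Map<String, Object> context = new HashMap<>();
        context.put("fromId", 11);
        context.put("toId", 20);
        context.put("name", "Thread2");

        System.out.println(fileNameFor(context)); // users.processed11-20.csv
    }
}
```

Because every partition has its own fromId and toId, no two threads write to the same file.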

job-partitioner.xml

[The XML listing is not preserved in this copy; see the original article linked at the end.]

The item processor class prints the item being processed and the name of the thread that is currently running.

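import org.springframework.batch.item.ItemProcessor;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Scope;
import org.springframework.stereotype.Component;

import com.mkyong.User;

@Component("itemProcessor")
@Scope(value = "step")
public class UserProcessor implements ItemProcessor<User, User> {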

	@Value("#{stepExecutionContext[name]}")
	private String threadName;

	@Override
	public User process(User item) throws Exception {

		System.out.println(threadName + " processing : " + item.getId() + " : " + item.getUsername());

		return item;
	}

	public String getThreadName() {
		return threadName;
	}

	public void setThreadName(String threadName) {
		this.threadName = threadName;
	}

}
Load everything and run it. Ten threads are started, each processing its own range of data.

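import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.context.ApplicationContext;
import org.springframework.context.support.ClassPathXmlApplicationContext;

public class App {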

	public static void main(String[] args) {

		App obj = new App();
		obj.run();

	}

	private void run() {

		String[] springConfig = { "spring/batch/jobs/job-partitioner.xml" };

		ApplicationContext context = new ClassPathXmlApplicationContext(springConfig);

		JobLauncher jobLauncher = (JobLauncher) context.getBean("jobLauncher");
		Job job = (Job) context.getBean("partitionJob");

		try {

			//JobParameters param = new JobParametersBuilder().addString("age", "20").toJobParameters();

			JobExecution execution = jobLauncher.run(job, new JobParameters());
			System.out.println("Exit Status : " + execution.getStatus());
			System.out.println("Failure Exceptions : " + execution.getAllFailureExceptions());

		} catch (Exception e) {
			e.printStackTrace();
		}

		System.out.println("Done");

	}

}

Console output:

Starting : Thread1
fromId : 1
toId : 10


Starting : Thread2
fromId : 11
toId : 20


Starting : Thread3
fromId : 21
toId : 30

Translator's note: before running this program, populate the database using the following program:

package com.mkyong.data;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Random;

public class InsertData {

	private static final String DB_DRIVER = "com.mysql.jdbc.Driver";
	private static final String DB_CONNECTION = "jdbc:mysql://localhost:3306/test";
	private static final String DB_USER = "root";
	private static final String DB_PASSWORD = "";

	public static void main(String[] argv) {

		try {

			insertDataIntoUserTable();

		} catch (Exception e) {

			System.out.println(e.getMessage());

		}

	}

	private static void insertDataIntoUserTable() throws Exception {

		Connection dbConnection = null;
		Statement statement = null;

		try {

			dbConnection = getDBConnection();
			statement = dbConnection.createStatement();
			
			Random generator = new Random();

			for (int i = 1; i <= 100; i++) {

				String insertTableSQL = "INSERT INTO USERS (ID, USER_LOGIN, USER_PASS, AGE) VALUES (':id',':name',':pass',':age')";

				insertTableSQL = insertTableSQL.replaceAll(":id", String.valueOf(i));
				insertTableSQL = insertTableSQL.replaceAll(":name", "user_" + i);
				insertTableSQL = insertTableSQL.replaceAll(":pass", "pass_" + i);

				//insertTableSQL = insertTableSQL.replaceAll(":age", String.valueOf(10 + (int) (Math.random() * 90)));
				insertTableSQL = insertTableSQL.replaceAll(":age", String.valueOf(generator.nextInt(100)));
				
				System.out.println(insertTableSQL);

				statement.addBatch(insertTableSQL);

			}

			statement.executeBatch();

			System.out.println("Records inserted into USERS table!");

		} catch (Exception e) {

			System.out.println(e.getMessage());

		} finally {

			if (statement != null) {
				statement.close();
			}

			if (dbConnection != null) {
				dbConnection.close();
			}

		}

	}

	private static Connection getDBConnection() {

		Connection dbConnection = null;

		try {

			Class.forName(DB_DRIVER);

		} catch (ClassNotFoundException e) {

			System.out.println(e.getMessage());

		}

		try {

			dbConnection = DriverManager.getConnection(DB_CONNECTION, DB_USER, DB_PASSWORD);
			return dbConnection;

		} catch (SQLException e) {

			System.out.println(e.getMessage());

		}

		return dbConnection;

	}

}

Original article: http://www.mkyong.com/spring-batch/spring-batch-partitioning-example/

Source code: http://pan.baidu.com/share/link?shareid=3720169482&uk=3878681452
Output:

[Figure 1: screenshot of the program output]

