Spring Batch: Basic Concepts and an Example

1. Spring Batch architecture diagram

The most important domain concepts:

  • Job: the task as a whole
  • Step: a step contained in the job
  • ItemReader: the input of a single step
  • ItemProcessor: the processing applied to the input
  • ItemWriter: the output of a single step

ItemReader, ItemProcessor, and ItemWriter are analogous to the functional interfaces introduced in Java 8:

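  • public interface Supplier<T> (produces a value, like ItemReader)
  • public interface Function<T, R> (transforms a value, like ItemProcessor)
  • public interface Consumer<T> (consumes a value, like ItemWriter)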
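
The analogy can be made concrete with a small sketch. It is a simplified model of a chunk-oriented step built only on the JDK interfaces, not Spring Batch code; the class name ChunkSketch and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

// Simplified model of a chunk-oriented step; not Spring Batch code.
class ChunkSketch {

    // Reads until the supplier returns null (Spring Batch readers also signal
    // end of input with null), transforms each item, and hands the results to
    // the writer one chunk at a time.
    static <I, O> void runStep(Supplier<I> reader,
                               Function<I, O> processor,
                               Consumer<List<O>> writer,
                               int chunkSize) {
        List<O> chunk = new ArrayList<>();
        for (I item = reader.get(); item != null; item = reader.get()) {
            chunk.add(processor.apply(item));
            if (chunk.size() == chunkSize) {
                writer.accept(chunk);
                chunk = new ArrayList<>();
            }
        }
        if (!chunk.isEmpty()) {
            writer.accept(chunk); // flush the final, partially filled chunk
        }
    }

    // Upper-cases every input string and returns the chunks that were written.
    static List<List<String>> upperCaseInChunks(List<String> input, int chunkSize) {
        Iterator<String> it = input.iterator();
        List<List<String>> written = new ArrayList<>();
        ChunkSketch.<String, String>runStep(
                () -> it.hasNext() ? it.next() : null,
                String::toUpperCase,
                written::add,
                chunkSize);
        return written;
    }

    public static void main(String[] args) {
        // prints [[A, B], [C]]
        System.out.println(upperCaseInChunks(Arrays.asList("a", "b", "c"), 2));
    }
}
```

The writer receives a whole list rather than a single item, which mirrors how an ItemWriter is handed one chunk per call.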

2. Job

Running a Job creates a JobInstance, much like the relationship between a class and its instances. One JobInstance can have many JobExecutions.

2.1 JobParameters

JobParameters are the input parameters a job runs with.

For example, when an endOfDay job is launched from the command line with the argument schedule.date(date)=2018/05/05, that argument becomes a JobParameter:

java CommandLineJobRunner io.spring.EndOfDayJobConfiguration endOfDay schedule.date(date)=2018/05/05

Explanation of the command above:

CommandLineJobRunner is a class provided by Spring Batch that has a main method. It accepts the following arguments:

  • io.spring.EndOfDayJobConfiguration is the Spring @Configuration class for the job; it defines the steps that make up the job
  • endOfDay is the job definition, a Spring @Bean
  • schedule.date(date)=2018/05/05 is a parameter
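
To show how an argument of the form name(type)=value could be split into a name and a typed value, here is a minimal sketch. It is not the converter Spring Batch uses internally; the class ParamSyntaxSketch is hypothetical, and the date pattern yyyy/MM/dd is an assumption taken from the example command.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Simplified model of parsing "name(type)=value"; not the Spring Batch converter.
class ParamSyntaxSketch {

    // group 1: name, group 2: optional type in parentheses, group 3: raw value
    private static final Pattern PARAM =
            Pattern.compile("([^(=]+)(?:\\((\\w+)\\))?=(.*)");

    private static Matcher match(String arg) {
        Matcher m = PARAM.matcher(arg);
        if (!m.matches()) {
            throw new IllegalArgumentException("Expected name(type)=value but got: " + arg);
        }
        return m;
    }

    // Returns the parameter name, without the type suffix.
    static String name(String arg) {
        return match(arg).group(1);
    }

    // Returns the value converted to the declared type (string when no type is given).
    // Assumption: dates are written as yyyy/MM/dd, as in the example command.
    static Object value(String arg) {
        Matcher m = match(arg);
        String type = m.group(2) == null ? "string" : m.group(2);
        String raw = m.group(3);
        switch (type) {
            case "date":
                return LocalDate.parse(raw, DateTimeFormatter.ofPattern("yyyy/MM/dd"));
            case "long":
                return Long.valueOf(raw);
            case "double":
                return Double.valueOf(raw);
            default:
                return raw;
        }
    }

    public static void main(String[] args) {
        String arg = "schedule.date(date)=2018/05/05";
        // prints schedule.date -> 2018-05-05
        System.out.println(name(arg) + " -> " + value(arg));
    }
}
```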

2.2 Relationship between JobInstance and JobExecution

No matter how many times the command below is run, it produces only one JobInstance:

java CommandLineJobRunner io.spring.EndOfDayJobConfiguration endOfDay schedule.date(date)=2018/05/05

In other words, JobInstance = Job + identifying JobParameters
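
The identity rule can be modelled in a few lines. This is a simplified in-memory sketch, not how the JobRepository is implemented; JobInstanceSketch and its methods are hypothetical, and execution status is deliberately ignored here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Simplified in-memory model; the real bookkeeping lives in the JobRepository.
class JobInstanceSketch {

    // instance key -> ids of the executions recorded for that instance
    private final Map<String, List<Integer>> executionsByInstance = new HashMap<>();
    private int nextExecutionId = 1;

    // JobInstance = Job + identifying JobParameters, so both go into the key.
    // The TreeMap makes the key independent of parameter order.
    static String instanceKey(String jobName, Map<String, String> identifyingParams) {
        return jobName + new TreeMap<String, String>(identifyingParams);
    }

    // Every launch records a new execution under the matching instance.
    int launch(String jobName, Map<String, String> identifyingParams) {
        int executionId = nextExecutionId++;
        executionsByInstance
                .computeIfAbsent(instanceKey(jobName, identifyingParams), k -> new ArrayList<>())
                .add(executionId);
        return executionId;
    }

    int instanceCount() {
        return executionsByInstance.size();
    }

    List<Integer> executionsOf(String jobName, Map<String, String> identifyingParams) {
        return executionsByInstance.getOrDefault(
                instanceKey(jobName, identifyingParams), Collections.<Integer>emptyList());
    }

    public static void main(String[] args) {
        JobInstanceSketch repo = new JobInstanceSketch();
        Map<String, String> may5 = Collections.singletonMap("schedule.date", "2018/05/05");
        repo.launch("endOfDay", may5);
        repo.launch("endOfDay", may5);
        repo.launch("endOfDay", may5);
        System.out.println(repo.instanceCount());                // prints 1
        System.out.println(repo.executionsOf("endOfDay", may5)); // prints [1, 2, 3]
    }
}
```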

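Each run of the same JobInstance produces a different JobExecution. A JobExecution records fields such as the start time, end time, and status; for the full list, see the fields defined on JobExecution.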

A JobInstance that has parameters can run to completion only once. Launching it again after it has completed raises:

JobInstanceAlreadyCompleteException: A job instance already exists and is complete for parameters={readCountPerTime=10}.  If you want to run this job again, change the parameters.

The reason is that SimpleJobRepository performs the following checks when it creates a JobExecution:

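  1. whether the jobInstance has parameters
  2. whether any of the jobInstance's previous jobExecutions has the status COMPLETED (the code also treats ABANDONED the same way)

If both conditions hold, it throws an exception saying the job has already completed; to run it again, the parameters must change. The relevant source:
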
    public class SimpleJobRepository implements JobRepository {
    public JobExecution createJobExecution(String jobName, JobParameters jobParameters)
    			throws JobExecutionAlreadyRunningException, JobRestartException, JobInstanceAlreadyCompleteException {
    
    		Assert.notNull(jobName, "Job name must not be null.");
    		Assert.notNull(jobParameters, "JobParameters must not be null.");
    
    		/*
    		 * Find all jobs matching the runtime information.
    		 *
    		 * If this method is transactional, and the isolation level is
    		 * REPEATABLE_READ or better, another launcher trying to start the same
    		 * job in another thread or process will block until this transaction
    		 * has finished.
    		 */
    
    		// Look up the jobInstance from the job name and jobParameters
    		JobInstance jobInstance = jobInstanceDao.getJobInstance(jobName, jobParameters);
    		ExecutionContext executionContext;
    
    		// existing job instance found
    		if (jobInstance != null) {
    
    			// Find the JobExecutions recorded for earlier runs of this jobInstance
    			List<JobExecution> executions = jobExecutionDao.findJobExecutions(jobInstance);
    
    			// Inspect every execution
    			for (JobExecution execution : executions) {
    				if (execution.isRunning() || execution.isStopping()) {
    					throw new JobExecutionAlreadyRunningException("A job execution for this job is already running: "
    							+ jobInstance);
    				}
    				BatchStatus status = execution.getStatus();
    				if (status == BatchStatus.UNKNOWN) {
    					throw new JobRestartException("Cannot restart job from UNKNOWN status. "
    							+ "The last execution ended with a failure that could not be rolled back, "
    							+ "so it may be dangerous to proceed. Manual intervention is probably necessary.");
    				}
    				// If any execution has completed, throw the already-complete exception right away
    				if (execution.getJobParameters().getParameters().size() > 0 && (status == BatchStatus.COMPLETED || status == BatchStatus.ABANDONED)) {
    					throw new JobInstanceAlreadyCompleteException(
    							"A job instance already exists and is complete for parameters=" + jobParameters
    							+ ".  If you want to run this job again, change the parameters.");
    				}
    			}
    			executionContext = ecDao.getExecutionContext(jobExecutionDao.getLastJobExecution(jobInstance));
    		}
    		else {
    			// no job found, create one
    			jobInstance = jobInstanceDao.createJobInstance(jobName, jobParameters);
    			executionContext = new ExecutionContext();
    		}
    
    		JobExecution jobExecution = new JobExecution(jobInstance, jobParameters, null);
    		jobExecution.setExecutionContext(executionContext);
    		jobExecution.setLastUpdated(new Date(System.currentTimeMillis()));
    
    		// Save the JobExecution so that it picks up an ID (useful for clients
    		// monitoring asynchronous executions):
    		jobExecutionDao.saveJobExecution(jobExecution);
    		ecDao.saveExecutionContext(jobExecution);
    
    		return jobExecution;
    
    	}
    }
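
The decision made inside that loop can be reduced to a small pure function. The following is a simplified model of the checks shown above, not a replacement for them; RestartRuleSketch, its Status enum, and refusalReason are hypothetical names, and STARTED stands in for the isRunning()/isStopping() test.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Simplified model of the checks in createJobExecution; not Spring Batch code.
class RestartRuleSketch {

    enum Status { STARTED, FAILED, COMPLETED, ABANDONED, UNKNOWN }

    // Returns the reason a new execution is refused, or empty if it may be created.
    // The checks run per previous execution, in the same order as the source above.
    static Optional<String> refusalReason(boolean hasParameters, List<Status> previous) {
        for (Status status : previous) {
            if (status == Status.STARTED) {
                return Optional.of("already running");
            }
            if (status == Status.UNKNOWN) {
                return Optional.of("cannot restart from UNKNOWN");
            }
            if (hasParameters && (status == Status.COMPLETED || status == Status.ABANDONED)) {
                return Optional.of("already complete");
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // A failed run may be restarted: prints Optional.empty
        System.out.println(refusalReason(true, Arrays.asList(Status.FAILED)));
        // Once a run has completed, the instance is closed: prints Optional[already complete]
        System.out.println(refusalReason(true, Arrays.asList(Status.FAILED, Status.COMPLETED)));
    }
}
```

This also explains why a FAILED execution and a later COMPLETED execution can belong to the same JobInstance.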

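This is inconvenient when a batch job takes a single parameter that never changes. For example,
java CommandLineJobRunner MyJobConfiguration path=/c/abc.txt takes a path. If the path stays the same, how can the job be run repeatedly?
Spring Batch provides the JobParametersIncrementer interface for this case, together with an implementation, RunIdIncrementer.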

public class RunIdIncrementer  implements JobParametersIncrementer{
    public RunIdIncrementer()    {
        key = RUN_ID_KEY;
    }
    public void setKey(String key)    {
        this.key = key;
    }
    // Adds a run.id parameter and increments it by 1 on every run of the batch
    public JobParameters getNext(JobParameters parameters)    {
        JobParameters params = parameters != null ? parameters : new JobParameters();
        long id = params.getLong(key, 0L).longValue() + 1L;
        return (new JobParametersBuilder(params)).addLong(key, Long.valueOf(id)).toJobParameters();
    }

    private static String RUN_ID_KEY = "run.id";
    private String key;

}

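Put simply, an incrementing id is added as a parameter on each run to get around the restriction on identical parameters.
How is it used?
Just add it where the job is defined: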

@Bean
public Job testJob(@Qualifier("testStep") Step step) {
    return jobBuilderFactory.get("testJob")
            .incrementer(new RunIdIncrementer())
            .repository(jobRepository)
            .start(step)
            .build();
}

Our batch jobs are usually launched from a shell script on the command line. The CommandLineJobRunner that Spring provides contains the following:

if(opts.contains("-next"))
            jobParameters = (new JobParametersBuilder(jobParameters, jobExplorer)).getNextJobParameters(job).toJobParameters();

In other words, add -next, so the command becomes java CommandLineJobRunner MyJobConfiguration path=/c/abc.txt -next
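
The effect of the incrementer on successive runs can be sketched with a plain map standing in for JobParameters. This mirrors the getNext logic shown earlier under that simplifying assumption; RunIdSketch is a hypothetical class, not part of Spring Batch.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of RunIdIncrementer.getNext, with a Map standing in for JobParameters.
class RunIdSketch {

    static final String RUN_ID_KEY = "run.id";

    // Copies the previous parameters and adds 1 to run.id (a missing run.id counts as 0).
    static Map<String, Object> next(Map<String, Object> previous) {
        Map<String, Object> params = new HashMap<String, Object>();
        if (previous != null) {
            params.putAll(previous);
        }
        long id = ((Number) params.getOrDefault(RUN_ID_KEY, 0L)).longValue() + 1L;
        params.put(RUN_ID_KEY, id);
        return params;
    }

    public static void main(String[] args) {
        Map<String, Object> given = new HashMap<String, Object>();
        given.put("path", "/c/abc.txt");

        Map<String, Object> firstRun = next(given);
        Map<String, Object> secondRun = next(firstRun);

        System.out.println(firstRun.get(RUN_ID_KEY));  // prints 1
        System.out.println(secondRun.get(RUN_ID_KEY)); // prints 2
        System.out.println(secondRun.get("path"));     // prints /c/abc.txt
    }
}
```

Because run.id differs on every launch, each launch maps to a new JobInstance even though path stays the same.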

2.3 Seeing these relationships in Spring Batch's built-in tables

BATCH_JOB_INSTANCE

JOB_INST_ID  JOB_NAME
1            EndOfDayJob
2            EndOfDayJob

BATCH_JOB_EXECUTION_PARAMS

JOB_EXECUTION_ID  TYPE_CD  KEY_NAME       DATE_VAL             IDENTIFYING
1                 DATE     schedule.Date  2017-01-01 00:00:00  true
2                 DATE     schedule.Date  2017-01-01 00:00:00  true
3                 DATE     schedule.Date  2017-01-02 00:00:00  true

BATCH_JOB_EXECUTION

JOB_EXECUTION_ID  JOB_INST_ID  START_TIME        END_TIME          STATUS
1                 1            2017-01-01 21:00  2017-01-01 21:30  FAILED
2                 1            2017-01-02 21:00  2017-01-02 21:30  COMPLETED
3                 2            2017-01-02 21:31  2017-01-02 22:29  COMPLETED

The relationships among Spring Batch's built-in tables are shown in the following diagram:

3. Step

Each time a step runs, a StepExecution is created. Unlike a job, a step has no StepInstance. A StepExecution is backed by these tables:

  • BATCH_STEP_EXECUTION records the start time, end time, status, and similar fields
  • BATCH_STEP_EXECUTION_CONTEXT stores values saved through the execution context, for example
    executionContext.putLong(getKey(LINES_READ_COUNT), reader.getPosition());
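
A rough sketch of why a reader would save its position this way: on a restart, the saved value lets it resume where the previous execution stopped. A plain Map stands in for the ExecutionContext here, and the class RestartableReaderSketch and the key name are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified model of a restartable reader; a Map stands in for ExecutionContext.
class RestartableReaderSketch {

    static final String LINES_READ_COUNT = "lines.read.count"; // hypothetical key name

    private final List<String> lines;
    private int position;

    RestartableReaderSketch(List<String> lines) {
        this.lines = lines;
    }

    // Called before reading: resume from the saved position, or start at 0.
    void open(Map<String, Long> executionContext) {
        position = executionContext.getOrDefault(LINES_READ_COUNT, 0L).intValue();
    }

    // Returns the next line, or null when the input is exhausted.
    String read() {
        return position < lines.size() ? lines.get(position++) : null;
    }

    // Called when a chunk is committed: persist how far reading has progressed.
    void update(Map<String, Long> executionContext) {
        executionContext.put(LINES_READ_COUNT, (long) position);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a", "b", "c");
        Map<String, Long> context = new HashMap<>();

        RestartableReaderSketch first = new RestartableReaderSketch(lines);
        first.open(context);
        first.read();
        first.read();
        first.update(context); // position 2 is saved, then the run is assumed to fail

        RestartableReaderSketch second = new RestartableReaderSketch(lines);
        second.open(context);              // the restart resumes from the saved position
        System.out.println(second.read()); // prints c
    }
}
```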

4. Sample

Read data from a file, wrap each line in a Person object, and print it.

Person.txt

1,Rechard,20
2,James,30
3,Cury,28
4,Durant,26

Person.java

public class Person {
    private int id;
    private String name;
    private int age;
    // getters, setters, and toString() omitted
}
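
For completeness, here is one possible full version of the class. The toString() format is inferred from the printed output at the end of this section, and the fromLine helper is a hypothetical addition that mirrors the line-mapping logic used by the reader.

```java
// A possible complete Person; toString() is inferred from the sample output.
class Person {
    private int id;
    private String name;
    private int age;

    public int getId() { return id; }
    public void setId(int id) { this.id = id; }

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }

    public int getAge() { return age; }
    public void setAge(int age) { this.age = age; }

    @Override
    public String toString() {
        return "Person{id=" + id + ", name='" + name + "', age=" + age + "}";
    }

    // Hypothetical helper: parses one "id,name,age" line of Person.txt.
    static Person fromLine(String line) {
        String[] fields = line.split(",");
        Person p = new Person();
        p.setId(Integer.parseInt(fields[0].trim()));
        p.setName(fields[1].trim());
        p.setAge(Integer.parseInt(fields[2].trim()));
        return p;
    }

    public static void main(String[] args) {
        // prints Person{id=1, name='Rechard', age=20}
        System.out.println(fromLine("1,Rechard,20"));
    }
}
```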

The batch @Configuration class

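The job is configured with a single step:

  • the step's reader reads Person.txt, where each line holds one person's data, and wraps each line in a Person object
  • the step's writer prints each Person by delegating to PersonProcessor
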
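import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.adapter.ItemWriterAdapter;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.ComponentScan;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.FileSystemResource;

@Configuration
@ComponentScan("rechard.learn.springbatch.sample.simple")
@EnableBatchProcessing
public class SimpleBatchConfiguration {

    @Autowired
    private JobBuilderFactory jobBuilders;

    @Autowired
    private StepBuilderFactory steps;

    // One job that contains a single step
    @Bean
    public Job simpleJob(Step step) {
        return jobBuilders.get("simpleJob").start(step).build();
    }

    // The step wires the reader to the writer and commits every 10 items
    @Bean
    protected Step step(ItemReader<Person> reader, ItemWriter<Person> writer) {
        return steps.get("step1")
                .<Person, Person>chunk(10)
                .reader(reader)
                .writer(writer)
                .build();
    }

    // The reader is Spring Batch's built-in FlatFileItemReader; it reads one
    // line at a time and maps it to a Person object
    @Bean
    protected ItemReader<Person> reader() {
        FlatFileItemReader<Person> reader = new FlatFileItemReader<>();
        // A FileSystemResource avoids opening the stream by hand and passing
        // a null stream along when the file is missing
        reader.setResource(new FileSystemResource("E:\\person.txt"));
        reader.setLineMapper((line, number) -> {
            String[] str = line.split(",");
            Person p = new Person();
            p.setId(Integer.parseInt(str[0]));
            p.setName(str[1]);
            p.setAge(Integer.parseInt(str[2]));
            return p;
        });
        return reader;
    }

    // The writer simply prints each item. ItemWriterAdapter delegates every
    // item to a target object, here PersonProcessor, which holds the real logic
    @Bean
    protected ItemWriter<Person> writer() {
        ItemWriterAdapter<Person> adapter = new ItemWriterAdapter<>();
        adapter.setTargetMethod("print");
        adapter.setTargetObject(new PersonProcessor());
        return adapter;
    }

}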

PersonProcessor

public class PersonProcessor {
    public void print(Person p){
        System.out.println(p.toString());
    }
}

The main launcher class

public class SimpleDemo {
    public static void main(String[] args) {
        AnnotationConfigApplicationContext ctx =  new AnnotationConfigApplicationContext();
        ctx.register(SimpleBatchConfiguration.class);
        ctx.refresh();
        JobLauncher launcher = (JobLauncher)ctx.getBean("jobLauncher");
        JobParameters parameters = new JobParameters();
        try {
            launcher.run((Job)ctx.getBean("simpleJob"),parameters);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Printed output:

Person{id=1, name='Rechard', age=20}
Person{id=2, name='James', age=30}
Person{id=3, name='Cury', age=28}
Person{id=4, name='Durant', age=26}

5. Pitfall

Suppose an ItemReader is defined as follows:

@Bean
@StepScope
public ItemReader<String> itemReader(@Value("#{jobParameters[a]}") String affiliate) {
    String sql = "select * from PENDINGUSER WHERE AFFILIATEID=?";
    return new JdbcCursorItemReaderBuilder<String>()
            .name("dataSourceReader")
            .dataSource(dataSource)
            .sql(sql)
            .preparedStatementSetter(ps -> ps.setString(1, affiliate.toUpperCase()))
            .rowMapper((rs, index) -> rs.getInt("SSMREQUESTKY") + "")
            .build();
}

It keeps failing with the following error:

org.springframework.batch.item.ReaderNotOpenException: Reader must be open before it can be read.

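Fix: change the return type from the interface to the concrete class. Because the bean is step-scoped, Spring wraps it in a proxy built from the declared return type. A proxy typed only as ItemReader most likely does not expose ItemStream, so the step never calls open() on the reader.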

public JdbcCursorItemReader<String> itemReader(@Value("#{jobParameters[a]}") String affiliate) {
    // same body as above
}
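
The effect of the declared type can be reproduced with a plain JDK dynamic proxy. This is a simplified model of the likely cause, not Spring's actual scoping code: Reader and Stream stand in for ItemReader and ItemStream, CursorReader for JdbcCursorItemReader, and all names are hypothetical.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

// Simplified model: a proxy only exposes the interfaces it was created with.
class ProxyTypeSketch {

    interface Reader { String read(); } // stands in for ItemReader
    interface Stream { void open(); }   // stands in for ItemStream

    // Stands in for JdbcCursorItemReader: it must be opened before reading.
    static class CursorReader implements Reader, Stream {
        private boolean opened;

        public void open() { opened = true; }

        public String read() {
            if (!opened) {
                throw new IllegalStateException("Reader must be open before it can be read.");
            }
            return "row";
        }
    }

    // Builds a proxy that exposes only the given interfaces and forwards calls to target.
    static Object proxyFor(Object target, Class<?>... exposed) {
        return Proxy.newProxyInstance(ProxyTypeSketch.class.getClassLoader(), exposed,
                (proxy, method, args) -> {
                    try {
                        return method.invoke(target, args);
                    } catch (InvocationTargetException e) {
                        throw e.getCause(); // rethrow the target's own exception
                    }
                });
    }

    // In this model the step opens the reader only if it is visibly a Stream.
    static String runStep(Object reader) {
        if (reader instanceof Stream) {
            ((Stream) reader).open();
        }
        try {
            return ((Reader) reader).read();
        } catch (IllegalStateException e) {
            return "error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Declared as the interface only: open() is never called.
        // prints error: Reader must be open before it can be read.
        System.out.println(runStep(proxyFor(new CursorReader(), Reader.class)));
        // Declared with both types visible: the reader is opened first.
        // prints row
        System.out.println(runStep(proxyFor(new CursorReader(), Reader.class, Stream.class)));
    }
}
```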
