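This post does not go into the underlying principles or explain every line of code; it only walks through some usage scenarios. The configuration carries over from the previous post, which covered Spring Batch's read and write operations. This post focuses on Spring Batch's ItemProcessor.
ItemProcessor is where business logic, validation, filtering, and similar per-item work goes.
Define a custom processor by implementing the ItemProcessor interface; this one adds 1 to the Customer's age.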
@Component("firstNameProcessor")
public class FirstNameProcessor implements ItemProcessor<Customer,Customer> {
@Override
public Customer process(Customer item) throws Exception {
item.setAge(item.getAge()+1);
return item;
}
}
Configure the processor in the Step:
@Bean
public Step step1() {
return stepBuilderFactory.get("itemReaderDemoStep3")
.<Customer,Customer>chunk(2)
.reader(dbReader())
.processor(firstNameProcessor)
.writer(itemWriter())
.build();
}
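With FirstNameProcessor already in place, define a second processor. It passes through customers with an even cid and returns null for the rest; returning null from process() filters the item out, so it never reaches the writer: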
IdFilterProcessor.java
@Component("idFilterProcessor")
public class IdFilterProcessor implements ItemProcessor<Customer,Customer> {
@Override
public Customer process(Customer item) throws Exception {
if(item.getCid()%2==0){
return item;
}
return null;
}
}
Assemble the two processors into a CompositeItemProcessor, which runs its delegates in order:
@Bean
public CompositeItemProcessor<Customer,Customer> itemProcessor(){
CompositeItemProcessor<Customer,Customer> itemProcessor = new CompositeItemProcessor<>();
List<ItemProcessor<Customer,Customer>> delegates = new ArrayList<>();
delegates.add(idFilterProcessor);
delegates.add(firstNameProcessor);
itemProcessor.setDelegates(delegates);
return itemProcessor;
}
Configure the CompositeItemProcessor in the Step:
@Bean
public Step step1() {
return stepBuilderFactory.get("itemReaderDemoStep3")
.<Customer,Customer>chunk(2)
.reader(dbReader())
.processor(itemProcessor())
.writer(itemWriter())
.build();
}
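The chaining and filtering behaviour described above can be sketched in plain Java, without Spring on the classpath. This is only a simplified model, not the CompositeItemProcessor implementation: the `CompositeChainSketch` class, its trimmed-down `Customer` (just cid and age), and the use of `Function` as a stand-in for ItemProcessor are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

/** Plain-Java model of a processor chain; not the Spring Batch implementation. */
final class CompositeChainSketch {

    /** Hypothetical stand-in for the Customer entity: only cid and age. */
    static final class Customer {
        final int cid;
        int age;
        Customer(int cid, int age) { this.cid = cid; this.age = age; }
    }

    /** Runs each delegate in order; a null result filters the item and stops the chain. */
    static Customer process(Customer item, List<Function<Customer, Customer>> delegates) {
        Customer current = item;
        for (Function<Customer, Customer> delegate : delegates) {
            current = delegate.apply(current);
            if (current == null) {
                return null; // filtered: later delegates never see this item
            }
        }
        return current;
    }

    /** Applies the chain to every item and keeps only the survivors, as a writer would receive them. */
    static List<Customer> run(List<Customer> items) {
        List<Function<Customer, Customer>> delegates = new ArrayList<>();
        delegates.add(c -> c.cid % 2 == 0 ? c : null);        // same rule as IdFilterProcessor
        delegates.add(c -> { c.age = c.age + 1; return c; }); // same rule as FirstNameProcessor
        List<Customer> written = new ArrayList<>();
        for (Customer item : items) {
            Customer result = process(item, delegates);
            if (result != null) {
                written.add(result);
            }
        }
        return written;
    }

    public static void main(String[] args) {
        List<Customer> input = Arrays.asList(
                new Customer(1, 20), new Customer(2, 30), new Customer(3, 40), new Customer(4, 50));
        for (Customer c : run(input)) {
            System.out.println("cid=" + c.cid + " age=" + c.age);
        }
    }
}
```

Because the filter runs first, customers with an odd cid are dropped before the age increment is ever applied; swapping the delegate order would increment every customer and only then filter.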
This section is reposted from the blog of 陈晨辰~.
@Component("restartDemoReader")
public class RestartDemoReader implements ItemStreamReader<Customer> {
// Number of lines read so far
private Long curLine = 0L;
// Restart flag; false on a fresh run
private boolean restart = false;
private FlatFileItemReader<Customer> reader = new FlatFileItemReader<>();
// Execution context whose contents Spring Batch persists to the database
private ExecutionContext executionContext;
public RestartDemoReader() {
reader.setResource(new ClassPathResource("Customer.txt"));
// Skip the header line
reader.setLinesToSkip(1);
// Tokenize each line into named fields
DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer();
tokenizer.setNames(new String[]{"cid","name","age","Birthday"});
// Map each line to a Customer object
DefaultLineMapper<Customer> mapper = new DefaultLineMapper<>();
mapper.setLineTokenizer(tokenizer);
mapper.setFieldSetMapper(new FieldSetMapper<Customer>() {
@Override
public Customer mapFieldSet(FieldSet fieldSet) throws BindException {
Customer customer = new Customer();
customer.setCid(fieldSet.readInt("cid"));
customer.setName(fieldSet.readString("name"));
customer.setAge(fieldSet.readInt("age"));
customer.setBirthday(fieldSet.readDate("Birthday"));
return customer;
}
});
mapper.afterPropertiesSet();
reader.setLineMapper(mapper);
}
@Override
public Customer read() throws Exception, UnexpectedInputException, ParseException,
NonTransientResourceException {
Customer customer = null;
this.curLine++;
// On a restart, continue from the line reached by the previous run
if (restart) {
reader.setLinesToSkip(this.curLine.intValue()-1);
restart = false;
System.out.println("Start reading from line: " + this.curLine);
}
reader.open(this.executionContext);
customer = reader.read();
// When the name matches the bad value, throw explicitly to abort the job
if (customer != null) {
if (customer.getName().equals("C"))
throw new RuntimeException("Something wrong. Customer id: " + customer.getCid());
} else {
curLine--;
}
return customer;
}
/**
* Detects whether this is a restarted job.
* @param executionContext the step's execution context, restored from the database on restart
* @throws ItemStreamException
*/
@Override
public void open(ExecutionContext executionContext) throws ItemStreamException {
this.executionContext = executionContext;
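// On a restart, load the saved line number from the database and resume from there
if (executionContext.containsKey("curLine")) {
this.curLine = executionContext.getLong("curLine");
this.restart = true;
}
// On a fresh run, reset the line counter and start from the first line
else {
this.curLine = 0L;
executionContext.putLong("curLine", this.curLine); // store as a long so getLong("curLine") can read it back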
}
}
@Override
public void update(ExecutionContext executionContext) throws ItemStreamException {
// After each chunk, print the current line number and save it to the context
System.out.println("update curLine: " + this.curLine);
executionContext.put("curLine", this.curLine);
}
@Override
public void close() throws ItemStreamException {
}
}
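Result:
On the first run the program throws at line 3, and curLine is 2. Querying the BATCH_STEP_EXECUTION_CONTEXT table at this point shows that curLine has been persisted to the database as a key-value pair.
After fixing the data on the failing line, run the program again. It calls open(), checks whether the step's persisted context contains curLine, and if so picks up where the previous run left off.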
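The checkpoint-and-resume idea behind this result can be modelled in a few lines of plain Java. This is a sketch under stated assumptions, not how Spring Batch is implemented: the `RestartSketch` class is hypothetical, a `Map` stands in for the persisted ExecutionContext, a list of names stands in for Customer.txt, and the `sink` list stands in for the writer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of chunk-level checkpointing; a Map stands in for the ExecutionContext. */
final class RestartSketch {

    /**
     * Reads names chunk by chunk, appending each completed chunk to sink and saving
     * the number of lines consumed under "curLine" once the chunk has succeeded.
     */
    static void run(List<String> names, Map<String, Long> context, int chunkSize, List<String> sink) {
        long curLine = context.getOrDefault("curLine", 0L); // resume point; 0 on a fresh run
        while (curLine < names.size()) {
            List<String> chunk = new ArrayList<>();
            long line = curLine;
            while (chunk.size() < chunkSize && line < names.size()) {
                String name = names.get((int) line);
                if (name.equals("C")) {
                    // Fail before the checkpoint moves, so this chunk is read again on restart.
                    throw new IllegalStateException("Something wrong at line " + (line + 1));
                }
                chunk.add(name);
                line++;
            }
            sink.addAll(chunk);              // "write" the chunk
            curLine = line;
            context.put("curLine", curLine); // checkpoint only after the chunk succeeded
        }
    }

    public static void main(String[] args) {
        Map<String, Long> context = new HashMap<>();
        List<String> sink = new ArrayList<>();
        try {
            run(Arrays.asList("A", "B", "C", "D"), context, 2, sink);
        } catch (IllegalStateException e) {
            System.out.println("first run failed: " + e.getMessage() + ", curLine=" + context.get("curLine"));
        }
        // Fix the bad record, then run again with the same context.
        run(Arrays.asList("A", "B", "C2", "D"), context, 2, sink);
        System.out.println("after restart: " + sink + ", curLine=" + context.get("curLine"));
    }
}
```

The key design point is that the checkpoint only advances after a whole chunk has been written, which is why the first run above stops with curLine at 2 even though the failure is on line 3.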
@Bean
@StepScope
public Tasklet errorHandling(){
return new Tasklet(){
@Override
public RepeatStatus execute(StepContribution stepContribution, ChunkContext chunkContext) throws Exception {
Map<String, Object> stepExecutionContext = chunkContext.getStepContext().getStepExecutionContext();
if(stepExecutionContext.containsKey("qianfeng")){
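System.out.println("Condition met, continuing");
return RepeatStatus.FINISHED;
}else{
System.out.println("Condition not met, failing this execution");
// The map returned by getStepExecutionContext() is unmodifiable, so write through the StepExecution's ExecutionContext instead
chunkContext.getStepContext().getStepExecution().getExecutionContext().put("qianfeng",true);
throw new RuntimeException("Something went wrong");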
}
}
};
}
public class CustomerException extends Exception {
public CustomerException(String msg){
super(msg);
}
}
Then throw this exception from the custom FirstNameProcessor class:
@Component("firstNameProcessor")
public class FirstNameProcessor implements ItemProcessor<Customer,Customer> {
int count = 0 ;
@Override
public Customer process(Customer item) throws Exception {
item.setAge(item.getAge()+1);
count++;
if(count%2==0){
throw new CustomerException("No particular reason, just throwing an exception");
}
return item;
}
}
Configure how the Step handles this exception:
@Bean
public Step step1() throws CustomerException {
return stepBuilderFactory.get("itemReaderDemoStep4")
//.tasklet(errorHandling())
.<Customer,Customer>chunk(2)
.reader(dbReader())
.processor(firstNameProcessor)
.writer(itemWriter())
.faultTolerant()
.retry(CustomerException.class)
.retryPolicy(new AlwaysRetryPolicy())
.retryLimit(10)
.build();
}
Running this shows that when the exception occurs, Spring Batch retries the failed processing, up to the retryLimit configured on the step.
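As a rough illustration of what a retry limit means, here is a plain-Java sketch. It is not Spring Batch's retry machinery: the `RetrySketch` class and `withRetry` helper are hypothetical, and the sketch assumes the limit counts total attempts, including the first one.

```java
import java.util.concurrent.Callable;

/** Plain-Java model of "retry on CustomerException, up to a limit". */
final class RetrySketch {

    /** Local copy of the exception type used in the post. */
    static final class CustomerException extends Exception {
        CustomerException(String msg) { super(msg); }
    }

    /**
     * Calls the action until it succeeds or retryLimit attempts have failed with
     * CustomerException. Any other exception is not retried and propagates at once.
     */
    static <T> T withRetry(Callable<T> action, int retryLimit) throws Exception {
        if (retryLimit < 1) {
            throw new IllegalArgumentException("retryLimit must be at least 1");
        }
        CustomerException last = null;
        for (int attempt = 1; attempt <= retryLimit; attempt++) {
            try {
                return action.call();
            } catch (CustomerException e) {
                last = e; // retryable: loop and try again
            }
        }
        throw last; // attempts exhausted: surface the last failure
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Fails on every even call, like the FirstNameProcessor in the post.
        Callable<Integer> flaky = () -> {
            calls[0]++;
            if (calls[0] % 2 == 0) {
                throw new CustomerException("even call " + calls[0]);
            }
            return calls[0];
        };
        System.out.println(withRetry(flaky, 10)); // succeeds on the first attempt
        System.out.println(withRetry(flaky, 10)); // fails once, then succeeds on the retry
    }
}
```

With a processor that fails on every second call, a single retry is always enough; an action that fails every time would be attempted retryLimit times and then the exception would propagate.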
@Bean
public Step step1() throws CustomerException {
return stepBuilderFactory.get("itemReaderDemoStep4")
//.tasklet(errorHandling())
.<Customer,Customer>chunk(2)
.reader(dbReader())
.processor(firstNameProcessor)
.writer(itemWriter())
.faultTolerant()
.skip(CustomerException.class)
.skipLimit(1)
.build();
}
skip skips the items that raised the exception, up to skipLimit items.
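The effect of a skip limit can be sketched in plain Java as follows. This is a simplified model rather than Spring Batch's skip policy: `SkipSketch`, its `Processor` interface, and the `SkipLimitExceeded` exception are hypothetical names introduced for the example, and integers stand in for Customer items.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java model of "skip failing items, but at most skipLimit of them". */
final class SkipSketch {

    static final class CustomerException extends Exception {
        CustomerException(String msg) { super(msg); }
    }

    /** Hypothetical failure raised once more items fail than the limit allows. */
    static final class SkipLimitExceeded extends RuntimeException {
        SkipLimitExceeded(String msg, Throwable cause) { super(msg, cause); }
    }

    interface Processor {
        int process(int item) throws CustomerException;
    }

    /** Processes every item; failed items are recorded in skipped until the limit is hit. */
    static List<Integer> run(List<Integer> items, Processor processor, int skipLimit, List<Integer> skipped) {
        List<Integer> out = new ArrayList<>();
        for (int item : items) {
            try {
                out.add(processor.process(item));
            } catch (CustomerException e) {
                if (skipped.size() >= skipLimit) {
                    throw new SkipLimitExceeded("skip limit of " + skipLimit + " exceeded", e);
                }
                skipped.add(item); // within the limit: drop the item and carry on
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Rejects even items, otherwise adds 1.
        Processor processor = item -> {
            if (item % 2 == 0) {
                throw new CustomerException("rejecting " + item);
            }
            return item + 1;
        };
        List<Integer> skipped = new ArrayList<>();
        System.out.println(run(Arrays.asList(1, 2, 3, 4), processor, 10, skipped) + " skipped=" + skipped);
    }
}
```

With a limit of 10 both failing items are skipped and the run completes; with a limit of 1, as in the step configuration above, the second failure would end the run.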
To record information about the skipped items, configure a listener as follows.
Create a skip listener that implements the SkipListener interface:
MySkipListener.java
@Component("mySkipListener")
public class MySkipListener implements SkipListener<Customer,Customer> {
@Override
public void onSkipInRead(Throwable t) {
System.out.println("skipped when read");
}
@Override
public void onSkipInWrite(Customer item, Throwable t) {
System.out.println("when write skipped " + item);
}
@Override
public void onSkipInProcess(Customer item, Throwable t) {
System.out.println("when process skipped " + item);
}
}
Register the listener:
@Autowired
@Qualifier("mySkipListener")
private SkipListener<Customer,Customer> mySkipListener;
@Bean
public Step step1() throws CustomerException {
return stepBuilderFactory.get("itemReaderDemoStep4")
//.tasklet(errorHandling())
.<Customer,Customer>chunk(2)
.reader(dbReader())
.processor(firstNameProcessor)
.writer(itemWriter())
.faultTolerant()
.skip(CustomerException.class)
.skipLimit(10)
.listener(mySkipListener)
.build();
}
The result of running this is explained by how skip works: it is built on retry under the hood. When an item in a chunk throws, that item is skipped and the current chunk is retried.
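The listener does not run at the moment the exception is thrown. Its callbacks run afterwards (Spring Batch defers skip callbacks until just before the chunk's transaction commits), and all of this happens on a single thread.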
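To make the "skip the item, then retry the chunk" behaviour more concrete, here is a deliberately simplified plain-Java model. It is an assumption-laden sketch based on the description above, not Spring Batch's actual algorithm: `ChunkSkipSketch` and its `Processor` interface are hypothetical, transactions are reduced to discarding a list, and the log entries only imitate listener callbacks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Simplified model: a failing item is marked as skipped and the whole chunk is processed again. */
final class ChunkSkipSketch {

    interface Processor {
        String process(String item) throws Exception;
    }

    /** Processes one chunk and returns the items to write; log records every step in order. */
    static List<String> processChunk(List<String> chunk, Processor processor, List<String> log) {
        Set<String> skipped = new LinkedHashSet<>();
        while (true) {
            List<String> outputs = new ArrayList<>();
            String failed = null;
            for (String item : chunk) {
                if (skipped.contains(item)) {
                    continue; // marked on an earlier pass: do not process it again
                }
                try {
                    log.add("process " + item);
                    outputs.add(processor.process(item));
                } catch (Exception e) {
                    failed = item;
                    break; // "roll back": the outputs of this pass are discarded
                }
            }
            if (failed == null) {
                // Skip callbacks are deferred until the chunk has gone through cleanly.
                for (String item : skipped) {
                    log.add("onSkipInProcess " + item);
                }
                return outputs;
            }
            skipped.add(failed); // each pass marks one more item, so the loop terminates
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        Processor processor = item -> {
            if (item.equals("b")) {
                throw new IllegalStateException("bad item");
            }
            return item.toUpperCase();
        };
        System.out.println(processChunk(Arrays.asList("a", "b", "c"), processor, log));
        log.forEach(System.out::println);
    }
}
```

In this model the item "a" is processed twice, once per pass, which is one way to picture why a processor with side effects (such as the call counter in FirstNameProcessor) can behave unexpectedly when skip is enabled.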
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
server.port= 9386
spring.datasource.driver-class-name=com.mysql.jdbc.Driver
spring.datasource.url=jdbc:mysql:///springbatch?useUnicode=true&characterEncoding=utf8&useSSL=false
spring.datasource.username=root
spring.datasource.password=123456
# Spring Batch persists job execution records to the database; the schema script below provides the table definitions
spring.datasource.schema=classpath:/org/springframework/batch/core/schema-mysql.sql
spring.batch.initialize-schema=always
spring.batch.job.enabled=false
#spring.batch.job.names=parentJob
@Configuration
public class JobLauncherDemo implements StepExecutionListener {
// Holds the job parameters
private Map<String, JobParameter> parameters;
@Autowired
private StepBuilderFactory stepBuilderFactory;
@Autowired
private JobBuilderFactory jobBuilderFactory;
@Override
public void beforeStep(StepExecution stepExecution) {
parameters = stepExecution.getJobParameters().getParameters();
}
@Override
public ExitStatus afterStep(StepExecution stepExecution) {
return null;
}
// Use the parameters
@Bean
public Step jobLauncherDemoStep(){
return stepBuilderFactory.get("jobLauncherDemoStep")
.tasklet(new Tasklet() {
@Override
public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
System.out.println(chunkContext.getStepContext().getJobParameters());
return RepeatStatus.FINISHED;
}
}).build();
}
@Bean
public Job jobLauncherDemoJob(){
return jobBuilderFactory.get("jobLauncherDemoJob2").start(jobLauncherDemoStep()).build();
}
}
@RestController
public class JobLauncherController {
@Autowired
private JobLauncher jobLauncher;
@Autowired
private Job jobLauncherDemoJob;
@RequestMapping("/job/{msg}")
public String runjob(@PathVariable String msg) throws JobParametersInvalidException, JobExecutionAlreadyRunningException, JobRestartException, JobInstanceAlreadyCompleteException {
JobParameters parameters = new JobParametersBuilder()
.addString("msg",msg)
.toJobParameters();
JobExecution run = jobLauncher.run(jobLauncherDemoJob, parameters);
return "success : "+run.toString();
}
}
The most common use of JobOperator is probably stopping a Job, for example:
Set<Long> executions = jobOperator.getRunningExecutions("sampleJob");
jobOperator.stop(executions.iterator().next());
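The shutdown is not immediate, because there is no way to force a job to stop at once, especially while execution is inside the developer's own code, such as a piece of business logic, where the framework has no control. As soon as control returns to the framework, it sets the status of the current StepExecution to BatchStatus.STOPPED, saves it, and then does the same for the JobExecution before finishing.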
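The cooperative nature of stopping can be sketched in plain Java. This is only a model of the idea, not the Spring Batch mechanism: `StopSketch`, its `Status` enum, and the stop flag are hypothetical, and each `Runnable` stands in for a stretch of developer code that the framework cannot interrupt.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

/** Plain-Java model of a cooperative stop: the flag is only honoured between work units. */
final class StopSketch {

    enum Status { COMPLETED, STOPPED }

    /** Runs the units in order, checking the stop flag each time control comes back. */
    static Status run(List<Runnable> units, AtomicBoolean stopRequested) {
        for (Runnable unit : units) {
            if (stopRequested.get()) {
                return Status.STOPPED; // control is back with the "framework": honour the request
            }
            unit.run(); // developer code: runs to completion even if a stop arrives meanwhile
        }
        return Status.COMPLETED; // every unit ran
    }

    public static void main(String[] args) {
        AtomicBoolean stop = new AtomicBoolean(false);
        List<String> done = new ArrayList<>();
        List<Runnable> units = Arrays.asList(
                () -> { done.add("unit1"); stop.set(true); }, // a stop request arrives during unit1
                () -> done.add("unit2"));
        System.out.println(run(units, stop) + " after " + done);
    }
}
```

The first unit finishes its work even though the stop request arrives while it is running; the request only takes effect at the next check, so the second unit never starts.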
JobLauncherController.java
@RestController
public class JobLauncherController {
@Autowired
private JobOperator jobOperator;
@RequestMapping("/job/{msg}")
public String runjob(@PathVariable String msg) throws JobParametersInvalidException, JobInstanceAlreadyExistsException, NoSuchJobException {
jobOperator.start("jobLauncherDemoJob4","msg="+msg);
return "success";
}
}
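The controller above passes its parameters to JobOperator as a single string ("msg=" + msg). As a rough picture of what turning such a string into parameters involves, here is a plain-Java sketch. It assumes comma-separated key=value pairs and is not the framework's converter; the `ParameterStringSketch` class and its `parse` method are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Plain-Java model of parsing a "key=value,key2=value2" parameter string. */
final class ParameterStringSketch {

    /** Splits on commas, then on the first '=' of each pair; keys and values are trimmed. */
    static Map<String, String> parse(String parameters) {
        Map<String, String> result = new LinkedHashMap<>();
        if (parameters == null || parameters.trim().isEmpty()) {
            return result; // no parameters at all
        }
        for (String pair : parameters.split(",")) {
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected key=value but got: " + pair);
            }
            result.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("msg=hello"));
        System.out.println(parse("msg=hello, run=2"));
    }
}
```

One consequence of a string format like this is that a path variable containing a comma would be split into several pairs in this model, so it may be worth validating msg before concatenating it into the parameter string.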
JobLauncherDemo.java
@Configuration
public class JobLauncherDemo implements StepExecutionListener, ApplicationContextAware {
// Holds the job parameters
private Map<String, JobParameter> parameters;
@Autowired
private StepBuilderFactory stepBuilderFactory;
@Autowired
private JobBuilderFactory jobBuilderFactory;
@Override
public void beforeStep(StepExecution stepExecution) {
parameters = stepExecution.getJobParameters().getParameters();
}
@Override
public ExitStatus afterStep(StepExecution stepExecution) {
return null;
}
// Use the parameters
@Bean
public Step jobLauncherDemoStep(){
return stepBuilderFactory.get("jobLauncherDemoStep")
.tasklet(new Tasklet() {
@Override
public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
System.out.println(chunkContext.getStepContext().getJobParameters());
return RepeatStatus.FINISHED;
}
}).build();
}
@Bean
public Job jobLauncherDemoJob(){
return jobBuilderFactory.get("jobLauncherDemoJob4").start(jobLauncherDemoStep()).build();
}
@Autowired
private JobLauncher jobLauncher;
@Autowired
private JobRepository jobRepository;
@Autowired
private JobExplorer jobExplorer;
@Autowired
private JobRegistry jobRegistry;
private ApplicationContext context;
@Bean
public JobOperator jobOperator(){
SimpleJobOperator jobOperator = new SimpleJobOperator();
jobOperator.setJobLauncher(jobLauncher);
jobOperator.setJobParametersConverter(new DefaultJobParametersConverter());
jobOperator.setJobRepository(jobRepository);
jobOperator.setJobExplorer(jobExplorer);
jobOperator.setJobRegistry(jobRegistry);
return jobOperator;
}
@Bean
public JobRegistryBeanPostProcessor jobRegistryBeanPostProcessor(){
JobRegistryBeanPostProcessor processor = new JobRegistryBeanPostProcessor();
processor.setJobRegistry(jobRegistry);
processor.setBeanFactory(context.getAutowireCapableBeanFactory());
return processor;
}
@Override
public void setApplicationContext(ApplicationContext applicationContext) throws BeansException {
this.context = applicationContext;
}
}
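JobRepository and JobExplorer are already in the container; the JobRegistry wiring (the JobRegistryBeanPostProcessor above) has to be set up yourself.
JobOperator ultimately operates on Jobs that have already been created and registered, so the job name used when building the Job must be unique. If the job name passed to JobOperator has not been registered yet, the call fails with an error saying the job is not configured.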
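The lookup-by-name behaviour described here can be pictured with a small plain-Java registry. This is an illustrative model, not the Spring Batch JobRegistry: `JobRegistrySketch` and its `NoSuchJob` exception are hypothetical names, and a `Runnable` stands in for a Job.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Plain-Java model of a name-to-job registry with unique names. */
final class JobRegistrySketch {

    /** Hypothetical failure for a name that was never registered. */
    static final class NoSuchJob extends RuntimeException {
        NoSuchJob(String name) { super("No job configured with name: " + name); }
    }

    private final Map<String, Runnable> jobs = new ConcurrentHashMap<>();

    /** Registers a job under a unique name; a second registration with the same name is rejected. */
    void register(String name, Runnable job) {
        if (jobs.putIfAbsent(name, job) != null) {
            throw new IllegalStateException("Duplicate job name: " + name);
        }
    }

    /** Looks the job up by name and runs it; unknown names fail before anything runs. */
    void start(String name) {
        Runnable job = jobs.get(name);
        if (job == null) {
            throw new NoSuchJob(name);
        }
        job.run();
    }

    public static void main(String[] args) {
        JobRegistrySketch registry = new JobRegistrySketch();
        registry.register("jobLauncherDemoJob4", () -> System.out.println("job ran"));
        registry.start("jobLauncherDemoJob4");
        try {
            registry.start("notRegistered");
        } catch (NoSuchJob e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Starting a job by name only works after something has registered it, which mirrors why the configuration above needs the registry wiring before JobOperator can find jobLauncherDemoJob4.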