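Spring Batch is a good choice for processing large volumes of data. Since our company's overall architecture is built on Spring Boot, I looked into how the two work together.
1. On the official site http://start.spring.io/, select the MySQL, Batch, and Web dependencies.
2. Define a MyBatchConfig class and annotate it with @Configuration (marks it as a configuration class) and @EnableBatchProcessing (enables Spring Batch). The code is as follows: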
    package com.kmm.config;

    import com.kmm.bean.Person;
    import com.kmm.listener.MyJobListener;
    import org.springframework.batch.core.Job;
    import org.springframework.batch.core.Step;
    import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
    import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
    import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
    import org.springframework.batch.core.launch.support.RunIdIncrementer;
    import org.springframework.batch.core.launch.support.SimpleJobLauncher;
    import org.springframework.batch.core.repository.JobRepository;
    import org.springframework.batch.core.repository.support.JobRepositoryFactoryBean;
    import org.springframework.batch.item.*;
    import org.springframework.batch.support.DatabaseType;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.transaction.PlatformTransactionManager;

    import javax.sql.DataSource;
    import java.util.List;

    @Configuration
    @EnableBatchProcessing
    public class MyBatchConfig {

        @Bean
        public ItemReader<Person> reader() throws Exception {
            ItemReader<Person> itemReader = new ItemReader<Person>() {
                @Override
                public Person read() throws Exception, UnexpectedInputException, ParseException, NonTransientResourceException {
                    System.out.println("reader");
                    // Demo only: a real reader must return null once the input is
                    // exhausted, otherwise the step never finishes.
                    return new Person();
                }
            };
            return itemReader;
        }

        @Bean
        public ItemProcessor<Person, Person> processor() {
            ItemProcessor<Person, Person> processor = new ItemProcessor<Person, Person>() {
                @Override
                public Person process(Person person) throws Exception {
                    System.out.println("processor");
                    return new Person();
                }
            };
            return processor;
        }

        @Bean
        public ItemWriter<Person> writer() {
            ItemWriter<Person> itemWriter = new ItemWriter<Person>() {
                @Override
                public void write(List<? extends Person> list) throws Exception {
                    System.out.println("writer");
                }
            };
            return itemWriter;
        }

        @Bean
        public JobRepository jobRepository(DataSource dataSource, PlatformTransactionManager transactionManager) throws Exception {
            JobRepositoryFactoryBean jobRepositoryFactoryBean = new JobRepositoryFactoryBean();
            jobRepositoryFactoryBean.setDataSource(dataSource);
            jobRepositoryFactoryBean.setTransactionManager(transactionManager);
            jobRepositoryFactoryBean.setDatabaseType(DatabaseType.MYSQL.name());
            return jobRepositoryFactoryBean.getObject();
        }

        // @Bean
        public SimpleJobLauncher jobLauncher(DataSource dataSource, PlatformTransactionManager transactionManager) throws Exception {
            SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
            jobLauncher.setJobRepository(this.jobRepository(dataSource, transactionManager));
            return jobLauncher;
        }

        @Bean
        public Job myJob(JobBuilderFactory jobs, Step step) {
            return jobs.get("myJob1")
                    .incrementer(new RunIdIncrementer())
                    .flow(step)                  // assign the Step to the Job
                    .end()
                    .listener(myJobListener())   // attach the listener
                    .build();
        }

        @Bean
        public Step step(StepBuilderFactory stepBuilderFactory, ItemReader<Person> reader,
                         ItemWriter<Person> writer, ItemProcessor<Person, Person> processor) {
            return stepBuilderFactory.get("MyStep")
                    .<Person, Person>chunk(5000) // commit every 5000 items
                    .reader(reader)              // bind the reader to the step
                    .processor(processor)        // bind the processor to the step
                    .writer(writer)              // bind the writer to the step
                    .build();
        }

        @Bean
        public MyJobListener myJobListener() {
            return new MyJobListener();
        }
    }
A Batch step runs in this order: reader, processor, writer.
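To make the reader, processor, writer order more concrete, here is a minimal plain-Java sketch of how I understand chunk-oriented processing: items are read and processed one at a time, then written one chunk at a time. The Reader, Processor and Writer interfaces and the runStep method are hypothetical stand-ins written for this illustration; they are not the real Spring Batch types, and the real framework also handles transactions, restarts and error handling.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Plain-Java sketch of chunk-oriented processing. The nested interfaces are
// hypothetical stand-ins, NOT the org.springframework.batch.item types.
class ChunkFlowSketch {

    interface Reader<T> { T read(); }                 // returns null at end of input
    interface Processor<I, O> { O process(I item); }
    interface Writer<O> { void write(List<? extends O> chunk); }

    /** Runs one step and returns the number of chunks handed to the writer. */
    static <I, O> int runStep(Reader<I> reader, Processor<I, O> processor,
                              Writer<O> writer, int chunkSize) {
        int chunksWritten = 0;
        List<O> chunk = new ArrayList<>();
        I item;
        // Read and process item by item until the reader signals the end with null.
        while ((item = reader.read()) != null) {
            chunk.add(processor.process(item));
            if (chunk.size() == chunkSize) {
                writer.write(new ArrayList<>(chunk)); // write a full chunk
                chunk.clear();
                chunksWritten++;
            }
        }
        // Write whatever is left over as a final, smaller chunk.
        if (!chunk.isEmpty()) {
            writer.write(new ArrayList<>(chunk));
            chunksWritten++;
        }
        return chunksWritten;
    }

    public static void main(String[] args) {
        Iterator<String> source = Arrays.asList("a", "b", "c", "d", "e").iterator();
        List<String> written = new ArrayList<>();

        Reader<String> reader = () -> source.hasNext() ? source.next() : null;
        Processor<String, String> processor = String::toUpperCase;
        Writer<String> writer = written::addAll;

        int chunks = runStep(reader, processor, writer, 2);
        System.out.println(chunks + " chunks, items: " + written);
    }
}
```

With five items and a chunk size of 2, the writer is called three times (two full chunks and one leftover item). Note that the loop only ends because the reader eventually returns null.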
The processor can also be a custom class, for example:
    import com.kmm.bean.Person;
    import org.springframework.batch.item.ItemProcessor;

    public class MyProcessor implements ItemProcessor<Person, Person> {

        @Override
        public Person process(Person person) throws Exception {
            // Pass the item through unchanged; transformation logic would go here.
            return person;
        }
    }
The reader and writer can be customized in the same way.
3. Use a Controller to test the job:
    import org.springframework.batch.core.Job;
    import org.springframework.batch.core.JobParameters;
    import org.springframework.batch.core.JobParametersBuilder;
    import org.springframework.batch.core.launch.JobLauncher;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.RestController;

    @RestController
    public class DemoController {

        // Inject the JobLauncher interface rather than the SimpleJobLauncher class,
        // so the launcher registered by @EnableBatchProcessing can be used.
        @Autowired
        JobLauncher jobLauncher;

        @Autowired
        Job importJob;

        @RequestMapping("/test")
        public void imp() throws Exception {
            // Built per request (not stored in a field), so concurrent calls do not share state.
            JobParameters jobParameters = new JobParametersBuilder()
                    .addLong("time", System.currentTimeMillis())
                    .toJobParameters();
            jobLauncher.run(importJob, jobParameters);
        }
    }
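The controller adds the current time as a "time" parameter on every call. As far as I understand, this matters because Spring Batch identifies a job instance by the job name plus its identifying parameters, and it refuses to run an instance that has already completed a second time. The plain-Java sketch below only imitates that rule; JobInstanceSketch and its launch method are hypothetical names made up for this illustration and are not part of Spring Batch.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical stand-in that imitates how a job instance is keyed by
// job name + parameters. NOT the real Spring Batch JobRepository.
class JobInstanceSketch {

    private final Set<String> completedInstances = new HashSet<>();

    /** Returns true if the launch is accepted, false if this instance already ran. */
    boolean launch(String jobName, Map<String, Long> params) {
        // Sort the parameters so the key does not depend on insertion order.
        String key = jobName + "|" + new TreeMap<>(params);
        return completedInstances.add(key);
    }

    public static void main(String[] args) {
        JobInstanceSketch repo = new JobInstanceSketch();

        // Same name and same parameters: the second launch is rejected.
        System.out.println(repo.launch("myJob1", Collections.singletonMap("time", 1L)));
        System.out.println(repo.launch("myJob1", Collections.singletonMap("time", 1L)));

        // A fresh timestamp gives a new instance, so the launch is accepted again.
        System.out.println(repo.launch("myJob1",
                Collections.singletonMap("time", System.currentTimeMillis())));
    }
}
```

In this sketch a rejected launch simply returns false; in real Spring Batch, relaunching a completed instance with identical parameters fails with an exception instead.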
4. Configure the MySQL data source (YAML file):
    spring:
      datasource:
        driver-class-name: com.mysql.jdbc.Driver
        url: jdbc:mysql://localhost:3306/batch?useSSL=false
        username: root
        password: root
5. Configure Batch (YAML file):
    spring:
      batch:
        initialize-schema: always
        job:
          enabled: false
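spring.batch.initialize-schema: always creates the Spring Batch tables in the database at startup (Boot 1 and Boot 2 configure this initialization differently).
spring.batch.job.enabled: false prevents the job from running automatically when the application starts.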
6. The initialized database tables:
7. Run result:
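8. To run the job on a schedule, add @Scheduled(cron = "0 0/5 * * * ?") to the method so that it runs every five minutes (scheduling also has to be switched on with @EnableScheduling on a configuration class). The code is as follows: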
    import org.springframework.batch.core.Job;
    import org.springframework.batch.core.JobParameters;
    import org.springframework.batch.core.JobParametersBuilder;
    import org.springframework.batch.core.launch.JobLauncher;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.scheduling.annotation.Scheduled;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.RestController;

    @RestController
    public class DemoController {

        // Inject the JobLauncher interface rather than the SimpleJobLauncher class,
        // so the launcher registered by @EnableBatchProcessing can be used.
        @Autowired
        JobLauncher jobLauncher;

        @Autowired
        Job importJob;

        @Scheduled(cron = "0 0/5 * * * ?")   // second 0 of every fifth minute
        @RequestMapping("/test")
        public void imp() throws Exception {
            // Built per run (not stored in a field), so overlapping calls do not share state.
            JobParameters jobParameters = new JobParametersBuilder()
                    .addLong("time", System.currentTimeMillis())
                    .toJobParameters();
            jobLauncher.run(importJob, jobParameters);
        }
    }
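The cron expression "0 0/5 * * * ?" has six fields (second, minute, hour, day of month, month, day of week), so it fires at second 0 of minutes 0, 5, 10 and so on, not five minutes after the application starts. The small plain-Java sketch below computes the next such moment for a given time, only to illustrate that schedule; FiveMinuteCronSketch and nextRun are hypothetical names for this example and do not use Spring's own cron parser.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

// Illustrates when "0 0/5 * * * ?" fires. Hypothetical helper for this post,
// not Spring's cron implementation.
class FiveMinuteCronSketch {

    /** Next moment strictly after 'now' with second 0 and a minute that is a multiple of 5. */
    static LocalDateTime nextRun(LocalDateTime now) {
        // Drop seconds and nanoseconds, then step forward to the next multiple of 5.
        LocalDateTime minuteStart = now.truncatedTo(ChronoUnit.MINUTES);
        int minutesToAdd = 5 - (minuteStart.getMinute() % 5);
        return minuteStart.plusMinutes(minutesToAdd);
    }

    public static void main(String[] args) {
        LocalDateTime now = LocalDateTime.of(2024, 1, 1, 10, 3, 20);
        System.out.println("now:      " + now);
        System.out.println("next run: " + nextRun(now));
    }
}
```

For a start time of 10:03:20 the next run is 10:05:00, and a call made exactly at 10:05:00 yields 10:10:00.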
I have only just started looking into this, so if you spot any problems, feel free to discuss them.