Several Ways to Batch Insert Data with JPA

Using MySQL as the example database, this article walks through several efficient ways to batch insert data and compares them, simulating 100,000 rows.

1. Environment Configuration

application.yml configuration

server:
  port: 8086
spring:
  application:
    name: batch
  jpa:
    database: mysql
    show-sql: true
    properties:
      hibernate:
        dialect: org.hibernate.dialect.MySQL5InnoDBDialect
        generate_statistics: true   # log batching statistics so the effect of these settings can be verified
        jdbc:
          batch_size: 500           # group up to 500 statements per JDBC batch
          batch_versioned_data: true
        order_inserts: true         # order inserts by entity type so they can be batched together
        order_updates: true
  datasource:
    # rewriteBatchedStatements=true lets the MySQL driver rewrite a batch into a single multi-row INSERT
    url: jdbc:mysql://localhost:3306/hr?rewriteBatchedStatements=true&serverTimezone=UTC&useUnicode=true&characterEncoding=utf-8&useSSL=true&allowMultiQueries=true
    username: root
    password: ****
    driver-class-name: com.mysql.cj.jdbc.Driver

Test data:

    private final List<User> userList = new ArrayList<>();

    public void inits() {
        // simulate 100,000 rows
        for (int i = 0; i < 100000; i++) {
            User user = new User();
            user.setAge(i);
            user.setId(i + "");
            user.setName("name" + i);
            userList.add(user);
        }
    }
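
For reference, a minimal sketch of the assumed User entity; the field types are inferred from the code above, and the table/column mapping from the INSERT statements below, so treat the annotations and names as assumptions:

    @Entity
    @Table(name = "t_user")
    public class User {
        @Id
        private String id;      // assigned manually in inits(); no @GeneratedValue
        private String name;
        private Integer age;

        public String getId() { return id; }
        public void setId(String id) { this.id = id; }
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public Integer getAge() { return age; }
        public void setAge(Integer age) { this.age = age; }
    }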

2. Comparison of Several Batch Insert Approaches

(1) JPA's saveAll method

The traditional JPA saveAll method is still too slow: even with the configuration above, inserting 100,000 rows takes about 23 s, as in the sketch below.
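
For reference, a minimal sketch of the saveAll call, assuming a userDao that extends JpaRepository<User, String> as in the tests further down:

    @Test
    public void testSaveAll() {
        long saveStart = System.currentTimeMillis();
        userDao.saveAll(userList);   // took roughly 23 s for 100,000 rows in this test
        long saveEnd = System.currentTimeMillis();
        System.out.println("the save total time is " + (saveEnd - saveStart) + " ms");
        userDao.deleteAllInBatch();
    }

One likely reason for the slowness is that the id here is assigned manually rather than generated, so save() cannot tell the entity is new and falls back to merge(), which can issue an extra SELECT per row unless the entity implements Persistable.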

(2) Using EntityManager's persist method

    @PersistenceContext
    private EntityManager em;

    private static final int BATCH_SIZE = 10000;

    /**
     * Batch insert via persist; requires the batch configuration above.
     * Inserting 100,000 rows took 2549 ms in this test.
     * @param list entities to insert
     */
    @Transactional(rollbackFor = Exception.class)
    public void batchInsertWithEntityManager(List<User> list) {
        Iterator<User> iterator = list.iterator();
        int index = 0;
        while (iterator.hasNext()) {
            em.persist(iterator.next());
            index++;
            // flush and clear periodically to keep the persistence context small
            if (index % BATCH_SIZE == 0) {
                em.flush();
                em.clear();
            }
        }
        // flush the remaining entities that did not fill a full batch
        if (index % BATCH_SIZE != 0) {
            em.flush();
            em.clear();
        }
    }

Result:

    @Test
    public void testBatchInsert() {
        long saveStart = System.currentTimeMillis();
        batchDao.batchInsertWithEntityManager(userList);
        long saveEnd = System.currentTimeMillis();
        System.out.println("the save total time is " + (saveEnd - saveStart) + " ms"); // the save total time is 2549 ms
        userDao.deleteAllInBatch(); // single statement, deletes all rows in one batch
    }

(3) Using JdbcTemplate's batchUpdate method

    @Autowired
    private JdbcTemplate jdbcTemplate;

    /**
     * Batch insert via JdbcTemplate.batchUpdate; the SQL must be written by hand,
     * and the datasource configuration above (rewriteBatchedStatements) is still needed.
     * @param list entities to insert
     */
    public void batchWithJDBCTemplate(List<User> list) {
        String sql = "Insert into t_user(id,name,age) values(?,?,?)";
        jdbcTemplate.batchUpdate(sql, new BatchPreparedStatementSetter() {
            @Override
            public void setValues(PreparedStatement ps, int i) throws SQLException {
                ps.setString(1, list.get(i).getId());
                ps.setString(2, list.get(i).getName());
                ps.setInt(3, list.get(i).getAge());
            }

            @Override
            public int getBatchSize() {
                return list.size();
            }
        });
    }

Result:

    @Test
    public void testBatchWithJDBC() {
        long saveStart = System.currentTimeMillis();
        batchDao.batchWithJDBCTemplate(userList);
        long saveEnd = System.currentTimeMillis();
        System.out.println("the save total time is " + (saveEnd - saveStart) + " ms"); // the save total time is 1078 ms
        userDao.deleteAllInBatch(); // single statement, deletes all rows in one batch
    }

(4) Native SQL with raw JDBC

    /**
     * Batch insert with raw JDBC; does not depend on the Hibernate batch settings,
     * though rewriteBatchedStatements=true on the JDBC URL still matters.
     * @param list entities to insert
     */
    public void batchWithNativeSql(List<User> list) throws SQLException {
        String sql = "Insert into t_user(id,name,age) values(?,?,?)";
        DataSource dataSource = jdbcTemplate.getDataSource();
        // try-with-resources so the connection and statement are always closed
        try (Connection connection = dataSource.getConnection();
             PreparedStatement ps = connection.prepareStatement(sql)) {
            connection.setAutoCommit(false);
            final int batchSize = 10000;
            int count = 0;
            for (User user : list) {
                ps.setString(1, user.getId());
                ps.setString(2, user.getName());
                ps.setInt(3, user.getAge());
                ps.addBatch();
                count++;
                // send a batch to the server every batchSize rows, and for the final partial batch
                if (count % batchSize == 0 || count == list.size()) {
                    ps.executeBatch();
                    ps.clearBatch();
                }
            }
            connection.commit();
        }
    }

Result:

    @Test
    public void testBatchWithNativeSql() throws SQLException {
        long saveStart = System.currentTimeMillis();
        batchDao.batchWithNativeSql(userList);
        long saveEnd = System.currentTimeMillis();
        System.out.println("the save total time is " + (saveEnd - saveStart) + " ms"); // the save total time is 899 ms
        userDao.deleteAllInBatch();
    }

3. Conclusion

Both JdbcTemplate's batchUpdate and the raw JDBC approach handle large data volumes well (in this test, 100,000 rows took roughly 23 s with saveAll, 2549 ms with EntityManager, 1078 ms with batchUpdate, and 899 ms with raw JDBC); the former requires the configuration above, while the latter does not. For small batches, choose whichever approach fits your needs.

A friendly reminder: if the configuration seems to have no effect, re-check the configuration file, and also check whether the table has triggers. If it does, coordinate with the people responsible for them; it is advisable to extract the business logic covered by the trigger into explicit batch inserts on the tables involved.
