An earlier post, "Querying huge data volumes: how to avoid JVM OOM with batched query processing", covered the idea of splitting oversized queries into batches.
In enterprise systems, file export is needed in many places, and code written by newcomers often contains SQL like this:
# query the whole table
select * from table_name;
# query a huge range (very often effectively the whole table)
select * from tab_name limit 0, 9999999;
Whenever a production machine OOMs and I trace it back to the code, the culprit is very often an export: the query has a huge number after limit, and seeing it almost makes me cough up blood.
The cause of the problem, and the fix, are essentially the same as in the earlier post "Querying huge data volumes: how to avoid JVM OOM with batched query processing", so I won't repeat them here.
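To make the batching idea concrete, here is a minimal, self-contained sketch. The `pageQuery` function here is hypothetical; in a real project it would delegate to MyBatis-Plus paging or a `limit offset, size` query:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;

// Sketch of the batched-query idea: fetch one page at a time and process it
// immediately, so only a single page lives in memory instead of the whole
// result set. Class and method names are illustrative, not from the post.
public class BatchQuerySketch {
    public static <T> int processInBatches(int batchSize,
                                           BiFunction<Integer, Integer, List<T>> pageQuery,
                                           Consumer<List<T>> handler) {
        int offset = 0;
        int total = 0;
        while (true) {
            List<T> page = pageQuery.apply(offset, batchSize);
            if (page.isEmpty()) break;
            handler.accept(page);                // process, then let the page be GC'd
            total += page.size();
            if (page.size() < batchSize) break;  // last (partial) page
            offset += batchSize;
        }
        return total;
    }

    public static void main(String[] args) {
        // fake table with 10 rows, page size 4 -> pages of 4, 4, 2
        List<Integer> table = new ArrayList<>();
        for (int i = 0; i < 10; i++) table.add(i);
        int processed = processInBatches(4,
                (offset, size) -> table.subList(offset, Math.min(offset + size, table.size())),
                page -> System.out.println("page size: " + page.size()));
        System.out.println("processed: " + processed); // processed: 10
    }
}
```

Each iteration holds only `batchSize` rows on the heap, which is the whole point of the approach.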
For file export, I'd encourage developers who are new to this to think the problem through for themselves first. Based on the analysis above, and building on that earlier post, here is a corresponding export implementation based on MyBatis-Plus. The BulkExecutorUtil class shown in the earlier post is not repeated here.
/**
 * @author caoyong
 * @version 1.0.0
 * @since 2022-02-28 10:24:08
 * defines the basic parameters
 **/
@Data
@Builder
public class SegmentDownParam<T, V> {
    /**
     * default size
     */
    public static final Integer DEFAULT_SIZE = 2000;
    /**
     * batch size
     */
    @Builder.Default
    private Integer batchSize = DEFAULT_SIZE;
    /**
     * query wrapper
     */
    private Wrapper<T> queryWrapper;
    /**
     * query service
     */
    private IService<T> service;
    /**
     * download show name
     */
    @NonNull
    private String showName;
    /**
     * transfer T (model that MyBatis-Plus generated) to V (VO that the service responds with)
     */
    @NonNull
    private Function<List<T>, List<V>> transFunc;
    /**
     * write function
     */
    @NonNull
    private BiConsumer<List<V>, ExcelWriter> writeFunc;
    /**
     * export VO class
     */
    @NonNull
    private Class<?> exportClass;
    /**
     * the class of the service query response VO
     */
    @NonNull
    private Class<V> voClass;
    /**
     * count: if present, use it; otherwise use service.count()
     */
    @Builder.Default
    private Integer count = 0;
    /**
     * query function
     */
    private Function<Integer, List<T>> queryFunc;
}
Core implementation:
/**
 * @author caoyong
 * @version 1.0.0
 * @since 2022-02-28 10:05:23
 **/
@Slf4j
public class SegmentDownUtil {
    /**
     * generate the download file
     *
     * @param param export parameters
     * @param <T>   entity type queried via MyBatis-Plus
     * @param <V>   transferred VO type
     * @return download file path
     */
    public static <T, V> String generateFile(SegmentDownParam<T, V> param) {
        //download file path
        ExcelWriter excelWriter = null;
        String downloadFilePath = AttachmentHelper.tempFilePatch() + Common.SPLIT + param.getShowName() + ".xlsx";
        try {
            excelWriter = EasyExcel.write(downloadFilePath, param.getExportClass()).build();
            BiConsumer<List<V>, ExcelWriter> writeFunc = param.getWriteFunc();
            //service used for querying (and for the count when none is supplied)
            IService<T> service = param.getService();
            //parallelism based on available processors
            int parallelism = Runtime.getRuntime().availableProcessors() * 2 + 1;
            BulkExecutorParam<T> exeParam = BulkExecutorParam.<T>builder()
                    .queryWrapper(param.getQueryWrapper())
                    .batchSize(param.getBatchSize())
                    .count(param.getCount())
                    .queryFunc(param.getQueryFunc())
                    .service(service)
                    .parallelism(parallelism)
                    .build();
            //write each batch's list as JSON to a local temp file
            List<Future<File>> futures = BulkExecutorUtil.submit(exeParam, list -> {
                File file = new File(AttachmentHelper.tempFilePatch() + IdWorker.getId());
                List<V> transferredRows = param.getTransFunc().apply(list);
                byte[] bytes = JSONArray.toJSONBytes(transferredRows, SerializerFeature.WriteDateUseDateFormat,
                        SerializerFeature.WriteBigDecimalAsPlain);
                try (FileOutputStream outputStream = new FileOutputStream(file)) {
                    IOUtils.write(bytes, outputStream);
                } catch (IOException e) {
                    log.error("write temp json file error, path:{}", file.getAbsolutePath(), e);
                }
                return file;
            });
            if (CollectionUtil.isEmpty(futures)) {
                writeFunc.accept(new ArrayList<>(), excelWriter);
                return downloadFilePath;
            }
            //iterate over all temp files and merge them into the Excel file
            for (Future<File> future : futures) {
                try {
                    File file = future.get();
                    try (FileInputStream in = new FileInputStream(file)) {
                        String dataString = IOUtils.toString(in, StandardCharsets.UTF_8);
                        List<V> list = JSONArray.parseArray(dataString, param.getVoClass());
                        writeFunc.accept(list, excelWriter);
                    } catch (Exception e) {
                        log.error("merge temp file error, path:{}", file.getAbsolutePath(), e);
                    }
                    boolean isDeleted = file.delete();
                    if (!isDeleted) {
                        log.warn("delete temp json file error, path:{}", file.getAbsolutePath());
                    }
                } catch (Exception e) {
                    log.error("get future file error:{}", e.getMessage(), e);
                }
            }
        } finally {
            if (excelWriter != null) {
                excelWriter.finish();
            }
        }
        return downloadFilePath;
    }
}
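The interesting part of `generateFile` is the spill-and-merge pattern: each batch is serialized to its own temp file, then the files are read back one at a time and appended to the output, so peak memory stays around one batch. Here is a minimal, stdlib-only sketch of that pattern; plain text lines stand in for the JSON serialization, and the class and method names are illustrative:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Spill-and-merge sketch: write each batch to its own temp file, then read
// the files back one at a time and append them to the final result, so only
// one batch is ever held in memory during the merge.
public class SpillMergeSketch {
    public static Path spill(List<String> batch) throws IOException {
        Path tmp = Files.createTempFile("batch-", ".txt");
        Files.write(tmp, batch);
        return tmp;
    }

    public static List<String> merge(List<Path> parts) throws IOException {
        List<String> out = new ArrayList<>();
        for (Path p : parts) {
            out.addAll(Files.readAllLines(p)); // one part in memory at a time
            Files.deleteIfExists(p);           // clean up, as the util above does
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        List<Path> parts = new ArrayList<>();
        parts.add(spill(List.of("a", "b")));
        parts.add(spill(List.of("c")));
        System.out.println(merge(parts)); // [a, b, c]
    }
}
```

In the real implementation the "merge" step streams each part into the `ExcelWriter` rather than into one big list, which keeps the bound on memory.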
Usage example:
LambdaQueryWrapper<T> wrapper = Wrappers.lambdaQuery(T.class);
SegmentDownUtil.generateFile(SegmentDownParam
        .<T, V>builder()
        .showName("display name")
        .service(this)
        .transFunc(this::fillRows)
        .writeFunc(this::writeExcelFile)
        //VO class used for the export
        .exportClass(ExportVo.class)
        .queryWrapper(wrapper)
        .voClass(V.class)
        .build());
/**
 * fill rows and convert T -> V
 *
 * @param rows entity rows
 * @return transferred and filled rows
 */
private List<V> fillRows(List<T> rows) {
    return rows.stream().map(value -> {
        //conversion logic goes here; V stands in for your concrete VO class
        return new V();
    }).collect(Collectors.toList());
}
/**
 * write Excel file
 *
 * @param originalData original data
 * @param writer       excel writer
 */
private void writeExcelFile(List<V> originalData, ExcelWriter writer) {
    WriteSheet writeSheet = EasyExcel.writerSheet("export example").build();
    List<ExportVo> exportList = originalData.parallelStream()
            .map(vo -> {
                ExportVo exportVo = new ExportVo();
                BeanUtils.copyProperties(vo, exportVo);
                exportVo.setOpTime(LocalDateTimeUtil.format(vo.getOpTime(), DateUtil.YYYY_MM_DD_HH_MM_SS));
                return exportVo;
            }).collect(Collectors.toList());
    writer.write(exportList, writeSheet);
}
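One detail worth calling out in writeExcelFile: dates are rendered to strings before writing. Assuming Hutool's DateUtil.YYYY_MM_DD_HH_MM_SS is the usual "yyyy-MM-dd HH:mm:ss" pattern, the formatting call can be reproduced with the JDK's own DateTimeFormatter; this is a hedged stand-in, not the post's actual utility:

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Stand-in for the opTime formatting done inside writeExcelFile, using only
// java.time. The pattern is assumed to match DateUtil.YYYY_MM_DD_HH_MM_SS.
public class RowMappingSketch {
    static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    public static String formatOpTime(LocalDateTime opTime) {
        return FMT.format(opTime);
    }

    public static void main(String[] args) {
        System.out.println(formatOpTime(LocalDateTime.of(2022, 2, 28, 10, 5, 23)));
        // 2022-02-28 10:05:23
    }
}
```

Pre-formatting the date as a String keeps the export class simple; the alternative is EasyExcel's own date-format annotations on the export field.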