While implementing file upload I found that large files caused problems on both the sending and the receiving side, so I decided to implement chunked sending and writing to support large-file uploads.
After some research I found that RandomAccessFile fits the requirement for random-access file I/O nicely. This post focuses on how to use RandomAccessFile to read and write a file in chunks; I won't go into the RandomAccessFile API itself.
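Since the API itself isn't covered here, below is a minimal, self-contained sketch of the two calls everything in this post relies on: seek moves the file pointer to an absolute byte offset, and read/write then operate from that position (the temp file and method name are just for illustration):

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

public class RandomAccessFileDemo {

    // Write some bytes, then jump to an absolute offset and read from there.
    static String readAt(long offset, int len) throws IOException {
        File tmp = File.createTempFile("raf-demo", ".bin");
        tmp.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(tmp, "rw")) { // "rw" = read + write
            raf.write("hello world".getBytes(StandardCharsets.UTF_8));
            raf.seek(offset);      // move the file pointer to any absolute offset
            byte[] buf = new byte[len];
            int n = raf.read(buf); // reading resumes from the pointer
            return new String(buf, 0, n, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readAt(6, 5)); // world
    }
}
```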
First, define a chunk entity to be sent over HTTP:
@Data
@Schema(description = "file chunk info")
public class FilePartVo {

    @JsonFormat(shape = JsonFormat.Shape.STRING)
    @NotNull(message = "file id must not be null")
    @Schema(description = "chunk file id; all chunks of the same file share the same id")
    private Long id;

    @NotNull(message = "current part must not be null")
    @Schema(description = "current part number, starting from 1, in file order")
    private Integer currentPart;

    @NotNull(message = "chunk start offset must not be null")
    @Schema(description = "start offset of this chunk within the file")
    private Long starLocation;

    @NotNull(message = "total part count must not be null")
    @Schema(description = "total number of parts for the file")
    private Long totalPart;

    @NotNull(message = "current part size must not be null")
    @Schema(description = "size of this part, in bytes")
    private Long currentSize;

    @NotNull(message = "file size must not be null")
    @Schema(description = "file size, in bytes")
    private Long totalSize;

    @NotEmpty(message = "chunk content must not be empty")
    @Schema(description = "content of this chunk, as a base64 string")
    private String content;

    @NotEmpty(message = "file name must not be empty")
    @Schema(description = "file name")
    private String fileName;
}
Reading the file in chunks with RandomAccessFile:
Given a fixed chunk size, read the file chunk by chunk with RandomAccessFile, encode each chunk's bytes as a base64 string, and post the string along with its metadata. The implementation:
RestTemplate restTemplate = new RestTemplate();
RandomAccessFile randomAccessFile = null;
try {
    randomAccessFile = new RandomAccessFile("D:\\code\\test\\output\\test_1.0.0_202306300423.tar.gz", "r");
    long totalSize = randomAccessFile.length();
    int partSize = 1024 * 1024 * 10; // 10 MB per chunk
    long totalPart = totalSize / partSize == 0 ? 1 : (totalSize % partSize == 0 ? totalSize / partSize : totalSize / partSize + 1);
    byte[] buffer = new byte[partSize];
    long startLocation = 0;
    int currentPart = 1;
    int readLength = 0;
    Long id = SnowflakeIdWorkerUtil.generateId();
    while ((readLength = randomAccessFile.read(buffer)) != -1) {
        FilePartVo filePartVo = new FilePartVo();
        filePartVo.setId(id);
        if (readLength < partSize) {
            // the last read may not fill the whole buffer
            byte[] newBuffer = new byte[readLength];
            System.arraycopy(buffer, 0, newBuffer, 0, readLength);
            filePartVo.setContent(Base64.getEncoder().encodeToString(newBuffer));
        } else {
            filePartVo.setContent(Base64.getEncoder().encodeToString(buffer));
        }
        filePartVo.setCurrentSize((long) readLength);
        filePartVo.setFileName("test_1.0.0_202306300423.tar.gz");
        filePartVo.setCurrentPart(currentPart);
        filePartVo.setTotalPart(totalPart);
        filePartVo.setStarLocation(startLocation);
        filePartVo.setTotalSize(totalSize);
        BaseResult baseResult = restTemplate.postForObject("http://localhost:10002/core-server/file/partUpload", filePartVo, BaseResult.class);
        LOGGER.info("chunk uploaded: id:{}, part:{}, size:{}, name:{}, startLocation:{}",
                filePartVo.getId(), filePartVo.getCurrentPart(), filePartVo.getCurrentSize(), filePartVo.getFileName(), filePartVo.getStarLocation());
        BaseResultHandler.dealBaseResult(baseResult, "chunk upload", null);
        currentPart++;
        startLocation += readLength;
    }
} catch (IOException e) {
    LOGGER.error("failed to read file chunk", e);
} finally {
    if (randomAccessFile != null) {
        try {
            randomAccessFile.close();
        } catch (IOException e) {
            LOGGER.error("failed to close file", e);
        }
    }
}
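The ternary that computes totalPart above is just a ceiling division in disguise; for any non-empty file it can be written more directly. A small standalone check (same 10 MB part size assumed):

```java
public class PartCountDemo {

    // Number of fixed-size parts needed to cover totalSize bytes (ceiling division).
    static long totalPart(long totalSize, long partSize) {
        return (totalSize + partSize - 1) / partSize;
    }

    public static void main(String[] args) {
        long partSize = 1024 * 1024 * 10; // 10 MB, as in the upload loop
        System.out.println(totalPart(partSize * 2 + 1, partSize)); // 3
        System.out.println(totalPart(partSize * 2, partSize));     // 2
        System.out.println(totalPart(1, partSize));                // 1
    }
}
```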
Writing chunk data with RandomAccessFile:
To support uploading multiple files concurrently, the write side uses a thread pool plus per-file locks, so several files can be written at the same time.
The write DTO:
@Data
@AllArgsConstructor
public class FileWriteDto {
    // start offset of this chunk within the file
    private long start;
    // chunk content (base64)
    private String content;
}
Next, create a per-file lock-and-data object. All chunks of the same file share a single instance: the threads writing one file share the same readWriteLock, and fileWriteBlockingDeque holds that file's pending chunks.
public class FileLockDto {

    private String filePath;
    private Long id;

    /**
     * Read-write lock; each file has exactly one.
     */
    private ReentrantReadWriteLock readWriteLock = new ReentrantReadWriteLock();

    /**
     * Queue holding the pending chunks of this file.
     */
    private LinkedBlockingDeque<FileWriteDto> fileWriteBlockingDeque = new LinkedBlockingDeque<>(200);

    public FileLockDto(Long id, String filePath) {
        this.id = id;
        this.filePath = filePath;
    }

    public boolean addData(long start, String content) {
        FileWriteDto fileWriteDto = new FileWriteDto(start, content);
        try {
            fileWriteBlockingDeque.putLast(fileWriteDto);
            LOGGER.info("chunk enqueued: file:{}, start:{}, pending:{}", filePath, start, fileWriteBlockingDeque.size());
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
            LOGGER.error("failed to enqueue chunk", e);
        }
        return false;
    }

    public FileWriteDto getFileWriteDto() {
        FileWriteDto fileWriteDto = fileWriteBlockingDeque.pollFirst();
        LOGGER.info("chunk dequeued, remaining:{}", fileWriteBlockingDeque.size());
        return fileWriteDto;
    }

    public ReentrantReadWriteLock getReadWriteLock() {
        return readWriteLock;
    }

    public LinkedBlockingDeque<FileWriteDto> getFileWriteBlockingDeque() {
        return fileWriteBlockingDeque;
    }

    public String getFilePath() {
        return filePath;
    }
}
The utility class that callers invoke:
public class FilePartHandle {

    // files currently being written, keyed by file id
    private static final Map<Long, FileLockDto> writingFile = new ConcurrentHashMap<>();
    private static final Map<Long, Date> writingDate = new ConcurrentHashMap<>();

    // thread pool
    private static final ThreadPoolExecutor threadPoolExecutor =
            new ThreadPoolExecutor(5, 10, 10, TimeUnit.SECONDS, new LinkedBlockingDeque<>(100));

    private static final Object lock = new Object();

    private static synchronized boolean createFile(String filePath, Long id, Long totalSize) {
        File file = new File(filePath);
        if (!file.exists()) {
            LOGGER.info("creating file: {}", filePath);
            try {
                if (!file.createNewFile()) {
                    return false;
                }
            } catch (IOException e) {
                LOGGER.error("failed to create file", e);
                return false;
            }
        }
        // record activity so stale entries can be expired later
        writingDate.put(id, new Date());
        return true;
    }

    public static BaseResult writePart(FilePartVo filePartVo, String filePath) {
        if (StringUtils.isEmpty(filePartVo.getContent())) {
            LOGGER.error("chunk content is empty: {}", JsonUtil.toJson(filePartVo));
            return BaseResult.success();
        }
        if (!createFile(filePath, filePartVo.getId(), filePartVo.getTotalSize())) {
            return BaseResult.fail("failed to create file: " + filePath);
        }
        synchronized (lock) {
            if (!writingFile.containsKey(filePartVo.getId())) {
                writingFile.put(filePartVo.getId(), new FileLockDto(filePartVo.getId(), filePath));
            }
        }
        // fetch the per-file object and enqueue this chunk
        FileLockDto fileLockDto = writingFile.get(filePartVo.getId());
        fileLockDto.addData(filePartVo.getStarLocation(), filePartVo.getContent());
        threadPoolExecutor.execute(new FileWriteRunnable(fileLockDto));
        return BaseResult.success("chunk write started");
    }

    public static void finishWrite(Long id) {
        writingFile.remove(id);
    }

    public static void deleteFile(String filePath) {
        FileUtil.deleteFileByPath(filePath);
    }

    public static int cleanExpireFile() {
        long now = System.currentTimeMillis();
        // ConcurrentHashMap tolerates removal while iterating
        for (Long key : writingDate.keySet()) {
            Date date = writingDate.get(key);
            if (date != null && now - date.getTime() > 30 * 60 * 1000) {
                writingDate.remove(key);
                writingFile.remove(key);
            }
        }
        return writingDate.size();
    }
}
A Runnable invokes the actual write:
public class FileWriteRunnable implements Runnable {

    private FileLockDto fileLockDto;

    public FileWriteRunnable(FileLockDto fileLockDto) {
        this.fileLockDto = fileLockDto;
    }

    @Override
    public void run() {
        FileWriteService fileWriteService = new FileWriteService(fileLockDto.getFilePath(),
                fileLockDto.getReadWriteLock(), fileLockDto.getFileWriteBlockingDeque());
        fileWriteService.startWrite();
    }
}
FileWriteService pulls chunk data from the FileLockDto queue and writes each chunk at its recorded offset. The write lock guarantees that only one thread touches a given file at a time; the writer loops, draining the queue until it is empty.
public class FileWriteService {

    private String filePath;
    private LinkedBlockingDeque<FileWriteDto> fileWriteBlockingDeque;
    private ReentrantReadWriteLock readWriteLock;

    public FileWriteService(String filePath, ReentrantReadWriteLock readWriteLock, LinkedBlockingDeque<FileWriteDto> fileWriteBlockingDeque) {
        this.filePath = filePath;
        this.readWriteLock = readWriteLock;
        this.fileWriteBlockingDeque = fileWriteBlockingDeque;
    }

    public void startWrite() {
        if (StringUtils.isEmpty(filePath)) {
            LOGGER.error("file path is missing");
            return;
        }
        try {
            // another thread may already be draining this file's queue
            boolean lockResult = readWriteLock.writeLock().tryLock(10, TimeUnit.SECONDS);
            if (!lockResult) {
                return;
            }
            RandomAccessFile randomAccessFile = null;
            try {
                randomAccessFile = new RandomAccessFile(new File(filePath), "rw");
                while (!fileWriteBlockingDeque.isEmpty()) {
                    FileWriteDto fileWriteDto = fileWriteBlockingDeque.pollFirst();
                    if (fileWriteDto != null) {
                        // jump to the chunk's offset and write the decoded bytes
                        randomAccessFile.seek(fileWriteDto.getStart());
                        byte[] content = Base64.getDecoder().decode(fileWriteDto.getContent());
                        LOGGER.info("chunk written: start:{}, size:{}", fileWriteDto.getStart(), content.length);
                        randomAccessFile.write(content);
                    }
                    LOGGER.info("chunks remaining:{}", fileWriteBlockingDeque.size());
                }
            } catch (IOException e) {
                LOGGER.error("failed to write chunk", e);
            } finally {
                if (randomAccessFile != null) {
                    try {
                        randomAccessFile.close();
                    } catch (IOException e) {
                        LOGGER.error("failed to close file", e);
                    }
                }
                readWriteLock.writeLock().unlock();
            }
        } catch (InterruptedException e) {
            LOGGER.error("write thread interrupted", e);
        }
    }
}
Two locks are at work here. The first guarantees that each file has exactly one queue-and-lock object: when a new chunk arrives, it is appended to that file's existing object. The second guarantees that only one thread writes a given file at any moment. Together they allow different files to be written by multiple threads concurrently, while all writers of the same file share one lock.
That completes chunked reading and writing of a file. One loose end remains: finished entries must be cleaned out of FilePartHandle.writingFile, so a Timer runs a periodic cleanup:
private static boolean isStarting = false;

public synchronized static void startTimer() {
    if (isStarting) {
        return;
    }
    LOGGER.info("starting periodic cleanup of chunk-upload state");
    isStarting = true;
    Timer timer = new Timer();
    // a cancelled TimerTask cannot be rescheduled, so build a fresh one on each start
    TimerTask timerTask = new TimerTask() {
        @Override
        public void run() {
            if (FilePartHandle.cleanExpireFile() == 0) {
                LOGGER.info("no pending files left, stopping cleanup task");
                timer.cancel(); // stop the timer thread as well, so the task can be restarted later
                isStarting = false;
            }
        }
    };
    timer.scheduleAtFixedRate(timerTask, 1000 * 60, 1000 * 60);
}
I also store each chunk's part number and size in redis, to verify that the uploaded file is complete and the sizes match. It isn't essential, so I won't paste it here.
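The redis-based verification isn't shown, but purely as an illustration of the idea (all names below are hypothetical, not from the actual code), a completeness check could confirm that every part arrived and that the recorded part sizes sum to totalSize:

```java
import java.util.Map;

public class PartCheckSketch {

    // Hypothetical: partSizes maps currentPart -> currentSize as recorded per upload.
    static boolean isComplete(Map<Integer, Long> partSizes, long totalPart, long totalSize) {
        if (partSizes.size() != totalPart) {
            return false; // some parts never arrived
        }
        long sum = partSizes.values().stream().mapToLong(Long::longValue).sum();
        return sum == totalSize; // the byte counts must add up exactly
    }

    public static void main(String[] args) {
        Map<Integer, Long> parts = Map.of(1, 10L, 2, 10L, 3, 5L);
        System.out.println(isComplete(parts, 3, 25)); // true
        System.out.println(isComplete(parts, 3, 26)); // false
    }
}
```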
One known issue: the target path is the same for every upload, so the same file cannot be uploaded twice at the same time without corrupting the result. I'll address that later.