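Preface: the requirement behind this investigation was to upload files from the front end directly to Amazon S3. The video files are typically more than ten GB each, the customer's network is poor (throughput in the KB/s range), and uploads still have to be reliable: they must resume after a network failure, and dragging in the same file again after an interruption should continue the upload automatically. Taking all of this into account, I chose the multipart upload feature of AWS S3 and built resume-after-interruption and retry-after-disconnect on top of it at the application level.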
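This article mainly draws on a post from the Amazon Web Services Chinese blog:
客户端直连S3实现断点续传思路与实践 (Client-side direct connection to S3: ideas and practice for resumable uploads)
and on the official AWS API documentation (in English):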
Class: AWS.S3 — AWS SDK for JavaScript
First, install aws-sdk with npm:
npm i aws-sdk
Then import the SDK in the module that uses it:
import aws from "aws-sdk";
const secret = await getSecret(); // call the back-end API to get the upload configuration
// configuration needed to talk to AWS
const config = {
  accessKeyId: secret.ak,
  secretAccessKey: secret.sk,
  region: secret.regions,
};
// apply the credentials
aws.config.update(config);
const awsClient = new aws.S3();
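The request parameters used in the rest of this article are:
Bucket: supplied by the back end; the bucket name, roughly comparable to a top-level folder
Key: usually the name of the uploaded file, file.name
Body: the file itself, file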
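Use createMultipartUpload to create a multipart upload task, then use uploadPart to upload each part. Once every part has been uploaded, call completeMultipartUpload yourself so that S3 assembles the parts into the final object.
createMultipartUpload creates the multipart upload task and returns the ID of this upload. Example: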
// Initialize the multipart upload
async function initMultiPartUpload(awsClient: any, params: any) {
  const result = await awsClient.createMultipartUpload(params).promise();
  return result.UploadId;
}
Before calling uploadPart, the file has to be sliced into parts by hand. Example:
const PartSize = 10 * 1024 * 1024; // 10 MB
async function awsUploadPart(
  fileState: FileState,
  file: File,
  uploadId: string,
  key: string,
  awsClient: any,
  sk: any
) {
  // Parts finished so far; completeMultipartUpload needs each PartNumber and ETag
  const completeParts: { PartNumber: number; ETag: string }[] = [];
  const count = Math.ceil(file.size / PartSize);
  const uploadPromises: Promise<void>[] = [];
  for (let n = 1; n <= count; n++) {
    const start = (n - 1) * PartSize;
    const end = Math.min(start + PartSize, file.size); // exclusive, as Blob.slice expects
    const uploadPromise = awsClient
      .uploadPart({
        Bucket: sk.bucketName,
        Key: key,
        UploadId: uploadId,
        PartNumber: n,
        Body: file.slice(start, end),
      })
      .promise()
      .then((data: any) => {
        // In the browser, data.ETag is only readable if the bucket's CORS configuration exposes the ETag header
        completeParts.push({ PartNumber: n, ETag: data.ETag });
        fileState.percent = Math.floor((completeParts.length * 100) / count);
      })
      .catch((err: any) => {
        fileState.status = FileStatus.fail;
        throw err;
      });
    uploadPromises.push(uploadPromise);
  }
  // Once every part has been uploaded, merge them manually
  Promise.all(uploadPromises)
    .then(() => {
      completeMultiUpload(uploadId, completeParts, fileState, key, awsClient, sk);
    })
    .catch(() => {
      fileState.status = FileStatus.fail;
    });
}
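The slicing arithmetic used above can be isolated into a small pure function, which makes the boundaries easy to check without touching S3. This is only a sketch: planParts and PartRange are hypothetical names that do not appear in the original code. It is also worth keeping in mind that S3 accepts at most 10,000 parts per upload and requires every part except the last to be at least 5 MiB, so a fixed 10 MB part size caps the file at a little under 100 GB.

```typescript
interface PartRange {
  partNumber: number; // 1-based, as S3 expects
  start: number;      // inclusive byte offset
  end: number;        // exclusive byte offset, ready for Blob.slice(start, end)
}

// Hypothetical helper: split `fileSize` bytes into consecutive ranges of `partSize` bytes.
function planParts(fileSize: number, partSize: number): PartRange[] {
  if (fileSize <= 0 || partSize <= 0) return [];
  const count = Math.ceil(fileSize / partSize);
  const ranges: PartRange[] = [];
  for (let n = 1; n <= count; n++) {
    const start = (n - 1) * partSize;
    // The last part is simply whatever is left over
    ranges.push({ partNumber: n, start, end: Math.min(start + partSize, fileSize) });
  }
  return ranges;
}

const MB = 1024 * 1024;
const plan = planParts(25 * MB, 10 * MB);
console.log(plan.length);                 // 3
console.log(plan[2].end - plan[2].start); // 5242880 (the 5 MB remainder)
```

Each range maps directly onto file.slice(range.start, range.end) and PartNumber: range.partNumber in the uploadPart call.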
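Because the promises run asynchronously, parts may finish in a different order from the one in which they were dispatched, whereas completeMultipartUpload requires the list of uploaded parts to be in ascending PartNumber order. The code below therefore passes the sorted newParts.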
// Ask AWS to merge the parts and finish the upload
function completeMultiUpload(
  uploadId: any,
  parts: any,
  fileState: FileState,
  key: any,
  awsClient: any,
  sk: any
) {
  // Sort the part records by PartNumber in ascending order
  const newParts = parts.sort((a: any, b: any) => a.PartNumber - b.PartNumber);
  awsClient
    .completeMultipartUpload({
      Bucket: sk.bucketName,
      UploadId: uploadId,
      MultipartUpload: {
        Parts: newParts,
      },
      Key: key,
    })
    .promise()
    .then(() => {
      fileState.status = FileStatus.success;
      fileState.percent = 100;
    })
    .catch(() => {
      fileState.status = FileStatus.fail;
    });
}
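One detail about the sort above: Array.prototype.sort reorders the array in place, so the caller's parts array is changed as well. If that side effect is unwanted, a non-mutating variant is easy to write. The sketch below is illustrative only; CompletedPart and toOrderedParts are hypothetical names, not part of the original code.

```typescript
interface CompletedPart {
  PartNumber: number;
  ETag: string;
}

// Hypothetical helper: return a new array in ascending PartNumber order,
// leaving the input array untouched.
function toOrderedParts(parts: CompletedPart[]): CompletedPart[] {
  return [...parts].sort((a, b) => a.PartNumber - b.PartNumber);
}

// Parts recorded in completion order, which differs from dispatch order
const finished: CompletedPart[] = [
  { PartNumber: 3, ETag: '"c"' },
  { PartNumber: 1, ETag: '"a"' },
  { PartNumber: 2, ETag: '"b"' },
];
const ordered = toOrderedParts(finished);
console.log(ordered.map((p) => p.PartNumber)); // [ 1, 2, 3 ]
console.log(finished[0].PartNumber);           // 3 (input order preserved)
```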
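AWS S3 does not provide resumable upload as a ready-made feature, so it has to be implemented at the application level. When an upload starts, use the file's key to ask AWS for earlier upload records: listMultipartUploads finds unfinished multipart uploads, listParts returns the parts that were already uploaded, and the remaining parts are then uploaded with the multipart methods shown above.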
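// Look up whether the current file has a checkpoint from an earlier upload
async function getAwsCheckpoint(
  key: string,
  awsClient: any,
  sk: any
): Promise<{ uploadId: string; partsInfo?: { Parts: any[] } }> {
  let uploadId = "";
  let partsInfo;
  try {
    const result = await awsClient
      .listMultipartUploads({
        Bucket: sk.bucketName,
        Prefix: key,
      })
      .promise();
    // Prefix also matches longer keys, so keep only uploads for exactly this key
    const uploads = (result.Uploads || []).filter((u: any) => u.Key === key);
    if (uploads.length) {
      // Uploads for one key are listed in ascending initiation time, so the last one is the most recent
      uploadId = uploads[uploads.length - 1].UploadId;
      // Fetch the part details; listParts returns at most 1,000 parts per call, so follow the marker
      const Parts: any[] = [];
      let marker: number | undefined;
      let page;
      do {
        page = await awsClient
          .listParts({
            Bucket: sk.bucketName,
            Key: key,
            UploadId: uploadId,
            PartNumberMarker: marker,
          })
          .promise();
        Parts.push(...(page.Parts || []));
        marker = page.NextPartNumberMarker;
      } while (page.IsTruncated);
      partsInfo = { Parts };
    }
  } catch (err: any) {
    console.log(err);
  }
  return { uploadId, partsInfo };
}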
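Choosing which unfinished upload to resume is the part of this step that is easiest to get wrong, because a Prefix query can return uploads for other keys that merely start with the same characters. The sketch below picks the most recent upload for an exact key. UploadSummary and pickLatestUpload are hypothetical names, and the three fields are assumed to mirror the Key, UploadId and Initiated fields of the entries returned by listMultipartUploads.

```typescript
interface UploadSummary {
  Key: string;
  UploadId: string;
  Initiated: Date;
}

// Hypothetical helper: among unfinished uploads, return the most recently
// initiated one whose key matches exactly, or undefined if there is none.
function pickLatestUpload(uploads: UploadSummary[], key: string): UploadSummary | undefined {
  let latest: UploadSummary | undefined;
  for (const u of uploads) {
    if (u.Key !== key) continue; // skip keys that only share the prefix
    if (!latest || u.Initiated.getTime() > latest.Initiated.getTime()) {
      latest = u;
    }
  }
  return latest;
}

const sampleUploads: UploadSummary[] = [
  { Key: "video.mp4", UploadId: "u1", Initiated: new Date("2024-01-01T00:00:00Z") },
  { Key: "video.mp4.bak", UploadId: "u3", Initiated: new Date("2024-03-01T00:00:00Z") },
  { Key: "video.mp4", UploadId: "u2", Initiated: new Date("2024-02-01T00:00:00Z") },
];
console.log(pickLatestUpload(sampleUploads, "video.mp4")?.UploadId); // u2
```

Comparing timestamps explicitly means the result does not depend on the order in which the service happens to list the uploads.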
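A question to consider: if the file has already been uploaded in full, is there any need to call listMultipartUploads and listParts at all?
There is not. Use the headObject method provided by AWS to check first whether the object exists: if the file has not been uploaded, headObject fails with an error; otherwise the upload is already complete. This works because an object only appears in the bucket after the multipart upload has been completed; a file that was never uploaded, or that is still missing parts, cannot be found.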
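One caveat with this check is that headObject can also fail for reasons unrelated to a missing object, such as a dropped connection or missing permissions, and treating those as "not uploaded" would start the upload logic needlessly. The sketch below separates the two cases. It assumes, and this is an assumption worth verifying against the SDK version in use, that errors from aws-sdk v2 carry a code and a statusCode field and that a missing object is reported as NotFound with status 404. SdkErrorLike and isNotFound are hypothetical names.

```typescript
// Assumed shape of an aws-sdk v2 error; only the two fields used here are listed
interface SdkErrorLike {
  code?: string;
  statusCode?: number;
}

// Hypothetical helper: true only when the error says the object does not exist
function isNotFound(err: SdkErrorLike): boolean {
  return err.code === "NotFound" || err.statusCode === 404;
}

console.log(isNotFound({ code: "NotFound", statusCode: 404 })); // true
console.log(isNotFound({ code: "NetworkingError" }));           // false
console.log(isNotFound({ code: "Forbidden", statusCode: 403 })); // false
```

With such a check, a network error can be routed to the retry path instead of being mistaken for a file that has never been uploaded.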
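The overall logic is therefore as follows:
headObject reports that the file already exists: set the upload progress to 100%
The file is partly uploaded: filter out the parts that are already uploaded (if the same file was uploaded several times, use the most recent upload record) and upload only the missing parts
The file has never been uploaded: initialize the multipart upload, then upload the parts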
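The three branches listed above can be sketched as a pure decision function, which keeps the branching testable separately from the network calls. This is an illustration under assumptions: UploadPlan and decideUpload are hypothetical names, and the inputs stand in for the results of headObject, listMultipartUploads and listParts.

```typescript
type UploadPlan =
  | { action: "done" }
  | { action: "resume"; uploadId: string; missing: number[] }
  | { action: "start" };

// Hypothetical helper: decide what to do for a file split into `partCount` parts.
function decideUpload(
  objectExists: boolean,          // did headObject find the object?
  uploadId: string,               // "" when no unfinished upload was found
  uploadedPartNumbers: number[],  // PartNumber values reported by listParts
  partCount: number
): UploadPlan {
  if (objectExists) return { action: "done" };
  if (!uploadId) return { action: "start" };
  const uploaded = new Set(uploadedPartNumbers);
  const missing: number[] = [];
  for (let n = 1; n <= partCount; n++) {
    if (!uploaded.has(n)) missing.push(n);
  }
  return { action: "resume", uploadId, missing };
}

console.log(decideUpload(true, "", [], 4).action);   // done
console.log(decideUpload(false, "", [], 4).action);  // start
console.log(decideUpload(false, "abc", [1, 3], 4));  // action "resume", missing [ 2, 4 ]
```

A resume plan with an empty missing list means every part is present and only the final completeMultipartUpload call is outstanding.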
async function awsRequest(
  fileState: FileState,
  file: any,
  key: string,
) {
  const secret = await getSecret();
  const config = {
    accessKeyId: secret.ak,
    secretAccessKey: secret.sk,
    region: secret.regions,
  };
  // apply the credentials
  aws.config.update(config);
  const awsClient = new aws.S3();
  const params = {
    Bucket: secret.bucketName,
    Key: key,
  };
  // Check whether the file has already been uploaded
  let exists = true;
  try {
    await awsClient.headObject(params).promise();
  } catch (err: any) {
    // headObject fails when the object has not been uploaded completely
    exists = false;
  }
  if (exists) {
    // The object exists, so the upload already succeeded
    fileState.percent = 100;
    fileState.status = FileStatus.success;
    return;
  }
  try {
    // Check whether the file has been partly uploaded
    const { uploadId, partsInfo } = await getAwsCheckpoint(
      key,
      awsClient,
      secret
    );
    if (uploadId) {
      // Resume from the checkpoint
      await awsUploadPart(
        fileState,
        file,
        uploadId,
        partsInfo ? partsInfo.Parts : [],
        key,
        awsClient,
        secret
      );
    } else {
      // Start a new multipart upload
      const newUploadId = await initMultiPartUpload(awsClient, params);
      await awsUploadPart(
        fileState,
        file,
        newUploadId,
        [],
        key,
        awsClient,
        secret
      );
    }
  } catch (err: any) {
    console.log(err);
    fileState.status = FileStatus.fail;
  }
}
Extend the multipart upload function so that it accepts the parts that have already been uploaded:
async function awsUploadPart(
  fileState: FileState,
  file: File,
  uploadId: string,
  parts: any,
  key: string,
  awsClient: any,
  sk: any
) {
  // Parts that are already complete
  const completeParts = parts.map((_: any) => {
    return { PartNumber: _.PartNumber, ETag: _.ETag };
  });
  const count = Math.ceil(file.size / PartSize);
  const partNumbers = parts.map((_: any) => _.PartNumber);
  if (partNumbers.length) {
    fileState.status = FileStatus.processing;
    fileState.percent = Math.floor((completeParts.length * 100) / count);
  }
  // startTime is a module-level variable declared elsewhere in the business code
  if (!startTime) {
    startTime = new Date();
  }
  const uploadPromises: Promise<void>[] = [];
  for (let n = 1; n <= count; n++) {
    const start = (n - 1) * PartSize;
    const end = Math.min(start + PartSize, file.size); // exclusive, as Blob.slice expects
    // Skip the parts that were uploaded in an earlier session
    if (!partNumbers.includes(n)) {
      const uploadPromise = awsClient
        .uploadPart({
          Bucket: sk.bucketName,
          Key: key,
          UploadId: uploadId,
          PartNumber: n,
          Body: file.slice(start, end),
        })
        .promise()
        .then((data: any) => {
          completeParts.push({ PartNumber: n, ETag: data.ETag });
          fileState.percent = Math.floor((completeParts.length * 100) / count);
        })
        .catch((err: any) => {
          fileState.status = FileStatus.fail;
          throw err;
        });
      uploadPromises.push(uploadPromise);
    }
  }
  // Once every part has been uploaded, merge them manually
  Promise.all(uploadPromises)
    .then(() => {
      completeMultiUpload(uploadId, completeParts, fileState, key, awsClient, sk);
    })
    .catch(() => {
      fileState.status = FileStatus.fail;
    });
}
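The loop above dispatches every missing part at once, which for a file of more than ten GB means upwards of a thousand simultaneous requests competing for a slow connection. One common alternative, which is not what the original code does, is to cap the number of requests in flight. The sketch below shows a small promise pool; runWithLimit is a hypothetical name, and each task stands in for a function that performs one uploadPart call.

```typescript
// Hypothetical helper: run the tasks with at most `limit` of them in flight,
// returning the results in the same order as the tasks.
async function runWithLimit<T>(tasks: Array<() => Promise<T>>, limit: number): Promise<T[]> {
  const results: T[] = new Array<T>(tasks.length);
  let next = 0; // index of the next task that has not been started yet
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }
  // Start `limit` workers; each one pulls a new task as soon as it finishes the previous one
  const workerCount = Math.max(1, Math.min(limit, tasks.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// Demo with fake tasks that record how many of them run at the same time
let active = 0;
let peak = 0;
const demoTasks = [1, 2, 3, 4, 5].map((n) => async () => {
  active++;
  peak = Math.max(peak, active);
  await new Promise<void>((resolve) => setTimeout(resolve, 5));
  active--;
  return n * 2;
});
runWithLimit(demoTasks, 2).then((out) => {
  console.log(out);  // [ 2, 4, 6, 8, 10 ]
  console.log(peak); // 2
});
```

If any task rejects, the returned promise rejects as well, so the existing failure handling around Promise.all would still apply.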
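awsRequest takes care of resuming a file from its checkpoint, but how is the upload restarted after the network has dropped and come back?
One option is a timer that polls the network state, watches for files whose FileStatus is fail, and re-runs the upload for them.
The network state is read from window.navigator.onLine, and the check runs on a fixed interval: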
// Start the polling timer
const startInterval = () => {
  timer.value = setInterval(async () => {
    // When the network is available, check for failed uploads and retry them
    if (window.navigator.onLine && failFlag) {
      startTime = null;
      await awsRequest(fileState, file, key);
    }
  }, 10 * 1000); // 10 seconds, in milliseconds
};
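The timer above restarts a whole file once it has been marked as failed. A complementary technique, which is not part of the original code, is to retry an individual request a few times with a growing delay before giving up, so that a brief network blip does not fail the file at all. The sketch below is a generic retry wrapper under that assumption; retryWithBackoff is a hypothetical name, and fn stands in for a single uploadPart call.

```typescript
// Hypothetical helper: call `fn` up to `maxAttempts` times, doubling the wait
// after each failure (baseDelayMs, 2 * baseDelayMs, 4 * baseDelayMs, ...).
async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  maxAttempts: number,
  baseDelayMs: number
): Promise<T> {
  let lastError: unknown = new Error("maxAttempts must be at least 1");
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts) {
        const delay = baseDelayMs * Math.pow(2, attempt - 1);
        await new Promise<void>((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  // Every attempt failed: surface the last error to the caller
  throw lastError;
}

// Demo: a fake request that fails twice and then succeeds
let calls = 0;
const flakyRequest = async (): Promise<string> => {
  calls++;
  if (calls < 3) throw new Error("network down");
  return "etag-123";
};
retryWithBackoff(flakyRequest, 5, 10).then((etag) => {
  console.log(etag);  // etag-123
  console.log(calls); // 3
});
```

Only when the wrapper finally throws would the file be marked as failed and left for the polling timer to pick up.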
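This article only walks through code extracted from the business logic to explain multipart upload, resuming, and retrying with AWS S3. Adapt and extend it to fit your own scenario.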