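Use case
Capture the PDF that a website downloads on click, rename it, and upload it to Tencent Cloud COS.
Tech stack
"puppeteer": "^19.7.2",
"egg": "^3.15.0", // the server is built with egg
File storage: Tencent Cloud COS
Core idea
Intercept the browser's download and save the file to a local directory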
const session = await substitutePage.target()
  .createCDPSession();
await session.send('Page.setDownloadBehavior', {
  behavior: 'allow',
  downloadPath, // directory where downloaded files are saved
});
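The value passed as `downloadPath` is best given as an absolute path to a directory that already exists. A minimal sketch of preparing a dedicated folder per download task, so that a watcher on it only sees files from that task. The helper name `prepareDownloadDir` and the `pdf-downloads` base folder are assumptions for illustration, not part of the original code:

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';
import { randomUUID } from 'node:crypto';

// Hypothetical helper: creates a fresh, empty directory for one download task
// and returns its absolute path.
function prepareDownloadDir(
  baseDir: string = path.join(os.tmpdir(), 'pdf-downloads'),
): string {
  const dir = path.resolve(baseDir, randomUUID());
  // recursive: true also creates the base folder if it is missing
  fs.mkdirSync(dir, { recursive: true });
  return dir;
}

// Usage (assumed): const downloadPath = prepareDownloadDir();
```

Using one folder per task is a design choice: it avoids mixing up files when several downloads run at the same time.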
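Start watching this folder before the download is saved; once a file appears, pick it up and upload it
The timer debounces the callback so that the watch function is not triggered repeatedly while the file is being written or renamed, which would otherwise leave dirty 0 KB source files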
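let timer: NodeJS.Timeout | null = null;
fs.watch(downloadPath, (_eventType, filename) => {
  if (timer !== null) {
    clearTimeout(timer);
  }
  timer = setTimeout(() => {
    // Ignore temporary download files; resolve only once a .pdf shows up
    // (filename can be null on some platforms, hence the optional chaining)
    if (filename?.endsWith('.pdf')) {
      resolve({
        filename,
      });
    }
  }, 500);
});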
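While a download is in progress, Chrome typically writes to a temporary file (commonly with a `.crdownload` suffix) and renames it when it finishes, which is why the check above only accepts names ending in `.pdf`. A small sketch of that filter as a standalone function. The name `isFinishedPdf` is hypothetical, and the case-insensitive match is an added assumption:

```typescript
// Hypothetical helper: decides whether a filename reported by fs.watch
// looks like a completed PDF rather than a temporary download file.
function isFinishedPdf(filename: string | null | undefined): boolean {
  if (!filename) {
    return false; // fs.watch does not report a filename on every platform
  }
  return filename.toLowerCase().endsWith('.pdf');
}

console.log(isFinishedPdf('report.pdf'));            // true
console.log(isFinishedPdf('report.pdf.crdownload')); // false
console.log(isFinishedPdf(null));                    // false
```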
Full code
const session = await substitutePage.target()
  .createCDPSession();
await session.send('Page.setDownloadBehavior', {
  behavior: 'allow',
  downloadPath, // directory where downloaded files are saved
});
// res holds the information about the downloaded file
const res = await this.downloadPdfHandler(substitutePage, downloadPath);
// filePath is the absolute path of the file on the local disk
const filePath = `${downloadPath}/${res.fileName}`;
// uploadFile wraps the COS upload; it is omitted here because it contains private keys
const pdfUriCode = await this.uploadFile(filePath, filePath);
const pdfUri = decodeURIComponent(pdfUriCode);
this.domainList = {
  pdfSize: res.pdfSize,
  pdfUri: pdfUri.substring(pdfUri.indexOf('root')),
};
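The last step decodes the URI returned by the upload and keeps only the part starting at `root`. A sketch of that step as a pure function, assuming the upload returns a URL-encoded address whose object key begins with `root`. The function name `toRootRelativeUri` and the sample URL are made up for illustration. When `root` is absent, `indexOf` returns -1 and `substring` treats that as 0, so the whole string comes back; the sketch only makes that case explicit:

```typescript
// Hypothetical helper mirroring the decode + substring step.
function toRootRelativeUri(encodedUri: string, marker: string = 'root'): string {
  const decoded = decodeURIComponent(encodedUri);
  const start = decoded.indexOf(marker);
  // No marker found: return the decoded string unchanged
  return start === -1 ? decoded : decoded.substring(start);
}

// Made-up sample value, not a real COS address
console.log(toRootRelativeUri('https://example.com/root/pdf/a%20b.pdf'));
// prints "root/pdf/a b.pdf"
```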
Implementation of the downloadPdfHandler function
downloadPdfHandler(page, downloadPath): Promise<any> {
  const uuidName = uuidv4();
  const fsWatchApi = () => {
    // Debounce so the file is not renamed before it has been fully written, which would leave a dirty file
    let timer: any = null;
    return new Promise<{ filename: string }>(resolve => {
      const watcher = fs.watch(downloadPath, (_eventType, filename) => {
        if (timer !== null) {
          clearTimeout(timer);
        }
        timer = setTimeout(() => {
          // Ignore temporary download files; filename can be null on some platforms
          if (filename?.endsWith('.pdf')) {
            watcher.close(); // stop watching once the PDF has arrived
            resolve({
              filename,
            });
          }
        }, 500);
      });
    });
  };
  function responseWatchApi() {
    return new Promise<void>(resolve => {
      page.on('response', async response => {
        // Check whether the response is application/octet-stream, which may carry the PDF (or whichever file type you expect)
        // The content-type header can be missing, hence the optional chaining
        if (response.headers()['content-type']?.startsWith('application/octet-stream')) {
          resolve();
        }
      });
    });
  }
  return new Promise(async (resolve, reject) => {
    try {
      const [ , { filename }] = await Promise.all([ responseWatchApi(), fsWatchApi() ]);
      const oldFilePath = path.join(downloadPath, filename);
      const newFilePath = path.join(downloadPath, `${uuidName}.pdf`);
      try {
        fs.renameSync(oldFilePath, newFilePath);
        this.logger.info(`File renamed successfully: ${uuidName}`);
      } catch (error) {
        // Log the failure and continue; the newest file is picked up below either way
        this.logger.error(`Failed to rename the file: ${uuidName}`, error);
      }
      await this.sleep(5 * 1000);
      const files = fs.readdirSync(downloadPath);
      // Build an array that stores each file name together with its mtime (last modified time) and size
      const filesWithMtime = files.map(file => {
        const filePath = path.join(downloadPath, file);
        const stats = fs.statSync(filePath);
        return { fileName: file, mtime: stats.mtime, size: stats.size };
      });
      const newestFile = filesWithMtime.sort((a, b) => b.mtime.getTime() - a.mtime.getTime())[0];
      this.logger.info('newestFile: %o', {
        newestFile,
      });
      resolve({
        pdfSize: newestFile.size,
        fileName: newestFile.fileName,
      });
    } catch (e) {
      reject(e);
    }
  });
}
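The step that picks the newest file by `mtime` can be isolated into a helper. A minimal sketch, assuming only `.pdf` files should be considered and that an empty folder should yield `undefined` instead of throwing. The name `newestPdfIn` is hypothetical and not part of the original handler:

```typescript
import * as fs from 'node:fs';
import * as path from 'node:path';

interface FileInfo {
  fileName: string;
  mtime: Date;
  size: number;
}

// Hypothetical helper: returns the most recently modified .pdf in dir,
// or undefined when the directory contains no PDF.
function newestPdfIn(dir: string): FileInfo | undefined {
  const candidates: FileInfo[] = fs.readdirSync(dir)
    .filter(name => name.toLowerCase().endsWith('.pdf'))
    .map(name => {
      const stats = fs.statSync(path.join(dir, name));
      return { fileName: name, mtime: stats.mtime, size: stats.size };
    });
  // Sort newest first; candidates is a fresh array, so sorting in place is safe
  candidates.sort((a, b) => b.mtime.getTime() - a.mtime.getTime());
  return candidates[0];
}
```

Filtering by extension keeps leftover temporary files or unrelated files in the folder from being mistaken for the download.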