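javascript - Puppeteer: saving the return value of page.evaluate to CSV
Problem description
I currently have a Puppeteer script that scrapes a website by collecting an array of all the hrefs on a page, looping over each href, and then pulling data from each linked page.
Using page.evaluate, I am able to return all the values I need.
How do I write the returned values to a CSV?
Here is my script: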
const stealth = require('puppeteer-extra-plugin-stealth')();

(async () => {
  try {
    // ... (browser launch and initial navigation not shown) ...
    const hrefsCategoriesDeduped = new Set(await page.evaluate(
      () => Array.from(
        document.querySelectorAll('.b-shared-linked-pharmacy__box a[href].b-shared-linked-pharmacy__details'),
        a => a.href
      )
    ));
    let pages = [];
    console.log(hrefsCategoriesDeduped);
    for (const url of hrefsCategoriesDeduped) {
      await page.goto(url);
      // await page.waitFor(10000)
      let telusData = {
        name: "",
        address: "",
        city: "",
        province: "",
        postal: "",
        fax: "",
        pharmacyphone: "",
        pharmacyfax: "",
        pharmacyemail: "",
        pharmacywebsite: ""
      };
      await page.waitForSelector('.b-pharmacy-detail__details');
      //let pharmacycomplete = []
      telusData = await page.evaluate(() => {
        let pharmacyname = document.querySelector('h2[class="b-pharmacy-detail__name"]').innerText
        let pharmacystreet = Array.from(document.querySelectorAll(".b-pharmacy-detail__info"), element => element.textContent)
        let pharmacyphoneandfax = Array.from(document.querySelectorAll(".b-pharmacy-detail__contact"), element => element.textContent)
        var variable_names = {};
        for (var i = 0; i < pharmacystreet.length; i++) {
          variable_names['na_' + i] = pharmacystreet[i];
        }
        return {
          name: pharmacyname,
          address: pharmacyaddress,
          city1: city[1],
          province: province[1],
          postal: postal[1],
          pharmacyphone: pharmacyphone,
          pharmacyfax: pharmacyfax,
          pharmacyemail: pharmacyemail,
          pharmacywebsite: pharmacywebsite,
        }
      });
      console.log(telusData);
    }
    await browser.close();
  } catch (err) {
    console.error(err);
  }
})();
Right now I declare the variable telusData with all of the CSV headers, then run page.evaluate to get the value for each header. I return:
return {
  name: pharmacyname,
  address: pharmacyaddress,
  city1: city[1],
  province: province[1],
  postal: postal[1],
  pharmacyphone: pharmacyphone,
  pharmacyfax: pharmacyfax,
  pharmacyemail: pharmacyemail,
  pharmacywebsite: pharmacywebsite,
}
For each telusData, how do I save the values to a CSV together with the headers? Any help is greatly appreciated.
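Solution
I have trimmed your code down, since much of it is unrelated to your specific question. Hopefully this points you in the right direction.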
(async function main() {
  try {
    // create array to store all the data
    let telusDataArray = [];
    for (const url of hrefsCategoriesDeduped) {
      const telusData = await page.evaluate(() => {
        return {
          name: pharmacyname,
          address: pharmacyaddress,
          city1: city[1],
          province: province[1],
          postal: postal[1],
          pharmacyphone: pharmacyphone,
          pharmacyfax: pharmacyfax,
          pharmacyemail: pharmacyemail,
          pharmacywebsite: pharmacywebsite,
        }
      });
      telusDataArray.push(telusData); // add each url's data to the array
    }
    // at this point telusDataArray should be filled and ready for CSV generation
    await browser.close();
  } catch (err) {
    console.error(err);
  }
})();
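The snippet above stops once telusDataArray is filled, so here is one possible way to do the CSV step itself. This is only a minimal sketch using Node's built-in fs module, not the only approach (a dedicated CSV library would also work). The helper names escapeCsvField, toCsv, and writeCsv are made up for illustration, and the sketch assumes every record is a flat object with the same keys as the first one.

```javascript
const fs = require('fs');

// Quote a field when it contains a comma, double quote, or line break,
// doubling any embedded quotes (the usual CSV convention).
function escapeCsvField(value) {
  const text = value === null || value === undefined ? '' : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Turn an array of flat objects into CSV text.
// Assumption: the keys of the first object serve as the header row.
function toCsv(rows) {
  if (rows.length === 0) return '';
  const headers = Object.keys(rows[0]);
  const lines = [headers.map(escapeCsvField).join(',')];
  for (const row of rows) {
    lines.push(headers.map(h => escapeCsvField(row[h])).join(','));
  }
  return lines.join('\r\n') + '\r\n';
}

// Hypothetical helper: write all collected records to disk in one go.
function writeCsv(filePath, rows) {
  fs.writeFileSync(filePath, toCsv(rows), 'utf8');
}
```

With these in place, something like writeCsv('telus-data.csv', telusDataArray) right after the loop (the file name is just an example) should produce one header row followed by one row per URL. Building the whole string first keeps the code simple; for a very large number of pages, appending one row at a time may be a better fit.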