How to scrape a web page with puppeteer

Question

If I go to https://investor.vanguard.com/mutual-funds/profile/VMMXX and run document.querySelector("[data-ng-if='productSummaryTitle']").innerText from the browser console, I get what I expect: Product summary.

But when I try to do the same thing with puppeteer, I get UnhandledPromiseRejectionWarning: Error: Evaluation failed: TypeError: Cannot read property 'innerText' of null at __puppeteer_evaluation_script__:3:83. What am I missing?

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch({ headless: false })
    const page = await browser.newPage()
    await page.goto('https://investor.vanguard.com/mutual-funds/profile/VMMXX')

    const result = await page.evaluate(() => {
        let myText = document.querySelector("[data-ng-if='productSummaryTitle']").innerText
        return {
            myText
        }
    })

    console.log(result)

    browser.close()
})()

Tags: javascript, node.js, puppeteer

Answer


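You can wait for that selector first. The error means document.querySelector returned null, so the element was not yet in the DOM when page.evaluate ran (the data-ng-if attribute suggests the page renders it with AngularJS after the initial load). page.waitForSelector resolves with a handle to the element once it exists: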

const element = await page.waitForSelector("[data-ng-if='productSummaryTitle']");
const text = await element.evaluate(el => el.innerText);
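
For context, here is a minimal sketch of how the two lines above might fit into the script from the question. The getInnerText helper, its timeoutMs parameter, and the choice to return null on a timeout are illustrative assumptions, not part of the original answer. The sketch also assumes that Puppeteer's timeout error has the name 'TimeoutError'.

```javascript
// Hypothetical helper (not from the original answer): wait for `selector`
// to appear, then return its innerText, or null if it never shows up.
async function getInnerText(page, selector, timeoutMs = 30000) {
    try {
        // waitForSelector resolves with an ElementHandle once the node is in the DOM.
        const element = await page.waitForSelector(selector, { timeout: timeoutMs });
        return await element.evaluate(el => el.innerText);
    } catch (err) {
        // Assumption: Puppeteer's timeout error carries the name 'TimeoutError'.
        if (err.name === 'TimeoutError') return null;
        throw err;
    }
}

async function main() {
    // Required lazily so the helper above can be used without launching a browser.
    const puppeteer = require('puppeteer');
    const browser = await puppeteer.launch({ headless: false });
    try {
        const page = await browser.newPage();
        await page.goto('https://investor.vanguard.com/mutual-funds/profile/VMMXX');
        const myText = await getInnerText(page, "[data-ng-if='productSummaryTitle']");
        console.log({ myText });
    } finally {
        // Close the browser even if navigation or the lookup throws.
        await browser.close();
    }
}

// To run against the live site: main().catch(console.error);
```

Closing the browser in a finally block keeps a failed lookup from leaving a Chromium process running. Whether the selector still matches depends on the site's current markup, which may have changed since the question was asked.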
