How to scrape from nested divs using VBA

Problem description

[Image: HTML code from the website]

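Hi all,

I'm trying to write a web scraper that collects information from a website and eventually downloads the files hosted there (although I don't think I'll tackle the download part here). The site is BIM360, an Autodesk site that hosts files and file-related information. Here is what I have so far: it opens the website, but then I get error 924. I think this is because of the nested divs shown in the picture. I've highlighted in the picture what I want to scrape first; after that I'd like the code to loop through the page and do the same for each document. Thanks in advance for your help.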

Sub GetIEValues()
    ' Needs a reference to "Microsoft Internet Controls" for the InternetExplorer types.
    Dim ie As InternetExplorer
    Dim lRow As Long

    ' Create the browser once; a second CreateObject call would discard this instance.
    Set ie = New InternetExplorerMedium

    With ie
        .Visible = True
        .navigate "https://docs.b360.eu.autodesk.com/projects/f0c33551-e503-4f82-afe8-de994c61b880/reviews"

        ' Fixed pause (presumably 16 seconds was intended) before checking the load state.
        Application.Wait (Now + TimeValue("0:00:16"))

        Do
            DoEvents
        Loop Until .readyState = READYSTATE_COMPLETE

        With ThisWorkbook.Worksheets(1)
            lRow = .Cells(.Rows.Count, "B").End(xlUp).Row + 1
            .Cells(lRow, "A").Value = Now()
            ' Inside this With block a leading dot refers to the worksheet,
            ' so the document must be qualified with ie.
            .Cells(lRow, "B").Value = ie.document.getElementsByClassName("EllipsisText MatrixTable__row-cell-text")(0).innerText
            .Cells(lRow, "C").Value = ie.document.getElementsByClassName("EllipsisText MatrixTable__row-cell-text")(1).innerText
            .Cells(lRow, "D").Value = ie.document.getElementsByClassName("EllipsisText MatrixTable__row-cell-text")(2).innerText
            .Cells(lRow, "E").Value = ie.document.getElementsByClassName("EllipsisText MatrixTable__row-cell-text")(3).innerText
            .Cells(lRow, "F").Value = ie.document.getElementsByClassName("EllipsisText MatrixTable__row-cell-text")(4).innerText
        End With
        .Quit
    End With

End Sub

Tags: html, vba, web-scraping, screen-scraping

Solution
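
One possible approach, offered as a minimal sketch rather than a tested fix for BIM360. In the question's code, `.document` sits inside `With ThisWorkbook.Worksheets(1)`, so it is resolved against the worksheet instead of the browser, which may explain the error more than the nested divs do. The sketch below qualifies the document with `ie`, polls until the cells exist (assuming the table is rendered by script after the page reports it has loaded), and loops over every matching element. The name `GetAllReviewRows`, the 30-second timeout, and the constant `CELLS_PER_ROW = 5` are assumptions for illustration; the five cells per document row is inferred only from the five columns the question writes.

```vba
Option Explicit

' Sketch: wait for the table cells to render, then write five cells per worksheet row.
Sub GetAllReviewRows()
    Const CELL_CLASS As String = "EllipsisText MatrixTable__row-cell-text"
    Const CELLS_PER_ROW As Long = 5      ' assumption: five text cells per document row

    Dim ie As Object, nodes As Object
    Dim i As Long, lRow As Long
    Dim deadline As Date

    Set ie = CreateObject("InternetExplorer.Application")
    ie.Visible = True
    ie.navigate "https://docs.b360.eu.autodesk.com/projects/f0c33551-e503-4f82-afe8-de994c61b880/reviews"

    ' Wait for the initial page load.
    Do While ie.Busy Or ie.readyState <> 4   ' 4 = READYSTATE_COMPLETE
        DoEvents
    Loop

    ' Poll for up to 30 seconds (arbitrary) until script-rendered cells appear.
    deadline = Now + TimeSerial(0, 0, 30)
    Do
        DoEvents
        Set nodes = ie.document.getElementsByClassName(CELL_CLASS)
    Loop Until nodes.Length > 0 Or Now > deadline

    With ThisWorkbook.Worksheets(1)
        lRow = .Cells(.Rows.Count, "B").End(xlUp).Row + 1
        For i = 0 To nodes.Length - 1
            ' Stamp column A once at the start of each output row.
            If i Mod CELLS_PER_ROW = 0 Then
                .Cells(lRow + (i \ CELLS_PER_ROW), "A").Value = Now
            End If
            ' Columns B..F receive the five cells of the current row.
            .Cells(lRow + (i \ CELLS_PER_ROW), 2 + (i Mod CELLS_PER_ROW)).Value = nodes.Item(i).innerText
        Next i
    End With

    ie.Quit
End Sub
```

If the polling loop times out, `nodes.Length` is 0 and nothing is written, which avoids indexing into an empty collection. If the site requires a login, the cells will not appear until the session is authenticated, so the timeout may need to be longer.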

