Unable to extract data from span itemprop

Problem description

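I have the following code to extract some price and availability data from a web page, but I get an "Object required" error at this line:

Set price = ie.Document.querySelector(".price-cont .final-price")

Why?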

Sub getMetaDataInfo()
Dim ie As New InternetExplorer
Dim mylink As String
Dim wb As Workbook: Set wb = ThisWorkbook
Dim wks As Worksheet
Dim lastrow As Integer
Set wks = wb.Sheets("Info")
Dim i As Integer
lastrow = wks.Cells(Rows.Count, "B").End(xlUp).Row

For i = 2 To lastrow

mylink = wks.Cells([i], 2).Value   

ie.Visible = False
ie.Navigate mylink

Do
DoEvents
Loop Until ie.ReadyState = READYSTATE_COMPLETE

Dim price As Object, availability As Object

Set price = ie.Document.querySelector(".price-cont .price")
wks.Cells(i, "C").Value = price.innerText   

Set availability = ie.Document.querySelector(".inner-box-one .availability")
wks.Cells(i, "D").Value = availability.innerText   

Next i

End Sub

I tried inserting a delay, as shown below:

Sub getMetaDataInfo()
Dim IE As New InternetExplorer

Dim mylink As String
Dim wb As Workbook: Set wb = ThisWorkbook
Dim wks As Worksheet
Dim lastrow As Integer
Set wks = wb.Sheets("Info")
Dim i As Integer

lastrow = wks.Cells(Rows.Count, "B").End(xlUp).Row

IE.Visible = True


For i = 2 To lastrow

mylink = wks.Cells(i, 2).Value

IE.Visible = False
IE.Navigate mylink


Dim price As Object, t As Date
Const MAX_WAIT_SEC As Long = 5

Dim price As Object, availability As Object

While IE.Busy Or IE.ReadyState < 4: DoEvents: Wend
t = Timer
Do
    DoEvents
    On Error Resume Next

    Set price = IE.Document.querySelector(".price-cont .final-price")
    wks.Cells(i, "C").Value = price.innerText

    If Timer - t > MAX_WAIT_SEC Then Exit Do
    On Error GoTo 0
Loop
If price Is Nothing Then Exit Sub


Next i

End Sub

In my case, I first log in to the web page manually and keep the IE window open, then I go to Excel and run the macro, but..

Tags: html, excel, vba, web-scraping

Solution


It is hard to say without seeing the HTML/URL. Have you verified that the selector is correct?
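One quick way to check a selector is to count how many nodes it matches once the page has loaded. The sketch below is only an illustration: `DebugSelector` is a hypothetical helper name, not something from the question, and it assumes `ie` is an already-navigated InternetExplorer instance whose document mode supports `querySelectorAll`.

```vba
' Hypothetical helper (not part of the original post): prints how many
' nodes in the currently loaded document match a CSS selector.
Public Sub DebugSelector(ByVal ie As Object, ByVal selector As String)
    Dim nodes As Object
    Set nodes = ie.document.querySelectorAll(selector)
    ' Zero matches means the selector is wrong for this page,
    ' or the content had not been rendered yet when this ran.
    Debug.Print selector & " -> " & nodes.Length & " match(es)"
End Sub
```

Calling `DebugSelector ie, ".price-cont .final-price"` and then `DebugSelector ie, ".price-cont .price"` from the Immediate window would show which of the two selectors used in the question actually matches anything.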

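Otherwise, the two main things you can do right now, both of which are about giving the page enough time to load, are:

1) Add a proper wait before attempting the selection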

While ie.Busy Or ie.readyState < 4: DoEvents: Wend

2) Try a timed loop to allow more time for loading

Option Explicit
Public Sub LoopUntilSet()
    Dim price As Object, t As Single   'Timer returns a Single
    Const MAX_WAIT_SEC As Long = 5

    'your other code

    While ie.Busy Or ie.readyState < 4: DoEvents: Wend
    t = Timer
    Do
        DoEvents
        On Error Resume Next
        Set price = ie.document.querySelector(".price-cont .price")
        On Error GoTo 0   'restore normal error handling before any exit
        If Timer - t > MAX_WAIT_SEC Then Exit Do
    Loop While price Is Nothing   'stop as soon as the element is found
    If price Is Nothing Then Exit Sub

    'other code.....
End Sub

3) Remove the [] from around i, i.e. write wks.Cells(i, 2).Value instead of wks.Cells([i], 2).Value
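The three suggestions above can be combined into one routine. What follows is a minimal sketch under stated assumptions, not a tested fix for the asker's page: `WaitForElement` and `GetPriceInfo` are hypothetical names, the sheet name and selectors are copied from the question's first listing and remain unverified, and IE is late-bound through `CreateObject` so no library reference is needed.

```vba
Option Explicit

' Hypothetical helper: polls for a CSS selector until it matches or
' timeoutSec seconds have elapsed. Returns Nothing on timeout.
' (Ignores the case where Timer wraps around at midnight.)
Public Function WaitForElement(ByVal ie As Object, ByVal selector As String, _
                               Optional ByVal timeoutSec As Single = 5) As Object
    Dim startTime As Single, el As Object
    startTime = Timer
    Do
        DoEvents
        On Error Resume Next            ' the document may not be available yet
        Set el = ie.document.querySelector(selector)
        On Error GoTo 0
        If Not el Is Nothing Then Exit Do
    Loop While Timer - startTime < timeoutSec
    Set WaitForElement = el
End Function

Public Sub GetPriceInfo()
    Dim ie As Object, wks As Worksheet
    Dim lastRow As Long, i As Long
    Dim price As Object, availability As Object

    Set ie = CreateObject("InternetExplorer.Application")
    Set wks = ThisWorkbook.Sheets("Info")           ' sheet name from the question
    lastRow = wks.Cells(wks.Rows.Count, "B").End(xlUp).Row
    ie.Visible = False

    For i = 2 To lastRow
        ie.Navigate wks.Cells(i, 2).Value           ' plain i, no square brackets
        While ie.Busy Or ie.readyState < 4: DoEvents: Wend

        ' Selectors are taken from the question and are assumptions here.
        Set price = WaitForElement(ie, ".price-cont .price")
        If Not price Is Nothing Then wks.Cells(i, "C").Value = price.innerText

        Set availability = WaitForElement(ie, ".inner-box-one .availability")
        If Not availability Is Nothing Then wks.Cells(i, "D").Value = availability.innerText
    Next i

    ie.Quit
End Sub
```

Testing each element for `Nothing` before reading `innerText` is what avoids the "Object required" error: a row whose element never appears is left blank instead of stopping the macro.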

