How to get a value from the page source with VBA without using the element's tag or ID

Question

I am trying to scrape a web page with VBA (https://www.booking.com/hotel/in/dream-catcher-home-stay.en-gb.html) and get the number on the "b_hotel_id" line of the source shown below:

<td class="line-number" value="568"></td>
<td class="line-content">b_hotel_id: '554615',</td>

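But I don't know how to reference it, because it has no ID or tag. This data is not displayed on the page; it is only visible in the source code.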

I tried to get the data with this VBA code:

Public Sub GetValueFromBrowser()
    Dim ie As Object
    Dim url As String
    Dim bkid As String

    url = "https://www.booking.com/hotel/in/dream-catcher-home-stay.en-gb.html"
    Set ie = CreateObject("InternetExplorer.Application")

    With ie
      .Visible = 0
      .navigate url
       While .Busy Or .readyState <> 4
         DoEvents
       Wend
    End With

    Dim Doc As HTMLDocument
    Set Doc = ie.document

    bkid = Trim(Doc.getElementsByName("b_hotel_id:")(0).Value)
    Range("A1").Value = bkid

End Sub

Could you please help?

Tags: html, excel, vba

Answer


Actually, I could not find the HTML code

<td class="line-number" value="568"></td>
<td class="line-content">b_hotel_id: '554615',</td>

at that URL:

url = "https://www.booking.com/hotel/in/dream-catcher-home-stay.en-gb.html"

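But you can extract the number by searching the text of the document head for the "b_hotel_id: '" marker, as follows: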

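' Continues from the question's code, after the page has finished loading.
' HTMLDocument requires a reference to the Microsoft HTML Object Library.
Dim Doc As HTMLDocument
Set Doc = ie.document

' Text that immediately precedes the ID in the page source
Const find_string As String = "b_hotel_id: '"
Dim find_in As String
find_in = Doc.head.innerText

' Position of the first character of the ID (just after the marker)
Dim id_start As Long
id_start = InStr(1, find_in, find_string) + Len(find_string)

' Position of the closing quote that ends the ID
Dim id_end As Long
id_end = InStr(id_start, find_in, "'")

' Show the characters between the marker and the closing quote
MsgBox Mid$(find_in, id_start, id_end - id_start)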
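
The snippet above assumes the marker is always present; if InStr returns 0, the computed positions no longer point at the ID. As a minimal sketch of one way to guard against that, the same InStr/Mid$ logic could be wrapped in a helper. ExtractBetween is a hypothetical name introduced here for illustration and is not part of the original answer.

```vba
Option Explicit

' Hypothetical helper (not from the original answer): returns the text found
' between startMarker and endMarker in sourceText, or "" if either is missing.
Public Function ExtractBetween(ByVal sourceText As String, _
                               ByVal startMarker As String, _
                               ByVal endMarker As String) As String
    Dim markerPos As Long
    markerPos = InStr(1, sourceText, startMarker, vbBinaryCompare)
    If markerPos = 0 Then Exit Function      ' start marker not found

    ' First character after the start marker
    Dim valueStart As Long
    valueStart = markerPos + Len(startMarker)

    Dim valueEnd As Long
    valueEnd = InStr(valueStart, sourceText, endMarker, vbBinaryCompare)
    If valueEnd = 0 Then Exit Function       ' end marker not found

    ExtractBetween = Mid$(sourceText, valueStart, valueEnd - valueStart)
End Function
```

With this in place, the last lines of the answer could probably be reduced to something like bkid = ExtractBetween(Doc.head.innerText, "b_hotel_id: '", "'"), followed by a check for an empty string before writing the result to the sheet.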
