Looping through the elements in a list's LI tags

Question

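I'm trying to learn scraping. I have a section of HTML and need to loop over the elements of the list and get the information under each heading. There are 7 elements in the HTML: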

<ul class="a-unordered-list a-nostyle a-vertical"><li><span class="a-list-item"><span class="a-text-bold">Geschäftsname:</span>Anker Technology (UK) Ltd</span></li><li><span class="a-list-item"><span class="a-text-bold">Geschäftsart:</span>Ltd.</span></li><li><span class="a-list-item"><span class="a-text-bold">Handelsregisternummer:</span>8766135</span></li><li><span class="a-list-item"><span class="a-text-bold">UStID:</span>DE295307777</span></li><li><span class="a-list-item"><span class="a-text-bold">Telefonnummer:</span>+49 69 9579 7960</span></li><li><span class="a-list-item"><span class="a-text-bold">Kundendienstadresse:</span><ul class="a-unordered-list a-nostyle a-vertical"><li><span class="a-list-item">610 Nathan Road, Hollywood Commercial Center</span></li><li><span class="a-list-item">Room 1318-19</span></li><li><span class="a-list-item">Hong Kong</span></li><li><span class="a-list-item">Hong Kong</span></li><li><span class="a-list-item">999077</span></li><li><span class="a-list-item">HK</span></li></ul></span></li><li><span class="a-list-item"><span class="a-text-bold">Geschäftsadresse:</span><ul class="a-unordered-list a-nostyle a-vertical"><li><span class="a-list-item">Suite B, Fairgate House, 205 Kings Road, Tyseley,</span></li><li><span class="a-list-item">Birmingham</span></li><li><span class="a-list-item">B11 2AA</span></li><li><span class="a-list-item">GB</span></li></ul></span></li></ul>

I tried this:

post = mhtml.querySelectorAll(".a-list-item .a-text-bold")

This doesn't raise an error, but how do I loop over the elements of the returned object?

I tried lines like these:

For Each e In post
    Debug.Print e.innerText
Next e

but that throws an error.

When I print the innerHTML of post while debugging:

Debug.Print post.innerHTML

I get only "Gesch?ftsname:" in the Immediate window, although the CSS selector .a-list-item .a-text-bold returns 7 results when I test it against the HTML page.

Tags: excel, vba, css-selectors

Answer


querySelectorAll returns a nodeList, which you cannot loop over with For Each, likely because of a bug. Instead, traverse it by index using the .Length property:

Dim i As Long

For i = 0 To post.Length - 1
   Debug.Print post.item(i).innerText
Next
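
If you would still like For Each syntax, one possible workaround is to copy the nodes into a VBA Collection first. The following is only a sketch: NodeListToCollection is a hypothetical helper name, not part of MSHTML, and it assumes the argument is the object returned by querySelectorAll.

```vba
Option Explicit

' Hypothetical helper: copies each node of a querySelectorAll result
' into a Collection, which For Each can enumerate.
Public Function NodeListToCollection(ByVal nodes As Object) As Collection
    Dim result As Collection, i As Long

    Set result = New Collection

    ' Index-based traversal, as in the loop above
    For i = 0 To nodes.Length - 1
        result.Add nodes.item(i)
    Next i

    Set NodeListToCollection = result
End Function

' Usage sketch: post is assumed to hold the querySelectorAll result
Public Sub PrintHeaders(ByVal post As Object)
    Dim e As Object

    For Each e In NodeListToCollection(post)
        Debug.Print e.innerText
    Next e
End Sub
```

The copy costs one extra pass over the list, which is negligible for a handful of nodes like the 7 here.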

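If you want to retrieve the list of headers together with their associated values, you need to set an object reference to each header's NextSibling and test its nodeType: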

Option Explicit

Sub GetInfo()
    ' Requires references to "Microsoft XML, v6.0" and "Microsoft HTML Object Library"
    Dim http As MSXML2.XMLHTTP60, html As MSHTML.HTMLDocument, post As Object

    Set http = New msxml2.XMLHTTP60
    Set html = New MSHTML.HTMLDocument

    With http
        .Open "GET", "https://www.amazon.de/sp?_encoding=UTF8&asin=&isAmazonFulfilled=1&isCBA=&marketplaceID=A1PA6795UKMFR9&orderID=&seller=A2PGPJL0BBLHLX&tab=&vasStoreID=", False
        .setRequestHeader "User-Agent", "Mozilla/5.0"
        .send
        html.body.innerHTML = .responseText
    End With

    Set post = html.querySelectorAll(".a-list-item .a-text-bold")

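    Dim i As Long, sibling As Object, val As Variant

    For i = 0 To post.Length - 1
        Set sibling = post.item(i).NextSibling
        val = vbNullString                  ' reset so a previous value is never reprinted

        ' Trailing semicolon keeps the header and its value on one line
        Debug.Print post.item(i).innerText; " = ";

        If Not sibling Is Nothing Then
            Select Case sibling.NodeType
            Case 3                          ' text node: the value is the node's own text
                val = sibling.NodeValue
            Case 1                          ' element node (e.g. the nested <ul>): use its text
                val = sibling.innerText
            End Select
        End If

        Debug.Print val
    Next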

    Stop

End Sub
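
If the pairs are needed later rather than just printed, the same sibling test could fill a dictionary. This is a minimal sketch under stated assumptions: CollectPairs and PrintPairs are hypothetical names, the dictionary is a late-bound Scripting.Dictionary (so it assumes Windows), and headers is assumed to be the querySelectorAll result from GetInfo.

```vba
Option Explicit

' Hypothetical helper: maps each bold header's text to the text that follows it.
Public Function CollectPairs(ByVal headers As Object) As Object
    Dim dict As Object, i As Long, sibling As Object, val As String

    Set dict = CreateObject("Scripting.Dictionary")

    For i = 0 To headers.Length - 1
        Set sibling = headers.item(i).NextSibling
        val = vbNullString

        If Not sibling Is Nothing Then
            Select Case sibling.NodeType
            Case 3                      ' text node
                val = sibling.NodeValue
            Case 1                      ' element node, e.g. a nested <ul>
                val = sibling.innerText
            End Select
        End If

        ' Assigning through the key adds it, or overwrites a repeated header
        dict(headers.item(i).innerText) = Trim$(val)
    Next i

    Set CollectPairs = dict
End Function

' Usage sketch: prints every header with its value
Public Sub PrintPairs(ByVal headers As Object)
    Dim pairs As Object, k As Variant

    Set pairs = CollectPairs(headers)
    For Each k In pairs.Keys
        Debug.Print k, pairs(k)
    Next k
End Sub
```

A dictionary keeps only one value per key, so a page that repeats a header would keep the last one. Use a Collection of pairs instead if duplicates matter.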
