PreviousSibling not returning a value with querySelector

Problem description

I am trying to extract two pieces of text from a local HTML file located at C:\Sample.html, using @QHarr's code from another, similar thread:

Sub Test()
Dim html As HTMLDocument, post As Object, i As Long

Set html = GetHTMLFileContent("C:\Sample.html")
Set post = html.querySelectorAll("span.course-player__chapter-item__completion")

For i = 0 To post.Length - 1
    ActiveSheet.Cells(i + 1, 1) = Trim(post.item(i).innerText)
    ActiveSheet.Cells(i + 1, 2) = post.item(i).PreviousSibling.innerText
Next i
End Sub

Function GetHTMLFileContent(ByVal filePath As String) As HTMLDocument
Dim fso As Object, hFile As Object, hString As String, html As HTMLDocument

Set html = New HTMLDocument
Set fso = CreateObject("Scripting.FileSystemObject")
Set hFile = fso.OpenTextFile(filePath)

Do Until hFile.AtEndOfStream
    hString = hFile.ReadAll()
Loop

html.body.innerHTML = hString
Set GetHTMLFileContent = html
End Function

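The code works fine and retrieves the element's inner text with post.item(i).innerText. However, when I try to get the inner text of the previous sibling with post.item(i).PreviousSibling.innerText, nothing is returned.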

Here is a snapshot of the HTML:


<div class="course-player__chapter-item__header _chapter-item__header_d57kmg ui-accordion-header ui-corner-top ui-state-default ui-accordion-icons ui-accordion-header-active ui-state-active" role="tab" id="ui-id-1" aria-controls="ui-id-2" aria-selected="true" aria-expanded="true" tabindex="0"><span class="ui-accordion-header-icon ui-icon ui-icon-triangle-1-s"></span>
  <h2 tabindex="-1" class="course-player__chapter-item__title _chapter-item__title_d57kmg">
    <span class="course-player__progress _chapter-item__progress_d57kmg">
      <span data-percentage-completion="100" class="_chapter-item__progress-ring_d57kmg">
        <span class="progress-ring__ring _progress-ring__ring_jgsecr">
	<span class="progress-ring__mask progress-ring--full _progress-ring__mask_jgsecr _progress-ring--full_jgsecr">
		<span class="progress-ring--fill brand-color__background _progress-ring--fill_jgsecr"></span>
	</span>
	<span class="progress-ring__mask progress-ring--half _progress-ring__mask_jgsecr ">
		<span class="progress-ring--fill brand-color__background _progress-ring--fill_jgsecr"></span>
		<span class="progress-ring--fill progress-ring--fix _progress-ring--fill_jgsecr _progress-ring--fix_jgsecr"></span>
	</span>
</span>
<span class="progress-ring__ring-inset _progress-ring__ring-inset_jgsecr"></span>
<span class="progress-ring__checkmark brand-color__text _progress-ring__checkmark_jgsecr"><i aria-label="Completed" class="toga-icon toga-icon-checkmark"></i></span>

      </span>
    </span>

    INTRO TO VBA - Overview

<!---->
    <span class="course-player__chapter-item__completion _chapter-item__completion_d57kmg">
      10 / 10
    </span>

    <span class="course-player__chapter-item__toggle _chapter-item__toggle_d57kmg">
      <i aria-hidden="true" class="chapter-item__toggle-icon toga-icon toga-icon-caret-stroke-down _chapter-item__toggle-icon_d57kmg"></i>
    </span>

  </h2>
</div>

Tags: html, excel, vba

Solution


I used a CSS selector that returns the whole heading text, h2[class='course-player__chapter-item__title _chapter-item__title_d57kmg'], and then split the output into two columns:

Sub Test()
    Dim x, html As HTMLDocument, post As Object, s As String, i As Long

    Set html = GetHTMLFileContent("C:\Sample.html")
    Set post = html.querySelectorAll("h2[class='course-player__chapter-item__title _chapter-item__title_d57kmg']")

    For i = 0 To post.Length - 1
        ' Split the heading text into space-separated tokens
        x = Split(Trim(post.item(i).innerText), " ")
        ' The last three tokens are the completion count (e.g. "10 / 10"); keep them in their original order
        s = Join(Array(x(UBound(x) - 2), x(UBound(x) - 1), x(UBound(x))), " ")
        ' Drop those three tokens so that only the chapter title remains
        ReDim Preserve x(0 To UBound(x) - 3)

        ActiveSheet.Cells(i + 1, 1) = Trim(Join(x, " "))
        ActiveSheet.Cells(i + 1, 2) = Trim(s)
    Next i
End Sub
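
A possible reason for the empty result, judging only from the HTML snapshot in the question: the node immediately before the completion span is not an element. The chapter title sits inside the h2 as bare text, followed by the <!----> comment. A text node exposes its content through nodeValue rather than innerText, and the comment carries no visible text. The sketch below assumes MSHTML exposes the title as a text node (nodeType 3) and walks backwards through the siblings until it finds non-empty text. PreviousText is a hypothetical helper name, not part of the original code, and it uses late binding (As Object) as an assumption to avoid interface mismatches.

```vba
' Hypothetical helper (not in the original code): walks backwards from a node
' and returns the trimmed content of the nearest non-empty text node.
' Returns an empty string when no such node exists.
Function PreviousText(ByVal node As Object) As String
    Dim sib As Object, txt As String

    Set sib = node.previousSibling
    Do While Not sib Is Nothing
        If sib.nodeType = 3 Then                  ' 3 = text node
            ' Normalise line breaks and tabs, because Trim only strips spaces
            txt = sib.nodeValue & ""
            txt = Replace(txt, vbCr, " ")
            txt = Replace(txt, vbLf, " ")
            txt = Replace(txt, vbTab, " ")
            txt = Trim$(txt)
            If Len(txt) > 0 Then
                PreviousText = txt
                Exit Function
            End If
        End If
        ' Skip comments, elements and whitespace-only text
        Set sib = sib.previousSibling
    Loop
End Function
```

With the span.course-player__chapter-item__completion selector from the question, the second column could then be filled with ActiveSheet.Cells(i + 1, 2) = PreviousText(post.item(i)). Unlike the split approach, this does not depend on the completion count always being three space-separated tokens, but it has not been checked against the real file.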
