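excel - Excel VBA: difference between IE and XMLHTTP
Problem description
I am scraping the following website, https://www2.asx.com.au/markets/trade-our-cash-market/overview/indices/real-time-indices, to retrieve the list of indices on the Australian stock market.
I am using the following code, which works and returns both the header and the table data.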
Sub GetIEAsx()
    Dim IE As New SHDocVw.InternetExplorer
    Dim HTMLDoc As MSHTML.HTMLDocument
    Dim HTMLDiv As MSHTML.IHTMLElement
    Dim HTMLTable As MSHTML.IHTMLElement
    Dim url As String

    url = "https://www2.asx.com.au/markets/trade-our-cash-market/overview/indices/real-time-indices"
    IE.Navigate url

    ' Wait until IE has finished loading (READYSTATE_COMPLETE = 4)
    Do While IE.Busy Or IE.ReadyState <> 4
        DoEvents
        Application.Wait DateAdd("s", 1, Now)
    Loop

    Set HTMLDoc = IE.document
    Set HTMLDiv = HTMLDoc.getElementById("realTimeIndicesWidget")
    Set HTMLTable = HTMLDiv.getElementsByTagName("table")(0)
    WriteTableToWorksheet HTMLTable
End Sub
Public Sub WriteTableToWorksheet(TableToProcess As MSHTML.IHTMLElement)
    Dim TableSection As MSHTML.IHTMLElement
    Dim TableRow As MSHTML.IHTMLElement
    Dim TableCell As MSHTML.IHTMLElement
    Dim rowNum As Long
    Dim colNum As Long
    Dim OutPutSheet As Worksheet

    rowNum = 0
    colNum = 0
    Set OutPutSheet = ThisWorkbook.Worksheets.Add

    ' Walk each table section (thead, tbody), then its rows and cells
    For Each TableSection In TableToProcess.Children
        For Each TableRow In TableSection.Children
            rowNum = rowNum + 1
            For Each TableCell In TableRow.Children
                colNum = colNum + 1
                OutPutSheet.Cells(rowNum, colNum) = TableCell.innerText
            Next TableCell
            colNum = 0
        Next TableRow
    Next TableSection
End Sub
But when I use XMLHTTP to scrape the site, I only get the header (thead) data, not the table body (tbody) data. Any help would be greatly appreciated.
Sub GetXmlAsx()
    Dim XMLRequest As New MSXML2.XMLHTTP60
    Dim HTMLDoc As New MSHTML.HTMLDocument
    Dim HTMLDiv As MSHTML.IHTMLElement
    Dim HTMLTable As MSHTML.IHTMLElement
    Dim url As String

    url = "https://www2.asx.com.au/markets/trade-our-cash-market/overview/indices/real-time-indices"
    With XMLRequest
        .Open "GET", url, False
        .send
    End With

    If XMLRequest.Status <> 200 Then
        MsgBox XMLRequest.Status & " - " & XMLRequest.statusText
        Exit Sub
    End If

    ' Load the raw response into an in-memory document and look up the table
    HTMLDoc.body.innerHTML = XMLRequest.responseText
    Set HTMLDiv = HTMLDoc.getElementById("realTimeIndicesWidget")
    Set HTMLTable = HTMLDiv.getElementsByTagName("table")(0)
    WriteTableToWorksheet HTMLTable
End Sub
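Solution
Loading the HTML via XHR will not load the values inside the tbody: XMLHTTP returns only the raw markup and does not run the page's scripts, which is presumably how the browser fills in the rows. You can, however, use XHR to fetch JSON containing the values from this link:
https://www.asx.com.au/asx/1/index-info?callback=processASXIndices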
Sub GetXmlAsx()
    Dim XMLRequest As New MSXML2.XMLHTTP60
    Dim url As String

    url = "https://www.asx.com.au/asx/1/index-info?callback=processASXIndices"
    With XMLRequest
        .Open "GET", url, False
        .send
    End With

    If XMLRequest.Status <> 200 Then
        MsgBox XMLRequest.Status & " - " & XMLRequest.statusText
        Exit Sub
    End If

    MsgBox XMLRequest.responseText
End Sub
To process the JSON, you can use this VBA module provided by Tim Hall on GitHub:
https://github.com/VBA-tools/VBA-JSON
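As a rough sketch of how the response might be handed to VBA-JSON: because the URL carries a callback parameter, the response may arrive wrapped as JSONP, e.g. processASXIndices({...}), rather than as bare JSON. That is an assumption, not something shown in this post. The helper StripJsonpWrapper and the procedure ParseAsxJson below are hypothetical names introduced for illustration. JsonConverter.ParseJson is the parsing entry point of VBA-JSON. The layout of the payload is not shown here, so the sketch only reports the top-level type and item count.

```vba
' Requires: JsonConverter.bas from VBA-JSON imported into the project, plus
' references to "Microsoft XML, v6.0" and "Microsoft Scripting Runtime".
Option Explicit

' Hypothetical helper: strips a JSONP wrapper such as callbackName({...});
' and returns the bare JSON. Text that already looks like JSON is returned as-is.
Public Function StripJsonpWrapper(ByVal responseText As String) As String
    Dim firstParen As Long
    Dim lastParen As Long
    Dim trimmed As String

    ' Trim$ removes leading and trailing spaces only
    trimmed = Trim$(responseText)

    ' Already plain JSON (starts with an object or an array)
    If Left$(trimmed, 1) = "{" Or Left$(trimmed, 1) = "[" Then
        StripJsonpWrapper = trimmed
        Exit Function
    End If

    ' Keep what lies between the first "(" and the last ")"
    firstParen = InStr(1, trimmed, "(")
    lastParen = InStrRev(trimmed, ")")
    If firstParen > 0 And lastParen > firstParen Then
        StripJsonpWrapper = Mid$(trimmed, firstParen + 1, lastParen - firstParen - 1)
    Else
        StripJsonpWrapper = trimmed
    End If
End Function

Public Sub ParseAsxJson()
    Dim XMLRequest As New MSXML2.XMLHTTP60
    Dim parsed As Object

    With XMLRequest
        .Open "GET", "https://www.asx.com.au/asx/1/index-info?callback=processASXIndices", False
        .send
    End With

    If XMLRequest.Status <> 200 Then
        MsgBox XMLRequest.Status & " - " & XMLRequest.statusText
        Exit Sub
    End If

    ' ParseJson returns a Dictionary for a JSON object, a Collection for a JSON array
    Set parsed = JsonConverter.ParseJson(StripJsonpWrapper(XMLRequest.responseText))

    ' The payload structure is an unknown here, so only inspect the top level
    Debug.Print TypeName(parsed), parsed.Count
End Sub
```

Once the actual keys are known (for example by inspecting the responseText shown by the MsgBox in the answer's code), the Dictionary or Collection can be looped over and written to a worksheet in the same way WriteTableToWorksheet does for table cells.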