Getting table headers when using the XMLHTTP method

Problem description

I have code that fetches a table from this URL:

https://www.reuters.com/companies/AAPL.OQ/financials/income-statement-annual

The code works fine except for one thing: it retrieves the table but not the header row.

With http
    .Open "Get", sURL, False
    .send
    html.body.innerHTML = .responseText
End With

Set tbl = html.getElementsByTagName("Table")(0)

For Each rw In tbl.Rows
    r = r + 1: c = 1
    For Each cl In rw.Cells
        ws.Cells(r, c).Value = cl.innerText
        c = c + 1
    Next cl
Next rw

While inspecting the URL, I found the API URL behind it:

https://www.reuters.com/companies/api/getFetchCompanyFinancials/AAPL.OQ

How can I extract the "annual" data under "income" from the JSON response?

I tried to reference the part I want, but I get an error:

Const strUrl As String = "https://www.reuters.com/companies/api/getFetchCompanyFinancials/AAPL.OQ"

Sub Test()
Dim a, json As Object, colData As Collection, sFile As String, i As Long

With CreateObject("MSXML2.ServerXMLHTTP.6.0")
    .Open "GET", strUrl
    .send
    Set json = JSONConverter.ParseJson(.responseText)
End With


Set colData = json("market_data")("financial_statements")

Stop
End Sub

Tags: excel, vba, web-scraping, xmlhttprequest

Solution


Logic along these lines should work in VBA:

' Scripting.Dictionary needs a reference to Microsoft Scripting Runtime
Dim data As Scripting.Dictionary, key As Variant, block As Collection
Dim r As Long, c As Long, item As Object

' Dictionary of collections; the path matches the Python version below
Set data = json("market_data")("financial_statements")("income")("annual")

r = 1

For Each key In data.Keys
    Set block = data(key)   ' each block (section of info) is one row
    r = r + 1: c = 2
    With ActiveSheet
        .Cells(r, 1) = key  ' line-item name goes in column 1
        For Each item In block  ' loop over the columns in this block
            If r = 2 Then       ' write the dates once, as headers in row 1 from column 2
                .Cells(1, c) = item("date")
            End If
            .Cells(r, c) = item("value")  ' values go in row r from column 2
            c = c + 1
        Next item
    End With
Next key
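The loops above assume a particular nesting: dictionaries down to "annual", then a list of date/value records under each line item. Assigning a dictionary level to a variable declared as a Collection is a likely cause of an error like the one in the question. A small sketch for checking the shape before indexing into it follows. The `describe` helper and the `sample` payload are hypothetical and only mimic the path used above; the values are made up.

```python
def describe(node, path="json", depth=0, max_depth=8):
    """Yield (path, type name) pairs for a nested JSON-like structure."""
    yield path, type(node).__name__
    if depth >= max_depth:
        return
    if isinstance(node, dict):
        for key, value in node.items():
            yield from describe(value, f'{path}["{key}"]', depth + 1, max_depth)
    elif isinstance(node, list) and node:
        # Only the first element is inspected, assuming records in a list share a shape.
        yield from describe(node[0], f"{path}[0]", depth + 1, max_depth)


# Hypothetical sample shaped like the path used above; not real data.
sample = {
    "market_data": {
        "financial_statements": {
            "income": {
                "annual": {
                    "Revenue": [{"date": "2001-12-31", "value": "1"}],
                }
            }
        }
    }
}

for path, kind in describe(sample):
    print(f"{kind:5} {path}")
```

Each printed line shows whether a level is a dict (a Dictionary once parsed in VBA) or a list (a Collection), which tells you what type the receiving variable needs at that level.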

I can't test the VBA above, but if I write the (long-hand) Python equivalent, I get the same table:

import requests
import pandas as pd

url = 'https://www.reuters.com/companies/api/getFetchCompanyFinancials/AAPL.OQ'
payload = requests.get(url).json()
data = payload["market_data"]["financial_statements"]["income"]["annual"]  # dict of lists
rows = len(data) + 1                # one header row plus one row per key
columns = len(data["Revenue"]) + 1  # one label column plus one column per date
# object dtype so labels, dates, and values can share the same grid
df = pd.DataFrame([["" for _ in range(columns)] for _ in range(rows)], dtype=object)

r = 0
for key in data:
    block = data[key]  # each block (section of info) is one row
    r += 1
    c = 1
    for item in block:
        if r == 1:
            df.iloc[0, c] = item["date"]  # header row, filled once
        df.iloc[r, c] = item["value"]
        df.iloc[r, 0] = key
        c += 1
print(df)
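The long-hand loop mirrors the VBA cell by cell. If the end goal is just the table, a shorter sketch is to reshape the "annual" dictionary and let pandas line up the dates as column headers. The `annual_table` helper and the `sample_annual` data below are hypothetical; the sample only imitates the date/value record shape used above, with made-up values.

```python
import pandas as pd


def annual_table(annual):
    """Turn {line item: [{"date": ..., "value": ...}, ...]} into a table.

    Rows are line items, columns are dates.
    """
    rows = {
        item_name: {record["date"]: record["value"] for record in records}
        for item_name, records in annual.items()
    }
    return pd.DataFrame.from_dict(rows, orient="index")


# Hypothetical sample with the same record shape; not real figures.
sample_annual = {
    "Revenue": [
        {"date": "2001-12-31", "value": "2"},
        {"date": "2000-12-31", "value": "1"},
    ],
    "Net Income": [
        {"date": "2001-12-31", "value": "4"},
        {"date": "2000-12-31", "value": "3"},
    ],
}

table = annual_table(sample_annual)
print(table)
```

With the real response, the input would be the same `["market_data"]["financial_statements"]["income"]["annual"]` dictionary used in the loop version. A line item that lacks a given date would show up as NaN in that column rather than shifting the row.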
