Macro throws a strange error when parsing content from JSON

Problem description

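I'm trying to scrape certain information from a webpage using an xmlhttp request. The information I'm interested in is JavaScript-encoded and loaded dynamically. However, it is available in the page source (CTRL + U).

When I extract that portion from the page source with a regular expression and pass it to JsonConverter, I get the following error: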

Run-time error '10001':
Error parsing JSON:
"text":{"payload":{"

I've tried:

Sub GrabRedfinInfo()
    Const siteLink$ = "https://www.redfin.com/TX/Austin/604-Amesbury-Ln-78752/unit-2/home/171045975"
    Dim HTML As HTMLDocument, Http As Object
    Dim jsonObject As Object, jsonStr As Object
    Dim itemStr As Variant, sResp As String

    Set HTML = New HTMLDocument
    Set Http = CreateObject("MSXML2.XMLHTTP")

    With Http
        .Open "Get", siteLink, False
        .setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.190 Safari/537.36"
        .send
        HTML.body.innerHTML = .responseText
        sResp = .responseText
    End With

    With CreateObject("VBScript.RegExp")
        .Global = True
        .Pattern = "reactServerState\.InitialContext = (.*);"
        .MultiLine = True
        Set jsonStr = .Execute(sResp)
    End With
    
    itemStr = jsonStr(0).submatches(0)
    
    Set jsonObject = JsonConverter.ParseJson(Replace(itemStr, "\", ""))
    MsgBox jsonObject("ReactServerAgent.cache")("dataCache")("/stingray/api/home/details/belowTheFold")("res")
End Sub

Expected output:

Active Under Contract
Active
Pending - Taking Backups
Active

The image below shows where these values are located:

https://imgur.com/qcksyZ4

Tags: json, vba, web-scraping, xmlhttprequest

Solution


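Instead, I would make the regular expression more restrictive so that it targets only the events array, which keeps the string manageable. I would also change the string replacement so that it replaces only the escaped quote (\") with a plain quote ("), rather than removing every backslash.

You then end up with the events timeline as an array (a Collection) that you can loop over.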
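
To see why the narrower regex and the targeted replacement matter, here is a minimal offline sketch. The sample string is hypothetical (it is not real Redfin output) and only assumes that the page embeds JSON inside a JavaScript string, with some values carrying a second level of escaping. The procedure name DemoTargetedUnescape is also made up for illustration.

```vba
Option Explicit

' Offline sketch: the sample text below is hypothetical, not real Redfin output.
Public Sub DemoTargetedUnescape()
    Dim sample As String, matches As Object, cleaned As String

    ' Escaped JSON as it might appear inside a JavaScript string literal.
    ' The "text" value carries a second level of escaping (\\\").
    sample = "{\""text\"":\""{\\\""payload\\\"":1}\""," & _
             "\""events\"":[{\""mlsDescription\"":\""Active\""}," & _
             "{\""mlsDescription\"":\""Pending\""}]}"

    ' Stripping every backslash flattens both escaping levels, so the
    ' nested quotes around payload end the "text" string too early.
    Debug.Print Replace$(sample, "\", vbNullString)

    ' Same pattern as in the answer: capture only the events array.
    With CreateObject("VBScript.RegExp")
        .Pattern = """events\\"".(\[.*?\])"
        Set matches = .Execute(sample)
    End With

    If matches.Count = 0 Then Exit Sub

    ' The captured array has a single escaping level, so \" -> " is enough.
    cleaned = Replace$(matches(0).SubMatches(0), "\" & Chr$(34), Chr$(34))
    Debug.Print cleaned
End Sub
```

The first line printed contains "text":"{"payload":1}", which resembles the fragment quoted in the error message; the second is a plain JSON array that a parser can read. Note that the lazy pattern \[.*?\] stops at the first closing bracket, so this approach assumes the events array contains no nested arrays.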


Code:

Option Explicit

Public Sub GrabRedfinInfo()
    Const siteLink$ = "https://www.redfin.com/TX/Austin/604-Amesbury-Ln-78752/unit-2/home/171045975"
    Dim HTML As HTMLDocument, Http As Object
    Dim jsonObject As Object, jsonStr As Object
    Dim itemStr As Variant, sResp As String

    Set HTML = New HTMLDocument
    Set Http = CreateObject("MSXML2.XMLHTTP")

    With Http
        .Open "Get", siteLink, False
        .setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.190 Safari/537.36"
        .send
        HTML.body.innerHTML = .responseText
        sResp = .responseText
    End With

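    With CreateObject("VBScript.RegExp")
        .Global = True
        ' Capture only the escaped events array: \"events\":[ ... ]
        .Pattern = """events\\"".(\[.*?\])"
        .MultiLine = True
        Set jsonStr = .Execute(sResp)
    End With
    
    ' Stop early if nothing matched (for example, if the page layout changed)
    If jsonStr.Count = 0 Then Exit Sub
    
    itemStr = jsonStr(0).SubMatches(0)
    
    ' Unescape \" to " so the captured text is valid JSON, then parse it
    Set jsonObject = JsonConverter.ParseJson(Replace$(itemStr, "\" & Chr$(34), Chr$(34))) 'Array (collection)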
    
    Dim evt As Object
    
    For Each evt In jsonObject
        Debug.Print evt("mlsDescription")
    Next

End Sub
  
