Unable to extract data from JSON using json_normalize

Problem description

I am trying to extract data from a JSON API response, but for some reason it is not working.

My JSON response:

 {
    "_scroll_id": "DnF1ZXJ5VGhlbkZldGNoIAAAAAAGKVYSHXRpZXItMTpSQkhOeVIyWlR4R3JXOV9CXzFGa3VBAAAAAAF9VEkddGllci0xOm50QmsycWdoVEpLQWVMcmg4QjBQZEEAAAAABrMQFh10aWVyLTE6ZkNlMzFxVEZSZG1oRkl3UnlSRUFWdwAAAAAIEPKGHXRpZXItMTpacFFDQXN0NlFaLXc5NmpsNzZieEtBAAAAAAjIva0ddGllci0xOmZldHg0VGVNUUVTWjhfeVpOWlVKT0EAAAAABnInEx10aWVyLTE6MHdfVlJXRjNSZTZvZm13QjlCWVNadwAAAAAGKtdpHXRpZXItMToyMUZVTTN1clJycTh4dWdxNXIxYUhnAAAAAAGQpC8ddGllci0xOmZVMkV3WVF3U3YyRmRnRFAtcWFsb1EAAAAACI4HyB10aWVyLTE6eFBoSXNib0ZScGVPMkk0TjlJbS1HZwAAAAAGizMtHXRpZXItMTp1SzdVUXozSlNIV2tLRzdXOXZOZGdRAAAAAAYqmIkddGllci0xOk83MkNNTkNVVEtTOWg4N25UOFplN3cAAAAAAhEXeR10aWVyLTE6OEhhc1pRTmFRRU9QZ1BYR1c5elVrUQAAAAAH-1M6HXRpZXItMTp1cEFtdUNEUFJRYURRNE9kbUhHT2N3AAAAAAhzX4sddGllci0xOmxGUzYxSng4U3lhTmR0V2Y4cVFrVUEAAAAAAdnZZR10aWVyLTE6cjBIOHNobU1RYjZOWXBDX210bTFiUQAAAAABkKQwHXRpZXItMTpmVTJFd1lRd1N2MkZkZ0RQLXFhbG9RAAAAAAD1SAEddGllci0xOkNFcXNmUi1YVFl5bjV3R0lJT3p0LXcAAAAAAU_0Sh10aWVyLTE6S1JZejNfenBRby1uYU9jSGdkVF9sQQAAAAAH_GSHHXRpZXItMTpheXZacFZWM1M1LXd2UVJgqVDNsU2FRAAAAAAeA5QsddGllci0xOjhiQnR0SXJHUnhPdVRNTUt0MGd5S1EAAAAAByQlcx10aWVyLTE6OVFESDFDZDFSR1NxdkRaTTY0V3BfQQAAAAAB2dlkHXRpZXItMTpyMEg4c2htTVFiNk5ZcENfbXRtMWJRAAAAAAYpVhMddGllci0xOlJCSE55UjJaVHhHclc5X0JfMUZrdUEAAAAABnInFB10aWVyLTE6MHdfVlJXRjNSZTZvZm13QjlCWVNadwAAAAABne9-HXRpZXItMTp1VnMxTHFXOVR0LW5YYnVXendYME1BAAAAAAgjU6sddGllci0xOkZqT29Rd0FJUUdhQUhkalVlMXJjMEEAAAAAAPVIAx10aWVyLTE6Q0Vxc2ZSLVhUWXluNXdHSUlPenQtdwAAAAAIhXNRHXRpZXItMTpHb3ZFNTRoYVQ3bUgwVURRZG1HNS1RAAAAAAD1SAIddGllci0xOkNFcXNmUi1YVFl5bjV3R0lJT3p0LXcAAAAAAOuGJx10aWVyLTE6U1J0d1J3SlZUYldJb3NlcGNtZWZNZwAAAAALcpePHXRpZXItMTpTTU1UeDZzR1FGdXFPR0c0QUp3d2dnAAAAAAE7ZYEddGllci0xOkNDTUVOV3JoUXNTcjZMTnFFU2owT3c=",
    "took": 9,
    "timed_out": false,
    "_shards": {
        "total": 32,
        "successful": 32,
        "skipped": 0,
        "failed": 0
    },
    "_clusters": {
        "total": 3,
        "successful": 3,
        "skipped": 0
    },
    "hits": {
        "total": 7932,
        "max_score": 0,
        "hits": [
            {
                "_index": "tier-1:5070_newlogs_20191206-01",
                "_type": "logs",
                "_id": "igU82W4BM8yD8vy7BLOy",
                "_score": 0,
                "_source": {
                    "status_message": "",
                    "triggerName": "",
                    "event_source": "gclid",
                    "message": null,
                    "gclid": "EAIaIQyobChMIgoaoj4mg5gIVy-R3Ch3QJQqSEAAYASAAEgI7dvD_BwE",
                    "coralogix": {
                        "branchId": "3158ee8e-fee1-48ca-f20e-889487f0b041",
                        "metadata": {
                            "companyId": 5070,
                            "sdkId": null,
                            "category": "app",
                            "className": "",
                            "methodName": "",
                            "severity": 3,
                            "threadId": 2,
                            "applicationName": "deals",
                            "ipAddress": "",
                            "computerName": "host",
                            "processName": null,
                            "subsystemName": "web"
                        },
                        "json_keys": "asctime cid event event_source gclid message msclkid namespace status status_message triggerName user_id",
                        "logId": "3823b47d-f0e2-4c19-9540-863b2d9b78fa",
                        "jsonUuid": "70ccb889-99ce-219f-42d5-42aca2f17803",
                        "templateId": "b18521b1-9bd7-cdf4-c9bf-10775ce600e6",
                        "timestamp": "2019-12-06T03:23:34.513"
                    },
                    "user_id": "1e87af83-7eb8-476f-955a-0a14e2cc63d6",
                    "namespace": "paid_clicks_tracking",
                    "event": "click_session_init",
                    "asctime": "2019-12-06 03:23:34,514",
                    "msclkid": "",
                    "cid": "",
                    "status": ""
                }
            },
            {
                "_index": "tier-1:5070_newlogs_20191206-01",
                "_type": "logs",
                "_id": "grY82W4BuwXMYnFXBR2S",
                "_score": 0,
                "_source": {
                    "status_message": "",
                    "triggerName": "",
                    "event_source": "gclid",
                    "message": null,
                    "gclid": "CjwKCAiAg8qLvBRAbEiwAE_ZzPTXGM47M_idspnpkV6KRkesruH_zToig9rT5tYbCDqEHMCcSRzlp1BoCVvEQAvD_BwE",
                    "coralogix": {
                        "branchId": "3158ee8e-fee1-48ca-f20e-889487f0b041",
                        "metadata": {
                            "companyId": 5070,
                            "sdkId": null,
                            "category": "app",
                            "className": "",
                            "methodName": "",
                            "severity": 3,
                            "threadId": 2,
                            "applicationName": "deals",
                            "ipAddress": "",
                            "computerName": "host",
                            "processName": null,
                            "subsystemName": "web"
                        },
                        "json_keys": "asctime cid event event_source gclid message msclkid namespace status status_message triggerName user_id",
                        "logId": "514bc4fd-97b0-4443-a304-92683a8222f8",
                        "jsonUuid": "70ccb889-99ce-219f-42d5-42aca2f17803",
                        "templateId": "b18521b1-9bd7-cdf4-c9bf-10775ce600e6",
                        "timestamp": "2019-12-06T03:23:34.974"
                    },
                    "user_id": "73974d1b-8484-4cf0-bed2-140a996666dd",
                    "namespace": "paid_clicks_tracking",
                    "event": "click_session_init",
                    "asctime": "2019-12-06 03:23:34,974",
                    "msclkid": "",
                    "cid": "1316194207.1575602567",
                    "status": ""
                }
            }
        ]
    }
}

I am trying to extract everything under hits > hits, from _index through status.

dfs = []
item_count = 0
print(f'Actions needed - {times}')
while item_count <= times:
    response = requests.post(url_2, data=json.dumps(data_two), headers=headers)
    response_json = response.json()
    result = pd.io.json.json_normalize(response_json['hits']['hits'])
    item_count += 1
    print(f'Actions completed - {item_count}')
    dfs.append(result)

df = pd.concat(dfs, ignore_index=True)
print(df)

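times is the number of times I have to call the API. On each call I append the result to dfs, and afterwards I combine everything into a DataFrame.

For some reason, when I run it, it returns an empty DataFrame. Even when I print dfs inside the while loop it is still empty. I don't know why, because Postman returns the JSON shown in the example above.

Thanks for any suggestions.

Tags: python, json, pandas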

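Solution


  • You need to do some debugging, or at least print each step of your code's logic. First, what do you actually get in the response? Print response_json right after response_json = response.json().
  • Next, what does result = pd.io.json.json_normalize(response_json['hits']['hits']) produce? Print result.
  • Then, is it actually being appended? Check the size and contents of dfs after dfs.append(result) and print them.

And so on.

Take a look at my screenshots: I took your JSON response (saved to a file), ran these same lines, and got the output shown at the bottom of the second image (I'm not sure exactly what you need).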
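The first check above (what the response actually contains) can be sketched with the standard library alone. This is only an illustration: summarize_response and the trimmed sample payload are hypothetical names and data made up for the example, shaped like the response in the question.

```python
import json


def summarize_response(response_json):
    """Summarize each level of a search response so it can be checked by eye."""
    hits_section = response_json.get("hits", {})
    hit_list = hits_section.get("hits", [])
    return {
        "top_level_keys": sorted(response_json),
        "reported_total": hits_section.get("total"),
        "hits_in_this_page": len(hit_list),
        "first_hit_keys": sorted(hit_list[0]) if hit_list else [],
    }


# Hypothetical, heavily trimmed payload with the same shape as the question's response.
sample = {
    "took": 9,
    "timed_out": False,
    "hits": {
        "total": 7932,
        "max_score": 0,
        "hits": [
            {
                "_index": "tier-1:5070_newlogs_20191206-01",
                "_id": "igU82W4BM8yD8vy7BLOy",
                "_source": {"event": "click_session_init", "status": ""},
            }
        ],
    },
}

summary = summarize_response(sample)
print(json.dumps(summary, indent=2))

# A response whose inner hits list is empty would explain an empty DataFrame.
empty_summary = summarize_response({"hits": {"total": 7932, "hits": []}})
print(empty_summary["hits_in_this_page"])
```

If hits_in_this_page is 0 while reported_total is not, the request sent from Python may differ from the one sent from Postman (body, headers, or scroll id), which would be the next thing to compare.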
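A minimal sketch of that kind of reproduction, assuming pandas 1.0 or later, where the function is available as pd.json_normalize (the question's pd.io.json.json_normalize is the older location of the same function). The trimmed payload and the temporary file name response.json are hypothetical stand-ins for the saved response.

```python
import json
import os
import tempfile

import pandas as pd

# Hypothetical trimmed payload shaped like the response in the question.
payload = {
    "hits": {
        "total": 2,
        "hits": [
            {
                "_index": "tier-1:5070_newlogs_20191206-01",
                "_type": "logs",
                "_id": "igU82W4BM8yD8vy7BLOy",
                "_score": 0,
                "_source": {
                    "event": "click_session_init",
                    "cid": "",
                    "coralogix": {"metadata": {"companyId": 5070}},
                },
            },
            {
                "_index": "tier-1:5070_newlogs_20191206-01",
                "_type": "logs",
                "_id": "grY82W4BuwXMYnFXBR2S",
                "_score": 0,
                "_source": {
                    "event": "click_session_init",
                    "cid": "1316194207.1575602567",
                    "coralogix": {"metadata": {"companyId": 5070}},
                },
            },
        ],
    }
}

# Save the response to a file and load it back, as when testing offline.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "response.json")
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(payload, fh)
    with open(path, encoding="utf-8") as fh:
        response_json = json.load(fh)

dfs = []

# Flatten each hit; nested keys become dotted column names.
result = pd.json_normalize(response_json["hits"]["hits"])
print(result.shape)
print(list(result.columns))

# Confirm the append really happened before concatenating.
dfs.append(result)
print(len(dfs))

df = pd.concat(dfs, ignore_index=True)
print(df)
```

If this works on the saved file but the live loop still yields an empty DataFrame, the normalization step is probably fine and the difference lies in what the API returns to the Python request.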


