Python Pandas - Read JSON from a list

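Problem description

I am trying to scrape data from the chart on this website: https://www.spglobal.com/spdji/en/indices/equity/sp-bmv-ipc/#overview

I found the JSON file behind the chart and tried to import it into pandas with this code: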

import json
import urllib.request

import pandas as pd

url = "https://www.spglobal.com/spdji/en/util/redesign/index-data/get-performance-data-for-datawidget-redesign.dot?indexId=92330739&getchildindex=true&returntype=T-&currencycode=MXN&currencyChangeFlag=false&language_id=1"

# Download the response and parse it as JSON
with urllib.request.urlopen(url) as response:
    data = json.loads(response.read().decode())

df = pd.DataFrame(data, columns=['indexLevelsHolder'])
Data = df.iloc[3, 0]

Doing this gives me the "Data" object, which is a list containing the time-series data in JSON format:

[{'effectiveDate': 1309406400000, 'indexId': 92330714, 'effectiveDateInEst': 1309392000000, 'indexValue': 43405.82, 'monthToDateFlag': 'N', 'quarterToDateFlag': 'N', 'yearToDateFlag': 'N', 'oneYearFlag': 'N', 'threeYearFlag': 'N', 'fiveYearFlag': 'N', 'tenYearFlag': 'Y', 'allYearFlag': 'Y', 'fetchedDate': 1626573344000, 'formattedEffectiveDate': '30-Jun-2011'}, .........

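The problem is that I cannot find a way to read this JSON data and get the columns I need (effectiveDate and indexValue).

Is there any way to do this? Thanks.

Tags: python, json, pandas, web-scraping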

Solution

You can use pd.json_normalize to load the JSON into columns:

import json
import urllib.request

import pandas as pd

url = "https://www.spglobal.com/spdji/en/util/redesign/index-data/get-performance-data-for-datawidget-redesign.dot?indexId=92330739&getchildindex=true&returntype=T-&currencycode=MXN&currencyChangeFlag=false&language_id=1"

# Download the response and parse it as JSON
with urllib.request.urlopen(url) as response:
    data = json.loads(response.read().decode())

# The time series is the list of records under indexLevelsHolder -> indexLevels
df = pd.json_normalize(data["indexLevelsHolder"]["indexLevels"])
print(df)

Prints:

      effectiveDate   indexId  effectiveDateInEst    indexValue monthToDateFlag quarterToDateFlag yearToDateFlag oneYearFlag threeYearFlag fiveYearFlag tenYearFlag allYearFlag    fetchedDate formattedEffectiveDate
0     1309406400000  92330714       1309392000000  43405.820000               N                 N              N           N             N            N           Y           Y  1626574897000            30-Jun-2011
1     1309492800000  92330714       1309478400000  43693.930000               N                 N              N           N             N            N           Y           Y  1626574897000            01-Jul-2011
2     1309752000000  92330714       1309737600000  43758.130000               N                 N              N           N             N            N           Y           Y  1626574897000            04-Jul-2011
3     1309838400000  92330714       1309824000000  43513.290000               N                 N              N           N             N            N           Y           Y  1626574897000            05-Jul-2011

...and so on.
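
Since the question asks only for effectiveDate and indexValue, the frame can be narrowed to those two columns afterwards. The following is a minimal sketch, not part of the original answer: the `sample` dictionary is a hypothetical, trimmed stand-in for the real response so the example runs offline, and it assumes effectiveDate is a Unix timestamp in milliseconds, which is consistent with the formattedEffectiveDate values shown above.

```python
import pandas as pd

# Hypothetical, trimmed stand-in for the parsed response; with the live
# endpoint this would be the `data` dictionary loaded above.
sample = {
    "indexLevelsHolder": {
        "indexLevels": [
            {"effectiveDate": 1309406400000, "indexId": 92330714,
             "indexValue": 43405.82, "formattedEffectiveDate": "30-Jun-2011"},
            {"effectiveDate": 1309492800000, "indexId": 92330714,
             "indexValue": 43693.93, "formattedEffectiveDate": "01-Jul-2011"},
        ]
    }
}

df = pd.json_normalize(sample["indexLevelsHolder"]["indexLevels"])

# Keep only the two columns of interest (copy to avoid modifying a view).
result = df[["effectiveDate", "indexValue"]].copy()

# Assumption: effectiveDate is epoch milliseconds, so convert with unit="ms".
result["effectiveDate"] = pd.to_datetime(result["effectiveDate"], unit="ms")

print(result)
```

If a date-indexed series is more convenient, `result.set_index("effectiveDate")["indexValue"]` should give one.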
