python - Convert JSON to a Pandas DataFrame
Problem description
I have JSON like the following, and I want to convert it to a pandas DataFrame with specific column names.
{
"data": [
{
"id": 1,
"name": "3Way Result",
"suspended": false,
"bookmaker": {
"data": [
{
"id": 27802,
"name": "Ladbrokes",
"odds": {
"data": [
{
"label": "1",
"value": "1.61",
"probability": "62.11%",
"dp3": "1.610",
"american": -164,
"factional": null,
"winning": null,
"handicap": null,
"total": null,
"bookmaker_event_id": null,
"last_update": {
"date": "2021-10-01 16:41:27.000000",
"timezone_type": 3,
"timezone": "UTC"
}
},
{
"label": "X",
"value": "3.90",
"probability": "25.64%",
"dp3": "3.900",
"american": 290,
"factional": null,
"winning": null,
"handicap": null,
"total": null,
"bookmaker_event_id": null,
"last_update": {
"date": "2021-10-01 16:41:27.000000",
"timezone_type": 3,
"timezone": "UTC"
}
},
{
"label": "2",
"value": "5.20",
"probability": "19.23%",
"dp3": "5.200",
"american": 420,
"factional": null,
"winning": null,
"handicap": null,
"total": null,
"bookmaker_event_id": null,
"last_update": {
"date": "2021-10-01 16:41:27.000000",
"timezone_type": 3,
"timezone": "UTC"
}
}
]
}
},
{
"id": 70,
"name": "Pncl",
"odds": {
"data": [
{
"label": "1",
"value": "1.65",
"probability": "60.61%",
"dp3": "1.645",
"american": -154,
"factional": null,
"winning": null,
"handicap": null,
"total": null,
"bookmaker_event_id": null,
"last_update": {
"date": "2021-10-01 16:59:18.000000",
"timezone_type": 3,
"timezone": "UTC"
}
},
{
"label": "X",
"value": "4.20",
"probability": "23.81%",
"dp3": "4.200",
"american": 320,
"factional": null,
"winning": null,
"handicap": null,
"total": null,
"bookmaker_event_id": null,
"last_update": {
"date": "2021-10-01 16:59:18.000000",
"timezone_type": 3,
"timezone": "UTC"
}
},
{
"label": "2",
"value": "5.43",
"probability": "18.42%",
"dp3": "5.430",
"american": 443,
"factional": null,
"winning": null,
"handicap": null,
"total": null,
"bookmaker_event_id": null,
"last_update": {
"date": "2021-10-01 16:59:18.000000",
"timezone_type": 3,
"timezone": "UTC"
}
}
]
}
}
]
}
}
],
"meta": {
"plans": [
{
"name": "Football Free Plan",
"features": "Standard",
"request_limit": "180,60",
"sport": "Soccer"
}
],
"sports": [
{
"id": 1,
"name": "Soccer",
"current": true
}
]
}
}
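Every column name consists of the label value plus the bookmaker's name: I want to take the value of label, combine it with name, and use the result as the column name. The corresponding value should then go into the DataFrame row as a float.
Here is the expected output: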
1_LadBrokes X_LadBrokes 2_LadBrokes last_update_LadBrokes 1_Pncl X_Pncl 2_Pncl last_update_Pncl
0 1.61 3.9 5.2 2021-10-01 16:41:27.000000 1.65 4.2 5.43 2021-10-01 16:59:18.000000
Solution
You can achieve this with json_normalize + apply.
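import pandas as pd

# `data` is assumed to be the parsed JSON shown above (a dict, e.g. from json.loads)

def set_values(x):
    # After explode, "odds.data" holds a single odds entry (a dict) per row
    data = x["odds.data"]
    label = data.get("label")
    value = data.get("value")
    last_update_date = data.get("last_update").get("date")
    name = x["name"]
    # Add the "<label>_<bookmaker>" and "last_update_<bookmaker>" columns to this row
    x[f"{label}_{name}"] = value
    x[f"last_update_{name}"] = last_update_date
    return x

df = (
    pd.json_normalize(data["data"], record_path=["bookmaker", "data"])
    .explode("odds.data")  # one row per odds entry
    .apply(set_values, axis=1)
    .drop(["odds.data", "id", "name"], axis=1)
    .ffill()  # fill the gaps so that the first row has every column
    .bfill()
    .head(1)
)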
In [39]: df
Out[39]:
1_Ladbrokes 1_Pncl 2_Ladbrokes 2_Pncl X_Ladbrokes X_Pncl last_update_Ladbrokes last_update_Pncl
0 1.61 1.65 5.20 5.43 3.90 4.20 2021-10-01 16:41:27.000000 2021-10-01 16:59:18.000000
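Note that the values in the result above are still strings, because value is stored as a string in the JSON, while the question asks for floats. As an alternative, here is a minimal sketch that builds the row with plain loops instead of json_normalize + apply. It is not part of the original answer; the names payload and odds_row are hypothetical, and the sample is a trimmed copy of the JSON from the question.

```python
import pandas as pd

# Trimmed sample with the same shape as the JSON in the question
# (`payload` is a hypothetical name for the parsed JSON).
payload = {
    "data": [
        {
            "id": 1,
            "name": "3Way Result",
            "bookmaker": {
                "data": [
                    {
                        "id": 27802,
                        "name": "Ladbrokes",
                        "odds": {"data": [
                            {"label": "1", "value": "1.61",
                             "last_update": {"date": "2021-10-01 16:41:27.000000"}},
                            {"label": "X", "value": "3.90",
                             "last_update": {"date": "2021-10-01 16:41:27.000000"}},
                            {"label": "2", "value": "5.20",
                             "last_update": {"date": "2021-10-01 16:41:27.000000"}},
                        ]},
                    },
                    {
                        "id": 70,
                        "name": "Pncl",
                        "odds": {"data": [
                            {"label": "1", "value": "1.65",
                             "last_update": {"date": "2021-10-01 16:59:18.000000"}},
                            {"label": "X", "value": "4.20",
                             "last_update": {"date": "2021-10-01 16:59:18.000000"}},
                            {"label": "2", "value": "5.43",
                             "last_update": {"date": "2021-10-01 16:59:18.000000"}},
                        ]},
                    },
                ]
            },
        }
    ]
}


def odds_row(market):
    """Flatten one market into a dict keyed by '<label>_<bookmaker>'."""
    row = {}
    for bookmaker in market["bookmaker"]["data"]:
        name = bookmaker["name"]
        odds = bookmaker["odds"]["data"]
        for odd in odds:
            # Convert the string value to float, as the question asks
            row[f"{odd['label']}_{name}"] = float(odd["value"])
        if odds:
            # Assumption: one timestamp per bookmaker is enough, so the
            # timestamp of the last odds entry is used
            row[f"last_update_{name}"] = odds[-1]["last_update"]["date"]
    return row


# One row per market; the sample contains a single market
df = pd.DataFrame([odds_row(market) for market in payload["data"]])
print(df)
```

Because dictionaries preserve insertion order, the columns come out grouped by bookmaker in the order requested in the question, rather than sorted alphabetically.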