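python - How to normalize a JSON time series with pandas when column names collide
Question
I want to consume an API that returns time series data in the following format: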
...
"value":[
  {
    "Key":"bt386",
    "ReferenceDate":"2019-07-27T00:00:00Z",
    "TargetDate":"2019-07-28T00:00:00Z",
    "PublicationDate":null,
    "ChangedOn":"2019-07-27T09:36:03.9727098+01:00",
    "ValidUntil":"9999-12-31T23:59:59.9999999Z",
    "ValueColumnsNumber":[
      {"Key":"FreshSnowDepth","Value":0.000000000},
      {"Key":"Precipitation","Value":0.000000000},
      {"Key":"RainSnowMelt","Value":0.000000000},
      {"Key":"Runoff","Value":31.800000000},
      {"Key":"SnowDepth","Value":0.000000000},
      {"Key":"SnowDepthNormalPerc","Value":0.000000000},
      {"Key":"SnowMelt","Value":0.000000000},
      {"Key":"SnowWaterEquivalents","Value":0.000000000},
      {"Key":"Temperature","Value":18.450000000}],
    "ValueColumnsText":[],
    "ValueColumnsDateTime":[]
  },
  {
    "Key":"bt386",
    "ReferenceDate":"2019-07-27T00:00:00Z",
    "TargetDate":"2019-07-29T00:00:00Z",
    "PublicationDate":null,
    "ChangedOn":"2019-07-27T09:36:03.9727098+01:00",
    "ValidUntil":"9999-12-31T23:59:59.9999999Z",
    "ValueColumnsNumber":[
      {"Key":"FreshSnowDepth","Value":0.000000000},
      {"Key":"Precipitation","Value":0.000000000},
      {"Key":"RainSnowMelt","Value":0.000000000},
      {"Key":"Runoff","Value":28.400000000},
      {"Key":"SnowDepth","Value":0.000000000},
      {"Key":"SnowDepthNormalPerc","Value":0.000000000},
      {"Key":"SnowMelt","Value":0.000000000},
      {"Key":"SnowWaterEquivalents","Value":0.000000000},
      {"Key":"Temperature","Value":18.750000000}],
    "ValueColumnsText":[],
    "ValueColumnsDateTime":[]
  }
]
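I tried the following code:
import json
import pandas as pd
from pandas import json_normalize  # pandas >= 1.0; older versions: from pandas.io.json import json_normalize

# response is the HTTP response object returned by the API call
d = json.loads(response.content)
# One row per entry in ValueColumnsNumber, with the two dates copied onto each row
timeSeries = json_normalize(data=d['value'],
                            record_path='ValueColumnsNumber',
                            meta=['ReferenceDate', 'TargetDate'])
# Turn each inner Key into its own column
table = timeSeries.pivot_table('Value', ['ReferenceDate', 'TargetDate'], 'Key')
table.reset_index(drop=False, inplace=True)
pd.set_option('display.max_columns', None)
print(table.head(3))
Result: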
Key ReferenceDate TargetDate FreshSnowDepth
0 2017-03-22T00:00:00Z 2017-03-23T00:00:00Z 2.8
1 2017-03-22T00:00:00Z 2017-03-24T00:00:00Z 7.6
2 2017-03-22T00:00:00Z 2017-03-25T00:00:00Z 0.3
What I need is for the output to also include the alphanumeric key of each series:
Key CurveKey ReferenceDate TargetDate FreshSnowDepth
0 bt386 2017-03-22T00:00:00Z 2017-03-23T00:00:00Z 2.8
1 bt386 2017-03-22T00:00:00Z 2017-03-24T00:00:00Z 7.6
2 abcde 2017-03-22T00:00:00Z 2017-03-25T00:00:00Z 0.3
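When I add 'Key' to meta in the json_normalize() call:
timeSeries = json_normalize(data=d['value'],
                            record_path='ValueColumnsNumber',
                            meta=['Key', 'ReferenceDate', 'TargetDate'])
I get the following error:
"ValueError: Conflicting metadata name Key, need distinguishing prefix"
What do I need to do to get the JSON into the desired format?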
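Answer
Try this:
# pd.json_normalize is the top-level name since pandas 1.0;
# older versions expose it as pd.io.json.json_normalize
table = pd.json_normalize(d, ['value', 'ValueColumnsNumber'], meta=[
    ['value', 'Key'],
    ['value', 'ReferenceDate'],
    ['value', 'TargetDate'],
])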
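record_path should be the deepest level you want to iterate over. meta holds anything from the shallower levels that you want to pull in. Because each meta entry is given here as a path starting at 'value', its column is named after the full path (value.Key), so it no longer clashes with the inner Key column.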
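To illustrate how record_path and meta interact, here is a minimal sketch on a trimmed sample shaped like the payload in the question (the sample values are cut down for illustration, not a real API response). It also shows a possible alternative to nested meta paths: the meta_prefix parameter of pd.json_normalize, which seems to be the "distinguishing prefix" the error message asks for. The prefix "Curve" is an arbitrary choice made here.

```python
import pandas as pd

# Trimmed sample for illustration only; the real payload has more fields.
d = {
    "value": [
        {
            "Key": "bt386",
            "ReferenceDate": "2019-07-27T00:00:00Z",
            "TargetDate": "2019-07-28T00:00:00Z",
            "ValueColumnsNumber": [
                {"Key": "Runoff", "Value": 31.8},
                {"Key": "Temperature", "Value": 18.45},
            ],
        },
        {
            "Key": "bt386",
            "ReferenceDate": "2019-07-27T00:00:00Z",
            "TargetDate": "2019-07-29T00:00:00Z",
            "ValueColumnsNumber": [
                {"Key": "Runoff", "Value": 28.4},
                {"Key": "Temperature", "Value": 18.75},
            ],
        },
    ]
}

# record_path points at the innermost list (one output row per element).
# meta names the outer fields that are copied onto every inner row.
# meta_prefix is prepended to each meta column name, so the outer "Key"
# becomes "CurveKey" and no longer collides with the inner "Key".
long_df = pd.json_normalize(
    d["value"],
    record_path="ValueColumnsNumber",
    meta=["Key", "ReferenceDate", "TargetDate"],
    meta_prefix="Curve",  # arbitrary prefix chosen for this sketch
)

print(long_df.columns.tolist())
print(len(long_df))  # 2 outer records x 2 inner records each
```

Note that meta_prefix applies to every meta column, so the dates come out as CurveReferenceDate and CurveTargetDate and may need renaming afterwards.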
Result:
Key Value value.Key value.ReferenceDate value.TargetDate
0 FreshSnowDepth 0.0 bt386 2019-07-27T00:00:00Z 2019-07-28T00:00:00Z
1 Precipitation 0.0 bt386 2019-07-27T00:00:00Z 2019-07-28T00:00:00Z
2 RainSnowMelt 0.0 bt386 2019-07-27T00:00:00Z 2019-07-28T00:00:00Z
3 Runoff 31.8 bt386 2019-07-27T00:00:00Z 2019-07-28T00:00:00Z
4 SnowDepth 0.0 bt386 2019-07-27T00:00:00Z 2019-07-28T00:00:00Z
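The result above is still in long format (one row per measurement), whereas the question asks for one column per measurement plus a CurveKey column. One way to get there, sketched below under assumptions, is to rename the value.* columns and pivot. The frame long_df is a hand-built stand-in with the same column names as the result above, and the target names (CurveKey, ReferenceDate, TargetDate) are taken from the desired output in the question.

```python
import pandas as pd

# Hand-built stand-in for the normalized result (same column names),
# reduced to two measurements per target date for brevity.
long_df = pd.DataFrame({
    "Key": ["Runoff", "Temperature", "Runoff", "Temperature"],
    "Value": [31.8, 18.45, 28.4, 18.75],
    "value.Key": ["bt386"] * 4,
    "value.ReferenceDate": ["2019-07-27T00:00:00Z"] * 4,
    "value.TargetDate": ["2019-07-28T00:00:00Z"] * 2
                        + ["2019-07-29T00:00:00Z"] * 2,
})

wide = (
    long_df
    .rename(columns={
        "value.Key": "CurveKey",
        "value.ReferenceDate": "ReferenceDate",
        "value.TargetDate": "TargetDate",
    })
    # pivot_table aggregates with the mean by default; each combination
    # occurs once here, so the values pass through unchanged.
    .pivot_table(index=["CurveKey", "ReferenceDate", "TargetDate"],
                 columns="Key", values="Value")
    .reset_index()
)
wide.columns.name = None  # drop the leftover "Key" label on the column axis

print(wide)
```

If the same key and date combination could appear more than once in the real data, the default mean aggregation would silently average the duplicates, so it may be worth checking for duplicates before pivoting.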