python - 如何使用具有不同字典和字典列表的数据来分解 Panda 列
问题描述
我有一个具有不同值集的熊猫数据框,例如第一个是列表或数组以及其他元素或不是
>>> df_3['integration-outbound:IntegrationEntity.integrationEntityDetails.supplier.forms.form.records.record']
0 [{'Internalid': '24348', 'isDelete': 'false', 'fields': {'field': [{'id': 'CATEGOR_LEVEL_1', 'value': 'MR'}, {'id': 'LOW_PRODSERV', 'value': 'RES'}, {'id': 'LOW_LEVEL_2', 'value': 'keylevel221'}, {'id': 'LOW_LEVEL_3', 'value': 'keylevel3127'}, {'id': 'LOW_LEVEL_4', 'value': 'keylevel4434'}, {'id': 'LOW_LEVEL_5', 'value': 'keylevel5545'}]}}, {'Internalid': '24349', 'isDelete': 'false', 'fields': {'field': [{'id': 'CATEGOR_LEVEL_1', 'value': 'MR'}, {'id': 'LOW_PRODSERV', 'value': 'RES'}, {'id': 'LOW_LEVEL_2', 'value': 'keylevel221'}, {'id': 'LOW_LEVEL_3', 'value': 'keylevel3125'}, {'id': 'LOW_LEVEL_4', 'value': 'keylevel4268'}, {'id': 'LOW_LEVEL_5', 'value': 'keylevel5418'}]}}, {'Internalid': '24350', 'isDelete': 'false', 'fields': {'field': [{'id': 'CATEGOR_LEVEL_1', 'value': 'MR'}, {'id': 'LOW_PRODSERV', 'value': 'RES'}, {'id': 'LOW_LEVEL_2', 'value': 'keylevel221'}, {'id': 'LOW_LEVEL_3', 'value': 'keylevel3122'}, {'id': 'LOW_LEVEL_4', 'value': 'keylevel425'}, {'id': 'LOW_LEVEL_5', 'value': 'keylevel5221'}]}}]
0 {'isDelete': 'false', 'fields': {'field': [{'id': 'S_EAST', 'value': 'N'}, {'id': 'W_EST', 'value': 'N'}, {'id': 'M_WEST', 'value': 'N'}, {'id': 'N_EAST', 'value': 'N'}, {'id': 'LOW_AREYOU_ASSET', 'value': '-1'}, {'id': 'LOW_SWART_PROG', 'value': '-1'}]}}
0 {'isDelete': 'false', 'fields': {'field': {'id': 'LOW_COD_CONDUCT', 'value': '-1'}}}
0 {'isDelete': 'false', 'fields': {'field': [{'id': 'LOW_SUPPLIER_TYPE', 'value': '2'}, {'id': 'LOW_DO_INT_BOTH', 'value': '1'}]}}
我想把它分解成多行。第一行是列表和其他行吗?
>>> type(df_3)
<class 'pandas.core.frame.DataFrame'>
>>> type(df_3['integration-outbound:IntegrationEntity.integrationEntityDetails.supplier.forms.form.records.record'])
<class 'pandas.core.series.Series'>
预期产出 -
{'Internalid': '24348', 'isDelete': 'false', 'fields': {'field': [{'id': 'CATEGOR_LEVEL_1', 'value': 'MR'}, {'id': 'LOW_PRODSERV', 'value': 'RES'}, {'id': 'LOW_LEVEL_2', 'value': 'keylevel221'}, {'id': 'LOW_LEVEL_3', 'value': 'keylevel3127'}, {'id': 'LOW_LEVEL_4', 'value': 'keylevel4434'}, {'id': 'LOW_LEVEL_5', 'value': 'keylevel5545'}]}}
{'Internalid': '24349', 'isDelete': 'false', 'fields': {'field': [{'id': 'CATEGOR_LEVEL_1', 'value': 'MR'}, {'id': 'LOW_PRODSERV', 'value': 'RES'}, {'id': 'LOW_LEVEL_2', 'value': 'keylevel221'}, {'id': 'LOW_LEVEL_3', 'value': 'keylevel3125'}, {'id': 'LOW_LEVEL_4', 'value': 'keylevel4268'}, {'id': 'LOW_LEVEL_5', 'value': 'keylevel5418'}]}}
{'Internalid': '24350', 'isDelete': 'false', 'fields': {'field': [{'id': 'CATEGOR_LEVEL_1', 'value': 'MR'}, {'id': 'LOW_PRODSERV', 'value': 'RES'}, {'id': 'LOW_LEVEL_2', 'value': 'keylevel221'}, {'id': 'LOW_LEVEL_3', 'value': 'keylevel3122'}, {'id': 'LOW_LEVEL_4', 'value': 'keylevel425'}, {'id': 'LOW_LEVEL_5', 'value': 'keylevel5221'}]}}]
{'isDelete': 'false', 'fields': {'field': [{'id': 'S_EAST', 'value': 'N'}, {'id': 'W_EST', 'value': 'N'}, {'id': 'M_WEST', 'value': 'N'}, {'id': 'N_EAST', 'value': 'N'}, {'id': 'LOW_AREYOU_ASSET', 'value': '-1'}, {'id': 'LOW_SWART_PROG', 'value': '-1'}]}}
{'isDelete': 'false', 'fields': {'field': {'id': 'LOW_COD_CONDUCT', 'value': '-1'}}}
{'isDelete': 'false', 'fields': {'field': [{'id': 'LOW_SUPPLIER_TYPE', 'value': '2'}, {'id': 'LOW_DO_INT_BOTH', 'value': '1'}]}}
我试图炸毁这些列
>>> df_3.explode('integration-outbound:IntegrationEntity.integrationEntityDetails.supplier.forms.form.records.record')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib64/python3.6/site-packages/pandas/core/frame.py", line 6318, in explode
result = df[column].explode()
File "/usr/local/lib64/python3.6/site-packages/pandas/core/series.py", line 3504, in explode
values, counts = reshape.explode(np.asarray(self.array))
File "pandas/_libs/reshape.pyx", line 129, in pandas._libs.reshape.explode
KeyError: 0
我可以遍历每一行并尝试找出它是否是一个列表并实现某些东西,但它似乎不正确
if str(type(df_3.loc[i,'{}'.format(c)])) == "<class 'list'>":
有什么方法可以对此类数据使用爆炸功能
解决方案
使用 pandas-read-xml 的替代方法
from pandas_read_xml import flatten, fully_flatten
df = flatten(df)
推荐阅读
- c - SO_RCVTIMEO 超时值在 recv_from API 中不生效
- reactjs - 400 Bad Request 错误与烧瓶服务器反应客户端的 socketio
- php - 签名表现形式 - 使用签名图像添加日期和时间
- c# - DoWork(BackgroundWorker)中如何将结果发送到主线程
- json - 有什么方法可以将变形的 json 加载到 python 对象中?
- git - 管道时 Git diff stat 更改
- magento2 - Magneto 2 中的 sequence_product 表中的列 sequence_value 指代
- opencart - 关于与产品一起进入购物车的产品价格
- reactjs - Safari 浏览器给出错误:未处理的承诺拒绝:无效日期
- asp.net-core - 具有本地化字符串的 ASP.NET Core API