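python - How do I read this JSON and convert it to a DataFrame?
Question
I want to convert this nested JSON to a DataFrame. I have tried several functions, but none of them worked properly.
The encoding that worked for me is encoding = "utf-8-sig".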
[{'replayableActionOperationState': 'SKIPPED',
'replayableActionOperationGuid': 'RAO_1037351',
'failedMessage': 'Cannot replay action: RAO_1037351: com.ebay.sd.catedor.core.model.DTOEntityPropertyChange; local class incompatible: stream classdesc serialVersionUID = 7777212484705611612, local class serialVersionUID = -1785129380151507142',
'userMessage': 'Skip all mode',
'username': 'gfannon',
'sourceAuditData': [{'guid': '24696601-b73e-43e4-bce9-28bc741ac117',
'operationName': 'UPDATE_CATEGORY_ATTRIBUTE_PROPERTY',
'creationTimestamp': 1563439725240,
'auditCanvasInfo': {'id': '165059', 'name': '165059'},
'auditUserInfo': {'id': 1, 'name': 'gfannon'},
'externalId': None,
'comment': None,
'transactionId': '0f135909-66a7-46b1-98f6-baf1608ffd6a',
'data': {'entity': {'guid': 'CA_2511202',
'tagType': 'BOTH',
'description': None,
'name': 'Number of Shelves'},
'propertyChanges': [{'propertyName': 'EntityProperty',
'oldEntity': {'guid': 'CAP_35',
'name': 'DisableAsVariant',
'group': None,
'action': 'SET',
'value': 'true',
'tagType': 'SELLER'},
'newEntity': {'guid': 'CAP_35',
'name': 'DisableAsVariant',
'group': None,
'action': 'SET',
'value': 'false',
'tagType': 'SELLER'}}],
'entityChanges': None,
'primary': True}}],
'targetAuditData': None,
'conflictedGuids': None,
'fatal': False}]
This is what I have tried so far. There were more attempts, but this one got me the closest.
import json
import pandas as pd
with open(r"Desktop\Ann's json parsing\report.tsv", encoding='utf-8-sig') as data_file:
    data = json.load(data_file)
df = pd.json_normalize(data)
print(df)
# The nested lists are shown as whole columns; I am trying to parse those columns
# ('failedMessage' and 'sourceAuditData'). I also tried json.loads/json(df), but the output isn't correct.
pd.DataFrame(df)
# This line retrieves one of the outputs I need, but I don't know how to apply it to the whole file.
pd.DataFrame.from_dict(a['sourceAuditData'][0]['data']['propertyChanges'][0])
The expected result is a csv/xlsx file in which each row holds one column name and one value.
Answer
For your specific example:
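def unroll_dict(d):
    # Collect (key, value) pairs, descending into nested containers.
    data = []
    for k, v in d.items():
        if isinstance(v, list):
            # Only the first list element is unrolled, and it is assumed
            # to be a dict; that is enough for this example.
            data.append((k, ''))
            data.extend(unroll_dict(v[0]))
        elif isinstance(v, dict):
            data.append((k, ''))
            data.extend(unroll_dict(v))
        else:
            # Scalar leaf: keep the value as it is.
            data.append((k, v))
    return data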
Given that the data from your question is stored in the variable example:
df = pd.DataFrame(unroll_dict(example[0])).set_index(0).transpose()
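The function above handles one record and only the first element of each list, and keys such as 'guid' and 'name' occur several times in the data, so the resulting column names repeat. A possible generalisation, offered as a minimal sketch and not as part of the original answer, is to build a dotted path for every leaf and to apply it to every record. The helper name flatten, the shortened sample data, and the output file name are all assumptions made for illustration.

```python
import pandas as pd


def flatten(obj, prefix=""):
    """Return {dotted.path: scalar} for arbitrarily nested dicts and lists."""
    flat = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            flat.update(flatten(value, f"{prefix}{key}."))
    elif isinstance(obj, list):
        # Keep every element, using its position as part of the path.
        for i, value in enumerate(obj):
            flat.update(flatten(value, f"{prefix}{i}."))
    else:
        # Scalar leaf (str, int, bool, None): drop the trailing dot.
        flat[prefix[:-1]] = obj
    return flat


# Shortened stand-in for the data in the question (assumption: same shape).
example = [{
    "replayableActionOperationGuid": "RAO_1037351",
    "username": "gfannon",
    "sourceAuditData": [{
        "operationName": "UPDATE_CATEGORY_ATTRIBUTE_PROPERTY",
        "data": {
            "entity": {"guid": "CA_2511202", "name": "Number of Shelves"},
            "propertyChanges": [{
                "propertyName": "EntityProperty",
                "oldEntity": {"guid": "CAP_35", "value": "true"},
                "newEntity": {"guid": "CAP_35", "value": "false"},
            }],
        },
    }],
    "targetAuditData": None,
    "fatal": False,
}]

# Wide form: one row per record, one column per leaf path.
wide_df = pd.DataFrame([flatten(record) for record in example])

# Long form: one row per (record, path, value) triple.
long_df = pd.DataFrame(
    [(i, path, value)
     for i, record in enumerate(example)
     for path, value in flatten(record).items()],
    columns=["record", "path", "value"],
)

# Hypothetical output file name; uncomment to write the result.
# long_df.to_csv("report_flat.csv", index=False)
```

The long form is probably closest to the layout requested in the question (one name and one value per row), while the wide form mirrors the transposed frame above but with unique column names. Empty lists and empty dicts produce no entries in this sketch.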