json - Convert a nested CSV to nested JSON using Pandas
Problem description
I have a data frame like this
org.iden.account,org.iden.id,adress.city,adress.country,person.name.fullname,person.gender,person.birthYear,subs.id,subs.subs1.birthday,subs.subs1.org.address.country,subs.subs1.org.address.strret1,subs.org.buyer.email.address,subs.org.buyer.phone.number
account123,id123,riga,latvia,laura,female,1990,subs123,1990-12-14T00:00:00Z,latvia,street 1,email1@myorg.com|email2@sanoma.com,+371401234567
account123,id000,riga,latvia,laura,female,1990,subs456,1990-12-14T00:00:00Z,latvia,street 1,email1@myorg.com,+371401234567
account123,id456,riga,latvia,laura,female,1990,subs789,1990-12-14T00:00:00Z,latvia,street 1,email1@myorg.com,+371401234567
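I need to convert it to nested JSON, with the nesting derived from the dot-separated (.) column names. So for the first row, the expected result should be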
{
  "org": {
    "iden": {
      "account": "account123",
      "id": "id123"
    }
  },
  "address": {
    "city": "riga",
    "country": "latvia"
  },
  "person": {
    "name": {
      "fullname": "laura"
    },
    "gender": "female",
    "birthYear": 1990
  },
  "subs": {
    "id": "subs123",
    "subs1": {
      "birthday": "1990-12-14T00:00:00Z",
      "org": {
        "address": {
          "country": "latvia",
          "street1": "street 1"
        }
      }
    },
    "org": {
      "buyer": {
        "email": {
          "address": "email1@myorg.com|email2@sanoma.com"
        },
        "phone": {
          "number": "+371401234567"
        }
      }
    }
  }
}
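And then, of course, all records together as a list. I tried a plain pandas .to_json(), but it did not help: I get the following output, which lacks the nested structure I need.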
[{"org.iden.account":"account123","org.iden.id":"id123","adress.city":"riga","adress.country":"latvia","person.name.fullname":"laura","person.gender":"female","person.birthYear":1990,"subs.id":"subs123","subs.subs1.birthday":"1990-12-14T00:00:00Z","subs.subs1.org.address.country":"latvia","subs.subs1.org.address.strret1":"street 1","subs.org.buyer.email.address":"email1@myorg.com|email2@sanoma.com","subs.org.buyer.phone.number":371401234567},{"org.iden.account":"account123","org.iden.id":"id000","adress.city":"riga","adress.country":"latvia","person.name.fullname":"laura","person.gender":"female","person.birthYear":1990,"subs.id":"subs456","subs.subs1.birthday":"1990-12-14T00:00:00Z","subs.subs1.org.address.country":"latvia","subs.subs1.org.address.strret1":"street 1","subs.org.buyer.email.address":"email1@myorg.com","subs.org.buyer.phone.number":371407654321},{"org.iden.account":"account123","org.iden.id":"id456","adress.city":"riga","adress.country":"latvia","person.name.fullname":"laura","person.gender":"female","person.birthYear":1990,"subs.id":"subs789","subs.subs1.birthday":"1990-12-14T00:00:00Z","subs.subs1.org.address.country":"latvia","subs.subs1.org.address.strret1":"street 1","subs.org.buyer.email.address":"email1@myorg.com","subs.org.buyer.phone.number":371407654321}]
Any help with this would be greatly appreciated!
Solution
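def df_to_json(row):
    # Build one nested dict from a row whose index labels are dotted paths.
    tree = {}
    for item in row.index:
        t = tree
        # Walk down the path, creating intermediate dicts as needed;
        # prev always holds the parent of the current node.
        for part in item.split('.'):
            prev, t = t, t.setdefault(part, {})
        # The last step left an empty placeholder dict at the leaf;
        # overwrite it with the actual cell value.
        prev[part] = row[item]
    return tree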
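The `prev, t = t, t.setdefault(part, {})` line is compact, so here is a minimal sketch of the same walk written out for a single dotted key, without the `prev` bookkeeping. The helper name `insert_path` is hypothetical and is not part of the answer above; it only illustrates how `dict.setdefault` creates or reuses the intermediate dictionaries.

```python
def insert_path(tree, dotted_key, value):
    """Store value in tree at the nested position named by dotted_key."""
    # Everything before the last dot names a parent dict; the last part is the leaf key.
    *parents, leaf = dotted_key.split(".")
    node = tree
    for part in parents:
        # Reuse the child dict if it already exists, otherwise create it.
        node = node.setdefault(part, {})
    node[leaf] = value
    return tree


tree = {}
insert_path(tree, "org.iden.account", "account123")
insert_path(tree, "org.iden.id", "id123")
insert_path(tree, "person.birthYear", 1990)
print(tree)
# {'org': {'iden': {'account': 'account123', 'id': 'id123'}}, 'person': {'birthYear': 1990}}
```

Both keys under `org.iden` end up in the same inner dictionary because the second call finds the dictionaries created by the first one. Note that this sketch, like the function above, assumes no column is both a leaf and a prefix of another column (for example `a` and `a.b`).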
>>> df.apply(df_to_json, axis='columns').tolist()
[{'org': {'iden': {'account': 'account123', 'id': 'id123'}},
  'adress': {'city': 'riga', 'country': 'latvia'},
  'person': {'name': {'fullname': 'laura'},
   'gender': 'female',
   'birthYear': 1990},
  'subs': {'id': 'subs123',
   'subs1': {'birthday': '1990-12-14T00:00:00Z',
    'org': {'address': {'country': 'latvia', 'strret1': 'street 1'}}},
   'org': {'buyer': {'email': {'address': 'email1@myorg.com|email2@sanoma.com'},
     'phone': {'number': 371401234567}}}}},
 {'org': {'iden': {'account': 'account123', 'id': 'id000'}},
  'adress': {'city': 'riga', 'country': 'latvia'},
  'person': {'name': {'fullname': 'laura'},
   'gender': 'female',
   'birthYear': 1990},
  'subs': {'id': 'subs456',
   'subs1': {'birthday': '1990-12-14T00:00:00Z',
    'org': {'address': {'country': 'latvia', 'strret1': 'street 1'}}},
   'org': {'buyer': {'email': {'address': 'email1@myorg.com'},
     'phone': {'number': 371401234567}}}}},
 {'org': {'iden': {'account': 'account123', 'id': 'id456'}},
  'adress': {'city': 'riga', 'country': 'latvia'},
  'person': {'name': {'fullname': 'laura'},
   'gender': 'female',
   'birthYear': 1990},
  'subs': {'id': 'subs789',
   'subs1': {'birthday': '1990-12-14T00:00:00Z',
    'org': {'address': {'country': 'latvia', 'strret1': 'street 1'}}},
   'org': {'buyer': {'email': {'address': 'email1@myorg.com'},
     'phone': {'number': 371401234567}}}}}]
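The result above is a Python list of dictionaries, not yet JSON text. As a rough sketch of the last step, the standard-library `json` module can serialize such a list. The `records` literal below is a shortened stand-in for the real output, and the `to_builtin` fallback is a hypothetical helper, included only on the assumption that some values might arrive as NumPy scalars, which `json.dumps` does not handle by itself.

```python
import json

# Shortened stand-in for the list returned by df.apply(df_to_json, axis='columns').tolist().
records = [
    {"org": {"iden": {"account": "account123", "id": "id123"}},
     "person": {"name": {"fullname": "laura"}, "birthYear": 1990}},
    {"org": {"iden": {"account": "account123", "id": "id000"}},
     "person": {"name": {"fullname": "laura"}, "birthYear": 1990}},
]


def to_builtin(value):
    """Fallback for json.dumps: unwrap scalars exposing .item(), as NumPy scalars do."""
    if hasattr(value, "item"):
        return value.item()
    raise TypeError(f"Cannot serialize {type(value).__name__}")


# default= is only called for objects json cannot serialize natively.
json_text = json.dumps(records, indent=2, default=to_builtin)
print(json_text)
```

The phone number appears as the integer 371401234567 in the output above, so the leading "+" from the CSV has been lost, presumably because pandas inferred a numeric type when loading. If the data was loaded with `pd.read_csv`, passing `dtype=str` (or a per-column `dtype` mapping) is one way to keep such columns as text.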