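python - Removing redundant key-value pairs from a JSON array in Python
Problem description
I have a JSON file containing an array of objects. The data in the file looks like this: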
[
    {"name": "A",
     "address": "some address related to A",
     "details": "some details related to A"},
    {"name": "B",
     "address": "some address related to A",
     "details": "some details related to B"},
    {"name": "C",
     "address": "some address related to A",
     "details": "some details related to C"}
]
I want to remove the redundant key-value pairs, so the output should look like this:
[
    {"name": "A",
     "address": "some address related to A",
     "details": "some details related to A"},
    {"name": "B",
     "details": "some details related to B"},
    {"name": "C",
     "details": "some details related to C"}
]
So I tried this code, which I found online:
import json
with open('./myfile.json') as fp:
    data = fp.read()
unique = []
for n in data:
    if all(unique_data["address"] != data for unique_data["address"] in unique):
        unique.append(n)
#print(unique)
with open("./cleanedRedundancy.json", 'w') as f:
    f.write(unique)
But it gives me this error:
TypeError: string indices must be integers
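Solution
I wrote a solution that works both with and without files. By default it does not use files; since your case does, change use_files = False to use_files = True in my script.
I am assuming you want to remove duplicates that have the same (key, value) pair.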
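As for the TypeError in the question: fp.read() returns the whole file as one string, so looping over it yields single characters rather than objects, and indexing a character with a key such as "address" fails. A small sketch of the difference, where the sample string is made up for illustration and stands in for the file contents:

```python
import json

# Made-up sample standing in for the contents of myfile.json.
text = '[{"name": "A", "address": "x"}, {"name": "B", "address": "x"}]'

# Iterating over the raw text yields single characters, not objects.
first_char = next(iter(text))
try:
    first_char["address"]
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__

# Parsing first gives a list of dicts that can be indexed by key.
entries = json.loads(text)
addresses = [entry["address"] for entry in entries]

print(first_char, error_name, addresses)
```

The script below parses the text with json.loads for the same reason.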
import json

use_files = False

# Only duplicates under the following keys are removed
only_keys = {'address', 'complex'}

if not use_files:
    fdata = """
    [
        {
            "name": "A",
            "address": "some address related to A",
            "details": "some details related to A"
        },
        {
            "name": "B",
            "address": "some address related to A",
            "details": "some details related to B",
            "complex": ["x", {"y": "z", "p": "q"}],
            "dont_remove": "test"
        },
        {
            "name": "C",
            "address": "some address related to A",
            "details": "some details related to C",
            "complex": ["x", {"p": "q", "y": "z"}],
            "dont_remove": "test"
        }
    ]
    """

if use_files:
    with open("./myfile.json", 'r', encoding='utf-8') as fp:
        data = fp.read()
else:
    data = fdata

# Parse the JSON text into a list of dicts
entries = json.loads(data)

# (key, serialized value) pairs seen so far
unique = set()

for e in entries:
    # Iterate over a copy of the items because keys may be deleted
    for k, v in list(e.items()):
        if k not in only_keys:
            continue
        # Serialize with sorted keys so nested values compare reliably
        v = json.dumps(v, sort_keys=True)
        if (k, v) in unique:
            del e[k]
        else:
            unique.add((k, v))

if use_files:
    with open("./cleanedRedundancy.json", "w", encoding='utf-8') as f:
        f.write(json.dumps(entries, indent=4, ensure_ascii=False))
else:
    print(json.dumps(entries, indent=4, ensure_ascii=False))
Output:
[
    {
        "name": "A",
        "address": "some address related to A",
        "details": "some details related to A"
    },
    {
        "name": "B",
        "details": "some details related to B",
        "complex": [
            "x",
            {
                "y": "z",
                "p": "q"
            }
        ],
        "dont_remove": "test"
    },
    {
        "name": "C",
        "details": "some details related to C",
        "dont_remove": "test"
    }
]
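If the same cleanup is needed in more than one place, the loop above could be wrapped in a function. This is only a sketch of that idea: the name drop_repeated_pairs and the sample data are made up for illustration and are not part of the original script.

```python
import json


def drop_repeated_pairs(entries, only_keys):
    """Remove (key, value) pairs already seen in an earlier entry.

    Only keys listed in only_keys are considered. Entries are modified
    in place and the same list is returned.
    """
    seen = set()
    for entry in entries:
        # Iterate over a copy of the keys because entries may shrink.
        for key in list(entry):
            if key not in only_keys:
                continue
            # Serialize with sorted keys so equal nested values compare
            # equal regardless of dict key order.
            marker = (key, json.dumps(entry[key], sort_keys=True))
            if marker in seen:
                del entry[key]
            else:
                seen.add(marker)
    return entries


# Hypothetical sample data, shaped like the data in the question.
sample = [
    {"name": "A", "address": "some address related to A"},
    {"name": "B", "address": "some address related to A"},
    {"name": "C", "address": "another address"},
]
cleaned = drop_repeated_pairs(sample, {"address"})
print(json.dumps(cleaned, indent=4))
```

The first entry that carries a given pair keeps it, and only later repeats are dropped, so the order of the input list decides which object retains the value.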