python - Fastest way to find and edit values in a list of dictionaries
Problem description
Imagine I have a list of dictionaries like this:
[{'artifactID': 8047.0,
  'memberID': 1,
  'resource_launch': {'key1':'value1','key2':'value2','key3':'value3'}},
 {'artifactID': 8301.0,
  'memberID': 1,
  'resource_launch': {'key1':'value1','key2':'value2','key3':'value3'}},
 {'artifactID': 8374.0,
  'memberID': 1,
  'resource_launch': {'key1':'value1','key2':'value2','key3':'value3'}},
 {'artifactID': 8641.0,
  'memberID': 1,
  'resource_launch': {'key1':'value1','key2':'value2','key3':'value3'}},
 {'artifactID': 8969.0,
  'memberID': 1,
  'resource_launch': {'key1':'value1','key2':'value2','key3':'value3'}}
]
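This list holds more than a million dictionaries. I want to look up a memberID and artifactID combination and, if the combination exists, add some values to the existing keys under the resource_launch dictionary.
For example: if I look up the combination artifactID=8969, memberID=1, I want that record in the list of dictionaries to be updated as follows: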
[{'artifactID': 8047.0,
  'memberID': 1,
  'resource_launch': {'key1':'value1','key2':'value2','key3':'value3'}},
 {'artifactID': 8301.0,
  'memberID': 1,
  'resource_launch': {'key1':'value1','key2':'value2','key3':'value3'}},
 {'artifactID': 8374.0,
  'memberID': 1,
  'resource_launch': {'key1':'value1','key2':'value2','key3':'value3'}},
 {'artifactID': 8641.0,
  'memberID': 1,
  'resource_launch': {'key1':'value1','key2':'value2','key3':'value3'}},
 {'artifactID': 8969.0,
  'memberID': 1,
  'resource_launch': {'key1':['value1','value4'],'key2':['value2','value5'],'key3':['value3','value6']}}
]
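If the combination does not exist, I want to add it to the list of dictionaries.
Given that this list will hold more than a million dictionaries, how can I do this quickly? Right now I just append every record without checking whether it already exists, and that takes a long time.
Any help would be greatly appreciated.
This is my starter code, which takes too long and does not do what I want: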
from tqdm import tqdm

# artifactsTimespent, resourceLaunchData, artifactsContentData and
# artifactRevisions are pandas DataFrames loaded elsewhere.
finalJSON = []
for memberID in tqdm(artifactsTimespent['memberID'].unique()):
    temp_artifactsTimespent = artifactsTimespent[artifactsTimespent.memberID == memberID]
    for artifactID in temp_artifactsTimespent['artifactID'].unique():
        # Filter into a new name: reassigning temp_artifactsTimespent here
        # would leave it empty after the first artifact of each member.
        artifact_timespent = temp_artifactsTimespent[temp_artifactsTimespent.artifactID == artifactID]
        temp_resourceLaunchData = resourceLaunchData[resourceLaunchData.member_id == memberID]
        temp_resourceLaunchData = temp_resourceLaunchData[temp_resourceLaunchData.artifact_id == artifactID]
        temp_artifactsContentData = artifactsContentData[artifactsContentData.member_id == memberID]
        temp_artifactsContentData = temp_artifactsContentData[temp_artifactsContentData.artifact_id == artifactID]
        temp_artifactRevisions = artifactRevisions[artifactRevisions.artifactid == artifactID]
        # Build one record per (memberID, artifactID) pair.
        memberArtifactData = {}
        memberArtifactData['memberID'] = memberID
        memberArtifactData['artifactID'] = artifactID
        memberArtifactData['resource_launch'] = temp_artifactsContentData['content_data']
        finalJSON.append(memberArtifactData)
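Much of the cost in the loop above probably comes from re-filtering whole DataFrames for every memberID/artifactID pair. A minimal sketch of a single-pass alternative follows, using only the standard library. It assumes the rows arrive as (member_id, artifact_id, content_data) tuples, for example from itertuples(index=False) on a three-column slice of artifactsContentData. The names build_records and rows and the 'launch-*' values are hypothetical, introduced only for illustration. Unlike the loop, this sketch only emits pairs that have at least one content row.

```python
from collections import defaultdict


def build_records(rows):
    """Group content_data by (member_id, artifact_id) in one pass.

    rows: iterable of (member_id, artifact_id, content_data) tuples.
    This input shape is an assumption, not something the question specifies.
    """
    grouped = defaultdict(list)
    for member_id, artifact_id, content_data in rows:
        grouped[(member_id, artifact_id)].append(content_data)

    # One record per distinct pair, in first-seen order.
    return [
        {"memberID": m, "artifactID": a, "resource_launch": contents}
        for (m, a), contents in grouped.items()
    ]


# Hypothetical sample rows standing in for the real DataFrame contents.
rows = [
    (1, 8047.0, "launch-a"),
    (1, 8047.0, "launch-b"),
    (1, 8301.0, "launch-c"),
    (2, 8047.0, "launch-d"),
]
final_json = build_records(rows)
```

Each row is touched once, so the work grows with the number of rows rather than with the number of pairs multiplied by the table size.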
Solution
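One possible approach, sketched here under assumptions rather than taken from an original answer: scanning a million-element list for every lookup is linear, so keep a side dictionary keyed by (memberID, artifactID) that points at the same record objects. Lookups then take roughly constant time, and an update through the index is visible in the list. The helper names build_index and upsert are hypothetical. The sketch also assumes that a key missing from resource_launch should simply be stored as a plain value.

```python
def build_index(records):
    """Map (memberID, artifactID) to the record dict itself (not a copy)."""
    return {(r["memberID"], r["artifactID"]): r for r in records}


def upsert(records, index, member_id, artifact_id, new_values):
    """Merge new_values into the matching record, or append a new record."""
    key = (member_id, artifact_id)
    record = index.get(key)
    if record is None:
        # Combination not present: append a new record and index it.
        record = {
            "artifactID": artifact_id,
            "memberID": member_id,
            "resource_launch": dict(new_values),
        }
        records.append(record)
        index[key] = record
        return record

    launch = record["resource_launch"]
    for k, v in new_values.items():
        if k not in launch:
            launch[k] = v                # assumption: new key stored as-is
        elif isinstance(launch[k], list):
            launch[k].append(v)          # already a list: extend it
        else:
            launch[k] = [launch[k], v]   # promote the single value to a list
    return record


records = [
    {"artifactID": 8641.0, "memberID": 1,
     "resource_launch": {"key1": "value1", "key2": "value2", "key3": "value3"}},
    {"artifactID": 8969.0, "memberID": 1,
     "resource_launch": {"key1": "value1", "key2": "value2", "key3": "value3"}},
]
index = build_index(records)

# Existing combination: values are merged into lists.
upsert(records, index, 1, 8969, {"key1": "value4", "key2": "value5", "key3": "value6"})
# Missing combination: a new record is appended.
upsert(records, index, 2, 9000, {"key1": "value7"})
```

Because 8969 and 8969.0 compare equal and hash identically in Python, the lookup succeeds whether the artifactID is passed as an int or a float. The index costs extra memory for one tuple key per record, which is the trade-off for avoiding a full scan on every lookup.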