How to create a dictionary from a Pandas DataFrame with repeated names in a Series

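Problem description

I have a Pandas DataFrame containing the names of music albums along with various other information. The same artist can have several records. I want to build a dictionary from it where key = artist name and value = list of that artist's albums:

A sample of the pandas DataFrame looks like this: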


          artist                                      album
0           A-ha  Headlines And Deadlines: The Hits Of A-Ha
1           Abba                       Greatest Hits Vol. 2
2            abc                        The Lexicon Of Love
3          AC/DC                              Back In Black
4          AC/DC                            Highway to Hell
5  All About Eve                              All About Eve
6   Jon Anderson                         Olias of Sunhillow
7   Jon Anderson                              Song of Seven

The output I want is:

output = {
'A-ha': ['Headlines And Deadlines: The Hits Of A-Ha'], 
'Abba': ['Greatest Hits Vol. 2'], 
'abc': ['The Lexicon Of Love'], 
'AC/DC': [['Back In Black'],['Highway to Hell']],
'All About Eve': ['All About Eve'], 
'Jon Anderson': [['Olias of Sunhillow'],['Song of Seven']]
}

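I tried looping over the DataFrame and the df.to_dict option, but I could not produce the desired output.

I get this warning from pandas: UserWarning: DataFrame columns are not unique, some columns will be omitted.

Thanks

Tags: python, pandas, dictionary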

Solution


You can groupby on artist, apply list to the album column, and convert the result with to_dict:

df.groupby('artist')['album'].apply(list).to_dict()

Output:

{'A-ha': ['Headlines And Deadlines: The Hits Of A-Ha'],
 'AC/DC': ['Back In Black', 'Highway to Hell'],
 'Abba': ['Greatest Hits Vol. 2'],
 'All About Eve': ['All About Eve'],
 'Jon Anderson': ['Olias of Sunhillow', 'Song of Seven'],
 'abc': ['The Lexicon Of Love']}
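
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the sample table in the question, and the variable names (albums_by_artist, albums_in_order) are just illustrative. It also shows groupby's sort=False option, which is not part of the answer above but is handy if you would rather keep artists in their original row order than in sorted order.

```python
import pandas as pd

# Sample data rebuilt by hand from the table in the question.
df = pd.DataFrame({
    "artist": ["A-ha", "Abba", "abc", "AC/DC", "AC/DC",
               "All About Eve", "Jon Anderson", "Jon Anderson"],
    "album": ["Headlines And Deadlines: The Hits Of A-Ha",
              "Greatest Hits Vol. 2",
              "The Lexicon Of Love",
              "Back In Black",
              "Highway to Hell",
              "All About Eve",
              "Olias of Sunhillow",
              "Song of Seven"],
})

# Group rows by artist, collect each group's albums into a list,
# then turn the resulting Series (index = artist) into a dict.
albums_by_artist = df.groupby("artist")["album"].apply(list).to_dict()

# By default groupby sorts the keys; sort=False keeps artists in
# order of first appearance in the DataFrame instead.
albums_in_order = df.groupby("artist", sort=False)["album"].apply(list).to_dict()

print(albums_by_artist["AC/DC"])
print(list(albums_in_order))
```

The default sorting is why 'AC/DC' appears before 'Abba' in the output above: the keys are sorted as plain strings, and uppercase letters sort before lowercase ones.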

P.S. Do you really want a list of lists, such as [['Back In Black'],['Highway to Hell']], or a list of strings like the one in my output above: ['Back In Black','Highway to Hell']?
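
If the nested form really is what you need, one possible way to get it is sketched below. It uses a dict comprehension over the groups rather than apply, which is a swapped-in technique and not the method from the answer above. Note one assumption: it wraps every album in its own list uniformly, so single-album artists also get a nested list, unlike the mixed shape shown in the question. The small DataFrame and the name nested are illustrative only.

```python
import pandas as pd

# Small illustrative subset of the question's data.
df = pd.DataFrame({
    "artist": ["Abba", "AC/DC", "AC/DC"],
    "album": ["Greatest Hits Vol. 2", "Back In Black", "Highway to Hell"],
})

# Iterating over a grouped column yields (group key, Series of values).
# Each album is wrapped in its own one-element list.
nested = {
    artist: [[album] for album in albums]
    for artist, albums in df.groupby("artist")["album"]
}

print(nested)
```

In most cases the flat list of strings is easier to work with, so it is worth checking whether the extra nesting is actually required downstream.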

