Loop over a Pandas DataFrame and append values to a list that is a dictionary value, where the condition value is the key

Problem description

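It's hard to come up with a short but descriptive title for this. I have a dataframe where each row is one line spoken by a character, and the whole corpus is the entire show. I create a dictionary whose keys come from a list of the top characters, then loop over the DF and append each dialogue line to the value for that character's key, which I want to be a list.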

I have a column named 'Character' and a column named 'dialogue':

Character      dialogue
PICARD         'You will agree Data that Starfleets
               order are...'
DATA           'Difficult? Simply solve the mystery of 
               Farpoint Station.'
PICARD         'As simple as that.'
TROI           'Farpoint Station. Even the name sounds
                mysterious.'

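And so on. There are a lot of minor characters, so I only want the top 10 characters by dialogue count, which I have in a list called main_chars. I want a final dictionary in which each character is a key and the value is one big list of all of that character's lines. I can't figure out how to append to the empty list set as each key's value. My code so far is: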

char_corpuses = {} 
for label, row in df.iterrows():
    for char in main_chars:
        if row['Character'] == char:
            char_corpuses[char] = [row['dialogue']]

But the final result is only the last line each character says in the corpus:

{'PICARD': [' so five card stud nothing wild and the skys the limit'],
 'DATA': [' would you care to deal sir'],
 'TROI': [' you were always welcome'],
 'WORF': [' agreed'],
 'Q': [' youll find out in any case ill be watching and if youre very lucky ill drop by to say hello from time to time see you out there'],
 'RIKER': [' of course have a seat'],
 'WESLEY': [' i will bye mom'],
 'CRUSHER': [' you know i was thinking about what the captain told us about the future about how we all changed and drifted apart why would he want to tell us whats to come'],
 'LAFORGE': [' sure goes against everything weve heard about not polluting the time line doesnt it'],
 'GUINAN': [' thank you doctor this looks like a great racquet but er i dont play tennis never have']}

How do I stop it from discarding every earlier line and keeping only the last line for each character?

Tags: python, pandas, dictionary

Solution


Try something like this:

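char_corpuses = {}
for char in main_chars:
    # Select this character's rows by the 'Character' column, then
    # convert the 'dialogue' values to a plain Python list
    char_corpuses[char] = df.loc[df['Character'] == char, 'dialogue'].tolist()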
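For what it's worth, the loop in the question keeps only the last line because `char_corpuses[char] = [row['dialogue']]` assigns a brand-new one-element list on every match instead of appending to the existing one. Below is a minimal sketch of two ways around that, assuming the column names 'Character' and 'dialogue' from the question. The sample dataframe, the 'EXTRA' character, and the names `major` and `grouped` are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data modeled on the excerpt in the question.
df = pd.DataFrame({
    "Character": ["PICARD", "DATA", "PICARD", "TROI", "EXTRA"],
    "dialogue": [
        "You will agree Data that Starfleets order are...",
        "Difficult? Simply solve the mystery of Farpoint Station.",
        "As simple as that.",
        "Farpoint Station. Even the name sounds mysterious.",
        "A line from a minor character.",
    ],
})
main_chars = ["PICARD", "DATA", "TROI"]

# Option 1: keep the loop, but start each key with an empty list
# and append to it instead of assigning a new list each time.
char_corpuses = {char: [] for char in main_chars}
for character, line in zip(df["Character"], df["dialogue"]):
    if character in char_corpuses:
        char_corpuses[character].append(line)

# Option 2: no explicit row loop. Filter to the main characters,
# group by character, and collect each group's lines into a list.
major = df[df["Character"].isin(main_chars)]
grouped = major.groupby("Character")["dialogue"].apply(list).to_dict()
```

Both give the same dictionary on this sample. One difference to be aware of: a character in main_chars with no lines at all gets an empty list in the loop version but is simply missing from the groupby version.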
