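python - Loop through a Pandas DataFrame and append values to lists that are dictionary values, keyed by the matching condition value
Problem description
It was hard to come up with a short but descriptive title for this. I have a DataFrame in which each row is one line of dialogue spoken by a character, and the whole corpus is the entire show. I am creating a dictionary whose keys are the top characters, then looping over the DataFrame and appending each line of dialogue to the value for the matching key, which I want to be a list.
I have a column named 'Character' and a column named 'dialogue':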
Character  dialogue
PICARD     'You will agree Data that Starfleets order are...'
DATA       'Difficult? Simply solve the mystery of Farpoint Station.'
PICARD     'As simple as that.'
TROI       'Farpoint Station. Even the name sounds mysterious.'
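And so on. There are a lot of minor characters, so I only want the top 10 characters by dialogue count, which I have in a list called main_chars. I want a final dictionary in which each character is a key and the value is one big list of all of that character's lines. I don't know how to append to an empty list that is set as the value for each key. My code so far is: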
char_corpuses = {}
for label, row in df.iterrows():
    for char in main_chars:
        if row['Character'] == char:
            char_corpuses[char] = [row['dialogue']]
But the end result is just the last line each character says in the corpus:
{'PICARD': [' so five card stud nothing wild and the skys the limit'],
'DATA': [' would you care to deal sir'],
'TROI': [' you were always welcome'],
'WORF': [' agreed'],
'Q': [' youll find out in any case ill be watching and if youre very lucky ill drop by to say hello from time to time see you out there'],
'RIKER': [' of course have a seat'],
'WESLEY': [' i will bye mom'],
'CRUSHER': [' you know i was thinking about what the captain told us about the future about how we all changed and drifted apart why would he want to tell us whats to come'],
'LAFORGE': [' sure goes against everything weve heard about not polluting the time line doesnt it'],
'GUINAN': [' thank you doctor this looks like a great racquet but er i dont play tennis never have']}
How do I stop it from overwriting every previous line and keeping only the last line for each character?
Solution
Try something like this:
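char_corpuses = {}
for char in main_chars:
    # Filter on the 'Character' column (the question's DataFrame has no 'name' column)
    # and convert the matching dialogue to a plain Python list.
    char_corpuses[char] = df[df['Character'] == char]['dialogue'].tolist()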
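For what it's worth, the original loop only keeps the last line because `char_corpuses[char] = [row['dialogue']]` assigns a brand new one-element list on every match instead of appending to the existing one. Below is a minimal sketch of two ways around that, assuming the column names from the question ('Character' and 'dialogue'). The small `df` and the three-name `main_chars` are made-up stand-ins for the real data, and `grouped` is just an illustrative name.

```python
import pandas as pd

# Hypothetical sample data standing in for the full show transcript.
df = pd.DataFrame({
    "Character": ["PICARD", "DATA", "PICARD", "TROI", "EXTRA"],
    "dialogue": [
        "You will agree Data that Starfleets order are...",
        "Difficult? Simply solve the mystery of Farpoint Station.",
        "As simple as that.",
        "Farpoint Station. Even the name sounds mysterious.",
        "A line from a minor character.",
    ],
})
main_chars = ["PICARD", "DATA", "TROI"]  # assumed list of top characters

# Option 1: give every key an empty list up front, then append to it.
char_corpuses = {char: [] for char in main_chars}
for _, row in df.iterrows():
    if row["Character"] in char_corpuses:
        char_corpuses[row["Character"]].append(row["dialogue"])

# Option 2: no explicit row loop. Keep only the main characters,
# group by character, and collect each group's dialogue into a list.
major = df[df["Character"].isin(main_chars)]
grouped = major.groupby("Character")["dialogue"].apply(list).to_dict()

print(char_corpuses["PICARD"])
```

Both options should give the same lists per character. One difference to be aware of: a character in `main_chars` with no rows at all gets an empty list in option 1 but is simply missing from the dictionary in option 2.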