Sum of word scores for a column containing lists of words

Problem description

I have a column of words:

> print(df['words'])
0       [awww, thats, bummer, shoulda, got, david, car...   
1       [upset, that, he, cant, update, his, facebook,...   
2       [dived, many, time, ball, managed, save, rest,...   
3       [whole, body, feel, itchy, like, it, on, fire]   
4       [no, it, not, behaving, at, all, im, mad, why,...   
5       [not, whole, crew]

and a separate sentiment table that holds a "sentiment" value for each word:

> print(sentiment) 
           abandon  -2
0        abandoned  -2
1         abandons  -2
2         abducted  -2
3        abduction  -2
4       abductions  -2
5            abhor  -3
6         abhorred  -3
7        abhorrent  -3
8           abhors  -3
9        abilities   2
...

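For each row of words in df['words'], I want to sum the words' respective sentiment values. Words that do not appear in sentiment should count as 0.

This is what I have so far: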

df['sentiment_value'] = Sum(df['words'].apply(lambda x: ''.join(x+x for x in sentiment))

Expected result

print(df['sentiment_value'])
0        -5   
1         2   
2        15  
3        -6   
4        -8   
...

Tags: python, string, pandas, dataframe

Solution


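If the sentiment data holds each word and its value together in a single string, you first need to separate them by splitting that column into two columns: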

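# expand=True returns the pieces as separate columns, so they can be assigned to two new columns
df[['Sentiment', 'Sentiment_value']] = df.sentiment.str.split(" ", n=1, expand=True)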
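
As a minimal sketch of this step, assume the raw data is a single string column named `sentiment`, with the word and its score separated by one space. The frame name `raw` is hypothetical, and the three rows are copied from the question's printout. The split pieces come back as strings, so the score most likely needs a numeric conversion before any arithmetic:

```python
import pandas as pd

# Toy data (assumed layout): "word score" stored together in one string column.
raw = pd.DataFrame({"sentiment": ["abandon -2", "abhor -3", "abilities 2"]})

# expand=True yields one column per piece; n=1 caps the split at two pieces.
raw[["Sentiment", "Sentiment_value"]] = raw["sentiment"].str.split(" ", n=1, expand=True)

# The score piece is still text here, so convert it to a number.
raw["Sentiment_value"] = pd.to_numeric(raw["Sentiment_value"])

print(raw[["Sentiment", "Sentiment_value"]])
```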

Then you can look up each word in the Sentiment column and take the matching value from the Sentiment_value column.
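
One possible way to sketch that lookup is to build a word-to-score dictionary and sum over each list, defaulting to 0 for words that have no entry. This assumes the sentiment table already has a word column and a numeric score column, named `Sentiment` and `Sentiment_value` as in the answer. The sample word lists and the `row_score` helper are made up for illustration; the scores themselves are taken from the question's printout.

```python
import pandas as pd

# Illustrative rows only; not the asker's real data.
df = pd.DataFrame({
    "words": [
        ["abandon", "whole", "abhor"],
        ["not", "whole", "crew"],
        ["abilities", "abducted", "abilities"],
    ]
})

# Assumed shape of the sentiment table after splitting into two columns.
sentiment = pd.DataFrame({
    "Sentiment": ["abandon", "abducted", "abhor", "abilities"],
    "Sentiment_value": [-2, -2, -3, 2],
})

# Dictionary lookup: word -> score.
scores = dict(zip(sentiment["Sentiment"], sentiment["Sentiment_value"]))

def row_score(words):
    """Sum the scores of the words in one row; unknown words count as 0."""
    return sum(scores.get(word, 0) for word in words)

df["sentiment_value"] = df["words"].apply(row_score)
print(df["sentiment_value"])
```

A plain dictionary keeps the "missing word equals 0" rule explicit through `scores.get(word, 0)`, which is why it is used here rather than a merge.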

