python - Iterating through a list of lists and counting matches against a different list
Problem description
I am new to Python and am currently working on sentiment analysis for my Master's thesis. I have run into a problem that I don't really know how to solve.
I need to find every sentence in a string that contains the word BLA and then compare each word in that sentence with my dictionaries of positive and negative words. If a sentence has more negative words than positive ones, a counter should increase by 1. In the end, I would have something like: in file 1, there are 4 negative sentences that include the word BLA.
So far I have used regular expressions to delete all sentences that do not include the word BLA. Then I split each sentence into words and created a list of lists. It looks like this, for example:
[['we', 'underperform', 'because', 'of', 'BLA'], ['BLA', 'is', 'bad'], ['BLA', 'is', 'good']]
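For illustration, the preprocessing described above might look roughly like the sketch below. The sentence-splitting rule, the function name `sentences_with_keyword`, and the sample text are assumptions made for this example, not the code actually used in the thesis.

```python
import re


def sentences_with_keyword(text, keyword="BLA"):
    """Return tokenised sentences that contain `keyword` as a whole word."""
    # Assumption: a sentence ends with '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    result = []
    for sentence in sentences:
        # \w+ keeps runs of word characters and drops punctuation.
        words = re.findall(r"\w+", sentence)
        if keyword in words:
            result.append(words)
    return result


sample = "we underperform because of BLA. BLA is bad. The weather is nice. BLA is good."
print(sentences_with_keyword(sample))
```

A naive split like this will stumble over abbreviations such as "e.g.", so a proper sentence tokenizer may be a better fit for real reports.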
Now I would like to compare every single word with the dictionaries of negative and positive words. Because I need to find out whether each sentence containing the word BLA is rather positive or negative, it is important that the words are counted within one inner list at a time before moving on to the next one.
The result should be 2 for this particular example as 2 sentences are negative and one is positive.
In other cases where I only look for e.g. negative words within the text, I do it this way:
# Reset the number of negative words to zero
negative_count = 0
# For each negative word, count the number of occurrences
for j in range(len(negative_words)):
    negative_count = negative_count + text_devided.count(negative_words[j])
So I would probably do the same thing, but inside a loop that goes over the inner lists.
If you have an idea for approaching this problem differently, I am also open to that.
Solution
I assume that by your "dictionaries" of negative and positive words you mean Python lists.
If so, I would do it like this:
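list_with_sentences = [['we', 'underperform', 'because', 'of', 'BLA'], ['BLA', 'is', 'bad'], ['BLA', 'is', 'good']]
# Placeholder word lists for this example; substitute your own dictionaries
dictonary_pos_word = ['good']
dictonary_neg_word = ['underperform', 'bad']
total_neg_count = 0
for sentence in list_with_sentences:
    # Reset the counters for every sentence so each one is judged on its own
    pos_words = 0
    neg_words = 0
    for word in sentence:
        if word in dictonary_pos_word:
            pos_words = pos_words + 1
        if word in dictonary_neg_word:
            neg_words = neg_words + 1
    # A sentence counts as negative if it has more negative than positive words
    if neg_words > pos_words:
        total_neg_count = total_neg_count + 1
print(total_neg_count)  # 2 for this example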
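If the word lists are long, the per-sentence count can also be wrapped in a function that turns them into sets, so each membership test is a constant-time lookup. This is only a sketch: the function name `count_negative_sentences` and the tiny word lists are made up for the example, and the counters are computed afresh for each sentence.

```python
def count_negative_sentences(sentences, positive_words, negative_words):
    """Count sentences that contain more negative than positive words."""
    positive = set(positive_words)
    negative = set(negative_words)
    negative_sentences = 0
    for sentence in sentences:
        # Both counts are computed per sentence, so nothing carries over.
        pos = sum(1 for word in sentence if word in positive)
        neg = sum(1 for word in sentence if word in negative)
        if neg > pos:
            negative_sentences += 1
    return negative_sentences


sentences = [
    ["we", "underperform", "because", "of", "BLA"],
    ["BLA", "is", "bad"],
    ["BLA", "is", "good"],
]
# Hypothetical word lists; replace them with the real dictionaries.
print(count_negative_sentences(sentences, ["good"], ["underperform", "bad"]))
```

Note that the comparison is case-sensitive, so you may want to lowercase both the sentences and the word lists first if your dictionaries are stored in a different case.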