Iterating through a list of lists and counting matches against a different list

Problem description

I am new to Python and am currently working on sentiment analysis for my Master's thesis. However, there is one problem that I don't really know how to solve.

I need to find every sentence in a string that contains the word BLA and then compare each word in that sentence with my POSITIVE and NEGATIVE word dictionaries. If a sentence contains more negative words than positive ones, a counter should increase by 1. In the end, I would have something like: in file 1, there are 4 negative sentences that include the word BLA.

So far I have used regular expressions to delete all sentences that do not include the word BLA. Then I split the sentences into words and created a list of lists. It looks, for example, like this:

[['we', 'underperform', 'because', 'of', 'BLA'], ['BLA', 'is', 'bad'], ['BLA', 'is', 'good']]
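
A minimal sketch of that preprocessing step, assuming sentences end in ".", "!" or "?" and that words are runs of letters. The function name sentences_with_keyword and the sample text are hypothetical and only illustrate the idea; real text will likely need a more careful sentence splitter.

```python
import re


def sentences_with_keyword(text, keyword="BLA"):
    """Return the sentences that contain `keyword`, each as a list of words."""
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    result = []
    for sentence in sentences:
        # Keep only runs of letters (and apostrophes) as words.
        words = re.findall(r"[A-Za-z']+", sentence)
        # Keep the sentence only if the keyword appears as a whole word.
        if keyword in words:
            result.append(words)
    return result


sample = "We underperform because of BLA. BLA is bad. The weather is nice. BLA is good."
print(sentences_with_keyword(sample))
# [['We', 'underperform', 'because', 'of', 'BLA'], ['BLA', 'is', 'bad'], ['BLA', 'is', 'good']]
```

Note that this sketch keeps the original capitalisation; lowercasing the words (but not the keyword check) may be needed before comparing against lowercase word lists.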

Now I would like to compare every single word with the dictionaries of negative and positive words. Since I need to find out whether each sentence containing the word BLA is rather positive or negative, it is important that I count matches within one inner list at a time before moving on to the next one.

The result should be 2 for this particular example as 2 sentences are negative and one is positive.

In other cases where I only look for e.g. negative words within the text, I do it this way:

# Reset the number of negative words to zero
negative_count = 0

# For each negative word, count the number of occurrences
for word in negative_words:
    negative_count += text_devided.count(word)

So I would probably do the same, but inside a loop that goes over the inner lists.

If you have an idea for approaching this problem differently, I am open to that as well.

Tags: python, list, loops, frequency, sentiment-analysis

Solution


I assume that by your "dictionaries" of negative and positive words you mean plain Python lists of words.
If so, I would do it like this:

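list_with_sentences = [['we', 'underperform', 'because', 'of', 'BLA'], ['BLA', 'is', 'bad'], ['BLA', 'is', 'good']]
# Placeholder word lists for this example; replace them with your own lists
dictonary_pos_word = ['good']
dictonary_neg_word = ['bad', 'underperform']
total_neg_count = 0
for sentence in list_with_sentences:
    # Reset the per-sentence counters before looking at each sentence
    pos_words = 0
    neg_words = 0
    for word in sentence:
        for item in dictonary_pos_word:
            if item == word:
                pos_words = pos_words + 1

        for item in dictonary_neg_word:
            if item == word:
                neg_words = neg_words + 1

    # Compare once per sentence, after all of its words have been counted
    if neg_words > pos_words:
        total_neg_count = total_neg_count + 1

print(total_neg_count)  # 2 for the example above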
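
As an alternative, the per-sentence comparison can also be sketched as a small function that converts the word lists to sets, so each membership test does not have to scan the whole list. The function name count_negative_sentences and the two short word lists are hypothetical, chosen only to match the example sentences; this is one possible way to write it, not the only one.

```python
def count_negative_sentences(sentences, positive_words, negative_words):
    """Count sentences that contain more negative than positive words."""
    # Sets make each "word in ..." check fast, even for large word lists.
    positive = set(positive_words)
    negative = set(negative_words)

    negative_sentences = 0
    for sentence in sentences:
        # Count matches within this one sentence only.
        pos = sum(1 for word in sentence if word in positive)
        neg = sum(1 for word in sentence if word in negative)
        if neg > pos:
            negative_sentences += 1
    return negative_sentences


sentences = [['we', 'underperform', 'because', 'of', 'BLA'],
             ['BLA', 'is', 'bad'],
             ['BLA', 'is', 'good']]
positive_words = ['good', 'great']        # hypothetical example list
negative_words = ['bad', 'underperform']  # hypothetical example list

print(count_negative_sentences(sentences, positive_words, negative_words))  # prints 2
```

Because the counters are created inside the loop over sentences, matches from one sentence never carry over into the next, which is the behaviour the question asks for.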
