nlp - Unable to update the VADER lexicon
Problem description
print(news['title'][5])
Magnitude 7.5 quake hits Peru-Ecuador border region - The Hindu
print(analyser.polarity_scores(news['title'][5]))
{'neg':0.0,'neu':1.0,'pos':0.0,'compound':0.0}
import nltk
from nltk.tokenize import word_tokenize, RegexpTokenizer
import pandas as pd
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()
sentence = news['title'][5]
tokenized_sentence = nltk.word_tokenize(sentence)

pos_word_list = []
neu_word_list = []
neg_word_list = []

# Score each token on its own and sort it into one of the three lists
for word in tokenized_sentence:
    if (analyzer.polarity_scores(word)['compound']) >= 0.1:
        pos_word_list.append(word)
    elif (analyzer.polarity_scores(word)['compound']) <= -0.1:
        neg_word_list.append(word)
    else:
        neu_word_list.append(word)

print('Positive:', pos_word_list)
print('Neutral:', neu_word_list)
print('Negative:', neg_word_list)

# Score the full sentence
score = analyzer.polarity_scores(sentence)
print('\nScores:', score)
Positive: []
Neutral: ['Magnitude', '7.5', 'quake', 'hits', 'Peru-Ecuador', 'border', 'region', '-', 'The', 'Hindu']
Negative: []

Scores: {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound': 0.0}
new_words = {
    'Peru-Ecuador': -2.0,
    'quake': -3.4,
}
analyser.lexicon.update(new_words)
print(analyzer.polarity_scores(sentence))
{'neg':0.0,'neu':1.0,'pos':0.0,'compound':0.0}
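Solution
The code you are using is fine apart from one name: you updated the lexicon of analyser instead of analyzer, which is the object you score the sentence with. (I don't know why you didn't get an error; presumably an analyser object was still defined from earlier code, as in your first snippet.)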
new_words = {
    'Peru-Ecuador': -2.0,
    'quake': -3.4,
}
analyzer.lexicon.update(new_words)
print(analyzer.polarity_scores(sentence))
Output:
{'neg': 0.355, 'neu': 0.645, 'pos': 0.0, 'compound': -0.6597}
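To see why the update on the wrong name had no effect, here is a minimal sketch with a toy stand-in class. ToyAnalyzer and its score method are hypothetical and are not VADER's implementation; the only assumption carried over is that each analyzer object owns its own lexicon dict.

```python
# Toy stand-in for an analyzer: each instance owns its own lexicon dict.
# This is NOT VADER's code, only an illustration of the name mix-up.
class ToyAnalyzer:
    def __init__(self):
        # Every new instance starts from the same default lexicon.
        self.lexicon = {"good": 1.9, "bad": -2.5}

    def score(self, sentence):
        # Sum the valence of every known word; unknown words count as 0.
        return sum(self.lexicon.get(w.lower(), 0.0) for w in sentence.split())


analyser = ToyAnalyzer()  # leftover object from earlier code
analyzer = ToyAnalyzer()  # the object actually used for scoring

sentence = "Magnitude 7.5 quake hits Peru-Ecuador border region"

# Updating the wrong object leaves the scoring object untouched.
analyser.lexicon.update({"quake": -3.4})
score_after_wrong_update = analyzer.score(sentence)

# Updating the object that is used for scoring changes the result.
analyzer.lexicon.update({"quake": -3.4})
score_after_right_update = analyzer.score(sentence)

print(score_after_wrong_update, score_after_right_update)  # prints: 0.0 -3.4
```

Both objects are valid, so Python raises no error; the update simply lands on an object that is never used for scoring.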
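One more thing to note (I'm not sure whether you made this mistake): don't import the library and create the analyzer again after updating the lexicon, because a newly created analyzer loads the default lexicon and the words you added are lost. The steps should be:
- Import the library and load the lexicon
- Update the lexicon (do not import the library again after this step)
- Compute the sentiment scores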