python - How to count the occurrences of each word in sentences for each sentence score?
Problem description
I have a user survey file:
Score Comment
8 Rapid bureaucratic affairs. Reports for policy...
4 There needs to be communication or feed back f...
7 service is satisfactory
5 Good
5 There is no
10 My main reason for the product is competition ...
9 Because I have not received the results. And m...
5 no reason
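I want to determine which keywords correspond to higher scores and which correspond to lower scores.
My idea is to build a word table (or a "word vector" dictionary) that stores, for each word, the scores associated with it and the number of times each score occurred with a sentence containing that word.
Something like the following: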
Word Score Count
Word1: 7 1
4 2
Word2: 5 1
9 1
3 2
2 1
Word3: 9 3
Word4: 8 1
9 1
4 2
... ... ...
Then, for each word, the average score is the mean of all the scores associated with that word.
To do this, my code is as follows:
word_vec = {}
# col 1 is the word, col 2 is the score, col 3 is the number of times it occurs
for i in range(len(data)):
    sentence = data['SurveyResponse'][i].split(' ')
    for word in sentence:
        word_vec['word'] = word
        if word in word_vec:
            word_vec[word] = {'Score':data['SCORE'][i], 'NumberOfTimes':(word_vec[word]['NumberOfTimes'] += 1)}
        else:
            word_vec[word] = {'Score':data['SCORE'][i], 'NumberOfTimes':1}
But this code gives me the following error:
File "<ipython-input-144-14b3edc8cbd4>", line 9
word_vec[word] = {'Score':data['SCORE'][i], 'NumberOfTimes':(word_vec[word]['NumberOfTimes'] += 1)}
^
SyntaxError: invalid syntax
Can someone tell me the correct way to do this?
Solution
Try this code:
word_vec = {}
# For each word, store the running total of scores and the number of occurrences
for i in range(len(data)):
    # split() with no argument also handles repeated spaces
    sentence = data['SurveyResponse'][i].split()
    for word in sentence:
        if word in word_vec:
            # Keep accumulating the total score for each word; this makes
            # it easy to compute the average score later on
            word_vec[word]['Score'] += data['SCORE'][i]
            word_vec[word]['NumberOfTimes'] += 1
        else:
            word_vec[word] = {'Score': data['SCORE'][i], 'NumberOfTimes': 1}
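The SyntaxError occurs because += is a statement in Python, not an expression, so it cannot appear inside a dict literal. To increment "NumberOfTimes", update it in place on its own line: word_vec[word]['NumberOfTimes'] += 1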
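The question's end goal is an average score per word, so here is a minimal, self-contained sketch of that last step. It works on plain (score, comment) pairs instead of the asker's DataFrame; the function name word_score_stats, the key names, and the lowercasing of words are assumptions made for illustration, not part of the original answer. The sample rows are four of the complete comments from the survey excerpt above.

```python
from collections import defaultdict


def word_score_stats(rows):
    """Return {word: {'total': ..., 'count': ..., 'average': ...}}.

    rows is an iterable of (score, comment) pairs. Every occurrence of a
    word is counted, so a word repeated within one comment adds that
    comment's score once per occurrence.
    """
    totals = defaultdict(lambda: {"total": 0, "count": 0})
    for score, comment in rows:
        # Assumption: treat "Good" and "good" as the same keyword
        for word in comment.lower().split():
            totals[word]["total"] += score
            totals[word]["count"] += 1
    # Derive the average once all totals have been accumulated
    return {
        word: {**t, "average": t["total"] / t["count"]}
        for word, t in totals.items()
    }


rows = [
    (7, "service is satisfactory"),
    (5, "Good"),
    (5, "There is no"),
    (5, "no reason"),
]
stats = word_score_stats(rows)
print(stats["no"])  # {'total': 10, 'count': 2, 'average': 5.0}
print(stats["is"])  # {'total': 12, 'count': 2, 'average': 6.0}
```

With a DataFrame like the asker's, the same function could probably be fed with zip(data['SCORE'], data['SurveyResponse']), assuming those column names hold the scores and the comments.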