arrays - Count words in list of strings based on words array and making dictionary from it
Problem description
I have a list of strings as:
string_list=['philadelphia court excessive disappointed court hope','hope jurisdiction obscures acquittal court','mention hope maryland signal held problem internal reform life bolster level grievance']
and a list of words as:
words=['hope','court','mention','maryland']
Now I want to count the occurrences of the listed words within each string and store them in a separate dictionary, with 'doc_(index)' as the key and a nested dictionary as the value, whose keys are the words that occur and whose values are their counts. Expected output:
words_dict={'doc_1':{'court':2,'hope':1},'doc_2':{'court':1,'hope':1},'doc_3':{'mention':1,'hope':1,'maryland':1}}
My first step was:
docs_dict = {}
count = 0
for i in string_list:
    count += 1
    docs_dict['doc_' + str(count)] = i
print(docs_dict)
{'doc_1': 'philadelphia court excessive disappointed court hope', 'doc_2': 'hope jurisdiction obscures acquittal court', 'doc_3': 'mention hope maryland signal held problem internal reform life bolster level grievance'}
After this, I can't figure out how to get the word counts. What I have tried so far:
docs = {}
for k, v in docs_dict.items():
    split_words = v.split()
    for i in words:
        if i in split_words:
            docs[k][i] += 1
        else:
            docs[k][i] = 0
Solution
You can use Python's count method to get the number of times each word occurs in a sentence.
Consider this code:
words_dict = {}
string_list = ['philadelphia court excessive disappointed court hope', 'hope jurisdiction obscures acquittal court', 'mention hope maryland signal held problem internal reform life bolster level grievance']
words_list = ['hope', 'court', 'mention', 'maryland']
for i, sentence in enumerate(string_list, start=1):  # iterate over the string list
    helper = {}  # temporary dictionary for this document
    for word in words_list:  # iterate over the word list
        x = sentence.count(word)  # occurrences of word in the sentence (str.count matches substrings)
        if x > 0:
            helper[word] = x
    words_dict["doc_" + str(i)] = helper  # add the temporary dictionary to the final dictionary
# Print the dictionary contents
for key in words_dict:
    print(key + ": " + str(words_dict[key]))
The output of the above code is (the key order may differ depending on the Python version):
doc_3: {'maryland': 1, 'mention': 1, 'hope': 1}
doc_2: {'court': 1, 'hope': 1}
doc_1: {'court': 2, 'hope': 1}
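One caveat with str.count is that it matches substrings, so 'court' would also be counted inside a word such as 'courtroom'. If whole-word matching is what you want, a minimal sketch of an alternative is to split each string on whitespace and count the tokens with collections.Counter. The function name count_words_per_doc is only a name chosen for this illustration and is not part of the original answer.

```python
from collections import Counter


def count_words_per_doc(string_list, words):
    """Return {'doc_<n>': {word: count}} for the words that occur in each string.

    Hypothetical helper for illustration; it counts whole whitespace-separated
    tokens rather than substrings.
    """
    words_dict = {}
    for index, sentence in enumerate(string_list, start=1):
        # Count every whitespace-separated token in this string once.
        token_counts = Counter(sentence.split())
        # Keep only the requested words that actually occur.
        words_dict["doc_" + str(index)] = {
            word: token_counts[word] for word in words if token_counts[word] > 0
        }
    return words_dict


string_list = [
    'philadelphia court excessive disappointed court hope',
    'hope jurisdiction obscures acquittal court',
    'mention hope maryland signal held problem internal reform life bolster level grievance',
]
words = ['hope', 'court', 'mention', 'maryland']

print(count_words_per_doc(string_list, words))
```

For the data in the question this gives the same counts as the expected words_dict. Building one Counter per string also avoids rescanning the string once for every word in the list.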