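python - How do I determine the frequency of items read from a txt file and print each item's name along with the number of times it appears?
Problem description
I'm writing a small program that reads from a text file containing many items we bought at the grocery store. The program is part of a larger application in which I integrate Python and C++, but for simplicity I've isolated this part of the application, since it seems to be where the problem is.
The problem is that the first item in the text file (Spinach) appears 5 times in the txt file, but the program prints some garbage data, then Spinach, then 1 as the number of times the word Spinach appears in the file. It should be 5. In the list of items you can also see Spinach printed a second time, this time with the number 4 as its count. But the word Spinach should be printed only once, with the number 5 as the number of times it appears in the txt file. For example, Spinach - 5. See the image below.
I'm not sure whether the problem lies in the freq = {} dictionary. Could someone please help me find what is causing the problem? Please be specific, as I'm just learning Python. See the code of the .py file below and the list of items in the .txt file.
Thanks in advance for your help.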
application.py
def wordFrequency(item):  # This function is called by WordFrequency; it takes one argument, passed in from cpp
    count = 0  # this variable is used to count the frequency of the list item
    with open('items.txt') as myfile:  # opening file
        lines = myfile.readlines()  # reading all the lines of the file
        for line in lines:
            if line.strip("\n") == item:  # removing the \n from the end
                count + 1
    myfile.close()
    return count

# Display only
def displayWordFrequency():
    with open('items.txt') as myfile:  # opening file
        lines = myfile.readlines()
    freq = {}  # using a dictionary to store the count of each item
    for line in lines:
        if line.strip("\n") in freq:  # if the item is already present, increment it; otherwise set it to one
            freq[line.strip("\n")] += 1  # strip to remove the trailing \n
        else:
            freq[line.strip("\n")] = 1
    for key, value in freq.items():  # loops through the dictionary and prints the values
        print(f"{key} - {value}")  # key is the string and value is the integer
    myfile.close()

print(displayWordFrequency())
items.txt
Spinach
Radishes
Broccoli
Peas
Cranberries
Broccoli
Potatoes
Cucumbers
Radishes
Cranberries
Peaches
Zucchini
Potatoes
Cranberries
Cantaloupe
Beets
Cauliflower
Cranberries
Peas
Zucchini
Peas
Onions
Potatoes
Cauliflower
Spinach
Radishes
Onions
Zucchini
Cranberries
Peaches
Yams
Zucchini
Apples
Cucumbers
Broccoli
Cranberries
Beets
Peas
Cauliflower
Potatoes
Cauliflower
Celery
Cranberries
Limes
Cranberries
Broccoli
Spinach
Broccoli
Garlic
Cauliflower
Pumpkins
Celery
Peas
Potatoes
Yams
Zucchini
Cranberries
Cantaloupe
Zucchini
Pumpkins
Cauliflower
Yams
Pears
Peaches
Apples
Zucchini
Cranberries
Zucchini
Garlic
Broccoli
Garlic
Onions
Spinach
Cucumbers
Cucumbers
Garlic
Spinach
Peaches
Cucumbers
Broccoli
Zucchini
Peas
Celery
Cucumbers
Celery
Yams
Garlic
Cucumbers
Peas
Beets
Yams
Peas
Apples
Peaches
Garlic
Celery
Garlic
Cucumbers
Garlic
Apples
Celery
Zucchini
Cucumbers
Onions
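Solution
You can do this with a dictionary comprehension, looping over a set of the data to remove duplicates. To keep the order, you have to refer back to the original list.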
# see question for full list
s = """Spinach
Radishes
Broccoli
Peas
Cranberries
Broccoli
Potatoes
Cucumbers
...
Celery
Zucchini
Cucumbers
Onions"""
s = s.split('\n') # get the data as list
s_dict = {k: s.count(k) for k in set(s)}
original_indices = sorted(map(s.index, set(s)))
print('\n'.join(' - '.join((s[i], str(s_dict[s[i]]))) for i in original_indices))
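The same idea can be written without the index lookup. This is only a minimal sketch, assuming Python 3.7 or later (where a dict keeps insertion order); the short `items` list is a made-up sample standing in for the full data from the question.

```python
# Hypothetical sample; the real list comes from items.txt in the question.
items = ["Spinach", "Radishes", "Broccoli", "Spinach", "Peas", "Broccoli", "Spinach"]

# dict.fromkeys drops duplicates while keeping first-seen order (Python 3.7+).
unique_in_order = list(dict.fromkeys(items))

# Count each distinct item once; list.count scans the whole list each time.
counts = {name: items.count(name) for name in unique_in_order}

for name, n in counts.items():
    print(f"{name} - {n}")
```

Because `list.count` rescans the list for every distinct item, this stays simple but is quadratic in the worst case, which is fine for a file of this size.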
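Edit
If you are working with a dictionary and order matters, it is better to use the implementation from the standard library's collections module.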
import collections

# s is the list defined as above
d = collections.OrderedDict()
for i in s:
    if i in d:
        d[i] += 1
    else:
        d[i] = 1

for k, v in d.items():
    print(k, '-', v)
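As an alternative to the manual loop, `collections.Counter` from the same module does the counting in one call. This is a sketch rather than part of the original answer; it assumes Python 3.7 or later, where a `Counter` (a dict subclass) iterates in first-seen order. The `items` list is a made-up sample.

```python
from collections import Counter

# Hypothetical sample; the real list comes from items.txt in the question.
items = ["Spinach", "Radishes", "Broccoli", "Spinach", "Peas", "Broccoli", "Spinach"]

# Counter tallies each element in a single pass over the list.
counts = Counter(items)

for name, n in counts.items():
    print(name, "-", n)
```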
Output
Spinach - 5
Radishes - 3
Broccoli - 7
Peas - 8
Cranberries - 10
Potatoes - 5
Cucumbers - 9
Peaches - 5
Zucchini - 10
Cantaloupe - 2
Beets - 3
Cauliflower - 6
Onions - 4
Yams - 5
Apples - 4
Celery - 6
Limes - 1
Garlic - 8
Pumpkins - 2
Pears - 1
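The code above counts from an in-memory string, so it does not show where the "garbage" before the first Spinach in the question comes from. One possible cause, which the source does not confirm, is a UTF-8 byte order mark (BOM) at the start of the file: the first line then reads as a different string from the other four Spinach lines, which would give the 1 and 4 split described. The sketch below reads a file under that assumption; `count_items` and `items_demo.txt` are hypothetical names used only for illustration.

```python
import os
import tempfile
from collections import Counter


def count_items(path):
    """Return a Counter of the non-empty lines in the file at `path`."""
    # Assumption: the file may start with a UTF-8 byte order mark (BOM).
    # 'utf-8-sig' strips the BOM if present and otherwise behaves like UTF-8.
    with open(path, encoding="utf-8-sig") as f:
        # strip() also removes stray spaces and '\r' from Windows line endings.
        return Counter(line.strip() for line in f if line.strip())


# Demo: write a small file that begins with a BOM, then count it.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "items_demo.txt")  # hypothetical file name
    with open(demo_path, "w", encoding="utf-8-sig") as f:
        f.write("Spinach\nRadishes\nSpinach\n")
    demo_counts = count_items(demo_path)

for name, n in demo_counts.items():
    print(f"{name} - {n}")
```

For a single item, `count_items(path)[item]` gives its count. Separately, the `count + 1` line in the question's `wordFrequency` computes a value and discards it, so it would need to be `count += 1`, and `print(displayWordFrequency())` prints an extra `None` because the function returns nothing.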