python - Python3 - Incrementing counts inside a text file
Question
I have the text file below, where each line has the structure "word count":
product 5
order 4
tracking 1
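This means that the word product was found 5 times in the input document.
I have a script named WordFrequency.py that finds words and the number of times they occur in the input file: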
import re
from collections import Counter

def count_words(file_path):
    with open("/Users/oliverbusk/Sites/Sandbox/storage/app/" + file_path, 'r', encoding="utf-8") as f:
        matches = re.findall(r'\b[a-zA-Z]{3,}\b', f.read())
        wordcount = Counter(matches)
        for word in wordcount:
            string = word + " " + str(wordcount[word])
            write_to_file(string)

def write_to_file(word):
    with open("/Dictionaries/eng.txt", "a+") as f:
        f.write(word + "\n")
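So basically, the code above reads the input file file_path and adds the words and their counts to eng.txt.
However, every time I run it, the results are appended to the eng.txt file again, for example: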
product 5
order 4
tracking 1
product 5
order 4
tracking 1
Instead, if a word already exists in eng.txt, I would like its count to be incremented.
Solution
One approach is to read the existing contents of the file first and then increment the counts.
For example:
import re
from collections import Counter, defaultdict

def count_words(file_path):
    # Read the existing counts from eng.txt
    data = defaultdict(int)
    with open("/Dictionaries/eng.txt", "r") as f:
        for line in f:
            key, value = line.strip().split()
            data[key] = int(value)

    # Count the words in the input file
    with open("/Users/oliverbusk/Sites/Sandbox/storage/app/" + file_path, 'r', encoding="utf-8") as f:
        matches = re.findall(r'\b[a-zA-Z]{3,}\b', f.read())
    wordcount = Counter(matches)

    # Add the new counts to the existing ones
    for word, count in wordcount.items():
        data[word] += count

    # Write the merged counts back, overwriting the file ("w" instead of "a+")
    write_to_file(data)

def write_to_file(data):
    with open("/Dictionaries/eng.txt", "w") as f:
        for word, count in data.items():
            f.write(word + " " + str(count) + "\n")
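The same read, merge, and rewrite idea can also be sketched with Counter alone. This is only an illustration, not the answer's code: the helper names load_counts and merge_counts, and the choice to pass both paths as arguments instead of hard-coding them, are assumptions made here. The sketch also tolerates a missing eng.txt on the first run, and it sums duplicate lines left behind by earlier appended runs.

```python
import re
from collections import Counter
from pathlib import Path


def load_counts(dict_path):
    """Read 'word count' lines into a Counter (hypothetical helper).

    A missing file yields an empty Counter, so the first run works too.
    """
    counts = Counter()
    path = Path(dict_path)
    if not path.exists():
        return counts
    with path.open("r", encoding="utf-8") as f:
        for line in f:
            parts = line.split()
            if len(parts) != 2:
                continue  # skip blank lines and lines without exactly two fields
            word, value = parts
            counts[word] += int(value)  # duplicate lines for a word are summed
    return counts


def merge_counts(dict_path, input_path):
    """Add the word counts of input_path to those stored in dict_path."""
    counts = load_counts(dict_path)
    text = Path(input_path).read_text(encoding="utf-8")
    # Counter.update with an iterable adds one per occurrence of each word
    counts.update(re.findall(r"\b[a-zA-Z]{3,}\b", text))
    # Mode "w" rewrites the whole file instead of appending to it
    with open(dict_path, "w", encoding="utf-8") as f:
        for word, count in counts.items():
            f.write(f"{word} {count}\n")
    return counts
```

Calling merge_counts twice with the same input file should double each stored count and leave one line per word, rather than adding a second set of lines.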