Python word count (2 files containing words) (1 file with the words to count) (a final file to write word + count into)

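Problem description

Two txt files containing words (for example, song lyrics).

One txt file containing the words I want to count in those two files.

One txt file to hold each word plus its count.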

file1 = open(r'E:\Users\OneDrive\Desktop\python\file1.txt','r')
file2 = open(r'E:\Users\OneDrive\Desktop\python\file2.txt','r')
file3 = open(r'E:\Users\OneDrive\Desktop\python\words.txt','r')
file4 = open(r'E:\Users\OneDrive\Desktop\python\wordsInFiles.txt','w')

for word in file3:
    word = word.strip("\n")
    counter = 0
    counter2 = 0
    for line in file1:
        line = line.strip("\n")
        words = line.split()
        for w in words:
            w = w.strip()
            if(w == word):
                counter += 1
    file1.seek(0,0)
    for line in file2:
        line = line.strip("\n")
        words = line.split()
        for w in words:
            w = w.strip()
            if(w == word):
                counter2 += 1
    file4.write(word + " " + str(counter) + "\n")
    file4.write(word + " " + str(counter2) + "\n")
    file2.seek(0,0)

file1.close()
file2.close()
file3.close()
file4.close()

It writes each word twice, and the counts are not correct either.

Thanks to anyone who can help.

Tags: python, file, count, word

Solution


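1) Count all the words in each of the input files.

2) Go through the file containing the words you're interested in, and look up each word in the Counter object you got from step 1.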
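
As a minimal illustration of these two steps, the sketch below uses in-memory strings instead of files. The sample lyrics and the word list are made up for illustration. It relies on the fact that a `Counter` returns 0 for a key it has never seen, so no membership check is needed before the lookup.

```python
from collections import Counter

# Hypothetical sample data standing in for the contents of one lyrics file
lyrics = "la la la hello world\nhello again"
words_of_interest = ["hello", "la", "missing"]

# Step 1: count every whitespace-separated word once
word_counts = Counter(lyrics.split())

# Step 2: look up only the words of interest; absent words give 0
result = {word: word_counts[word] for word in words_of_interest}
print(result)
```

Counting each file once and then doing cheap lookups avoids re-reading the input files for every word of interest, which is what the nested loops in the question do.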

from collections import Counter

input_filenames = [
    r"E:\Users\OneDrive\Desktop\python\file1.txt",
    r"E:\Users\OneDrive\Desktop\python\file2.txt",
]
file_with_words_youre_interested_in = r"E:\Users\OneDrive\Desktop\python\words.txt"
output_filename = r"E:\Users\OneDrive\Desktop\python\wordsInFiles.txt"


# A generator that yields all the words in a file one by one
def get_words(filename):
    with open(filename) as f:
        for line in f:
            yield from line.split()


filename_to_word_count = {
    filename: Counter(get_words(filename)) for filename in input_filenames
}

with open(file_with_words_youre_interested_in) as f:
    words_to_count = f.read().splitlines()

with open(output_filename, "w") as f:
    for word_to_count in words_to_count:
        for filename in input_filenames:
            f.write(f"{word_to_count} {filename_to_word_count[filename][word_to_count]}\n")
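
The code above still writes one line per input file for each word, so every word appears twice in the output. If one line per word is preferred, a possible variant is sketched below. The function name `write_word_counts`, the output layout (the word followed by one count per input file), and the lowercasing and punctuation stripping in `normalize` are all assumptions, not part of the original answer. The normalization is there because tokens such as `Hello,` and `hello` would otherwise be counted as different words, which may be one reason the counts looked wrong.

```python
import string
from collections import Counter


def normalize(token):
    # Assumption: case and surrounding punctuation should not matter
    return token.strip().strip(string.punctuation).lower()


def get_words(filename):
    # Yield the normalized words of a file one by one, skipping empty tokens
    with open(filename) as f:
        for line in f:
            for token in line.split():
                word = normalize(token)
                if word:
                    yield word


def write_word_counts(input_filenames, words_filename, output_filename):
    # Count each input file once, up front
    counts = [Counter(get_words(name)) for name in input_filenames]

    # Read the words of interest, one per line
    with open(words_filename) as f:
        words = [normalize(line) for line in f.read().splitlines()]

    with open(output_filename, "w") as out:
        for word in words:
            if not word:
                continue  # skip blank lines in the word list
            # One count per input file, in the order the files were given
            per_file = " ".join(str(c[word]) for c in counts)
            out.write(f"{word} {per_file}\n")
```

Called as `write_word_counts(input_filenames, file_with_words_youre_interested_in, output_filename)`, this writes a single line per word, with the first number taken from the first input file and the second number from the second.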
