Counting non-article words

Problem description

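I am trying to get the countnonarticlewords function to count the number of non-article words in the mobydick.txt text file. The words I do not want to count are "a", "an", and "the". Any tips on how to modify countnonarticlewords so that the program prints Example #1? At the moment it prints Example #2, where the non-article word count is the same as the line count.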

Example #1 (expected output)

******* mobydick.txt ******* 
Total Lines: 15604
Total Chars: 512293
Total Words: 115314
**Total Non-Article Words: 105479**

Example #2 (current output)

 Enter the test file name: mobydick.txt
*****mobydick.txt*****
Total Lines: 15604
Total Chars: 512293
Total Words: 115314
**Total Non-Article Words: 15604**  

What I have so far:

def getTextFile():
    filename=input("Enter the test file name: ")
    textFile=open(filename, 'r')
    return filename,textFile

def outputcountresults(filename, linecount, charcount, wordcount, nonarticlewordcount):
    print("*****{}*****".format(filename))
    print("Total Lines: {}".format(linecount))
    print("Total Chars: {}".format(charcount))
    print("Total Words: {}".format(wordcount))
    print("Total Non-Article Words: {}".format(nonarticlewordcount))

def countcharacters(line):
    charcount=0
    for c in line:
        if not c.isspace():
            charcount= charcount +1
    return charcount

def countnonarticlewords(line):
    nonarticlewords= line.split()
    nonarticlewords=0
    for nonarticle in line:
        #nonarticlewords= line.split()
        if not 'a' or 'an' or 'the':
            nonarticlewords= nonarticlewords +1
    return len(nonarticle)

def countwords(line):
    words= line.split()
    return len(words)

def countdocstats(docFile):
    linecount=0
    totalcharacters=0
    totalwords=0
    totalnonarticlewords=0
    for line in docFile:
        linecount= linecount + 1
        totalwords= totalwords + countwords(line)
        totalcharacters=totalcharacters+countcharacters(line)
        totalnonarticlewords= totalnonarticlewords + countnonarticlewords(line)
    return linecount, totalcharacters, totalwords, totalnonarticlewords

def main():
    filename, textFile=getTextFile()
    linecount, totalcharacters, totalwords, totalnonarticlewords= countdocstats(textFile)

    outputcountresults(filename,linecount,totalcharacters,totalwords,totalnonarticlewords)
main()

Tags: python, python-3.x

Solution
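Two things in countnonarticlewords appear to explain why the result equals the line count. First, the loop "for nonarticle in line" walks over the characters of the line, not its words, and the function returns len(nonarticle), which is the length of the last character (normally the newline), so every line contributes exactly 1. Second, the condition "not 'a' or 'an' or 'the'" never looks at the current word: Python parses it as (not 'a') or 'an' or 'the', which evaluates to the non-empty string 'an' and is therefore always truthy. The short demonstration below shows both effects; buggy_count is only a renamed copy of the function from the question, used here for illustration.

```python
# The condition from the question parses as ((not 'a') or 'an') or 'the'.
# (not 'a') is False, so the expression evaluates to the non-empty string 'an',
# which is truthy no matter which word is being examined.
condition = not 'a' or 'an' or 'the'
print(repr(condition))  # 'an'


def buggy_count(line):
    # Copy of countnonarticlewords from the question, renamed for illustration.
    nonarticlewords = line.split()
    nonarticlewords = 0
    for nonarticle in line:              # iterates over characters, not words
        if not 'a' or 'an' or 'the':     # always truthy
            nonarticlewords = nonarticlewords + 1
    return len(nonarticle)               # length of the last character, i.e. 1


print(buggy_count("Call me Ishmael.\n"))  # 1
```

Since each line returns 1, summing over the 15604 lines of the file gives 15604, which matches Example #2.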

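One possible fix is to loop over line.split(), test each word for membership in a set of articles, and return the counter. The sketch below makes two assumptions that the question does not settle: the comparison is case-insensitive (so "The" is also skipped), and words are compared exactly as split on whitespace, so a token with attached punctuation such as "the," would still be counted. The ARTICLES name is introduced here for illustration. Whether this reproduces the exact figure in Example #1 depends on those choices and has not been checked against mobydick.txt.

```python
ARTICLES = {"a", "an", "the"}  # words that should not be counted


def countnonarticlewords(line):
    """Return the number of whitespace-separated words in line that are not articles."""
    count = 0
    for word in line.split():
        # Assumption: case-insensitive comparison, so "The" is treated as an article.
        if word.lower() not in ARTICLES:
            count += 1
    return count


print(countnonarticlewords("The whale is a mammal and an animal\n"))  # 5
```

This version keeps the same name and signature, so it can replace the original function without changes to countdocstats or main. If the expected total turns out to be case-sensitive, dropping the .lower() call is the only change needed.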