How to read file and save hyphen using STL C++

Question

I have to read a text file, convert it to lower case, and remove non-alphabetic characters, but I also need to keep hyphens without counting a hyphen on its own as a word. Here is my code. It counts a standalone hyphen as a word in UnknownWords. I want to keep hyphens, but count only the words to the left and right of a hyphen in the .txt file.

My output:

110 Known words read
79 Unknown words read //it is because it is counting hyphen as word

Desired output is:

110 Known words read
78 Unknown words read   

Code:

void WordStats::ReadTxtFile(){
    std::ifstream ifile(Filename);
    if(!ifile)
    {
        std::cerr << "Error Opening file " << Filename << std::endl;
        exit(1);
    }
    for (std::string word; ifile >> word; )
    {

        std::transform(word.begin(), word.end(), word.begin(), ::tolower);
        word.erase(std::remove_if(word.begin(), word.end(), [](char c)
        {
            return (c < 'a' || c > 'z') && c != '\'' && c != '-';
        }),  word.end());
        if (Dictionary.count(word))
        {
            KnownWords[word].push_back(ifile.tellg());
        }
        else
        {
            UnknownWords[word].push_back(ifile.tellg()); 
        }
    }
    //  std::string word; ifile >> word;


    std::cout << KnownWords.size() << " known words read." << std::endl;
    std::cout << UnknownWords.size() << " unknown words read." << std::endl;
}

Tags: c++, vector, stl, ifstream, read-write

Answer


If you don't want a lone "-" to be stored as a word, check for it before adding the word to either word container:

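for (std::string word; ifile >> word; )
{
    // Lowercase the token (needs <cctype>); taking unsigned char avoids
    // undefined behavior in std::tolower for negative char values
    std::transform(word.begin(), word.end(), word.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    // Drop everything except a-z, apostrophes and hyphens
    word.erase(std::remove_if(word.begin(), word.end(), [](char c)
    {
        return (c < 'a' || c > 'z') && c != '\'' && c != '-';
    }),  word.end());
    // Ignore a word that's only hyphens (this also skips a word left empty by the filter)
    if (word.find_first_not_of('-') == std::string::npos) {
        continue;
    }
    if (Dictionary.count(word))
    {
        KnownWords[word].push_back(ifile.tellg());
    }
    else
    {
        UnknownWords[word].push_back(ifile.tellg());
    }
}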
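
To try the hyphen check outside the WordStats class, here is a minimal, self-contained sketch. The names NormalizeWord, IsOnlyHyphens, WordCounts and CountWords are hypothetical helpers made up for this illustration and are not part of the original code. It also assumes the dictionary is a std::set<std::string> and counts distinct words only, instead of recording stream positions as the original does.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <istream>
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper: lowercase a token and keep only a-z, apostrophes and hyphens.
std::string NormalizeWord(std::string word)
{
    std::transform(word.begin(), word.end(), word.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    word.erase(std::remove_if(word.begin(), word.end(), [](char c)
    {
        return (c < 'a' || c > 'z') && c != '\'' && c != '-';
    }), word.end());
    return word;
}

// Hypothetical helper: true when the token holds nothing but hyphens.
// An empty string also returns true, so tokens emptied by the filter are skipped too.
bool IsOnlyHyphens(const std::string& word)
{
    return word.find_first_not_of('-') == std::string::npos;
}

// Hypothetical result type: number of distinct known and unknown words.
struct WordCounts
{
    std::size_t known;
    std::size_t unknown;
};

// Hypothetical stand-in for the reading loop: works on any input stream and
// assumes the dictionary is a std::set<std::string>.
WordCounts CountWords(std::istream& in, const std::set<std::string>& dictionary)
{
    std::set<std::string> known;
    std::set<std::string> unknown;
    for (std::string word; in >> word; )
    {
        word = NormalizeWord(word);
        if (IsOnlyHyphens(word))
        {
            continue; // a lone "-" (or an emptied token) is not a word
        }
        if (dictionary.count(word))
        {
            known.insert(word);
        }
        else
        {
            unknown.insert(word);
        }
    }
    return WordCounts{known.size(), unknown.size()};
}
```

Feeding it a std::istringstream such as "The cat - a well-known cat - sat." with a dictionary of "the", "cat" and "a" keeps "well-known" as a single unknown word, while the two standalone hyphens are dropped before they reach either container.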
