Extracting a particular data from a CSV file in c++

Problem description

I have written a program to read a CSV file, but I'm having trouble extracting data from it in C++. I want to count the number of columns from the 5th column of the 1st row through the last column of that row. I have written the following code to read the CSV file, but I am not sure how to count the columns as described. I would appreciate it if anyone could tell me how to go about it.

#include <cstring>
#include <fstream>
using namespace std;

// Returns a newly allocated copy of source[startIndex..endIndex]; the caller must delete[] it
char* substring(char* source, int startIndex, int endIndex)
{
    int size = endIndex - startIndex + 1;
    char* s = new char[size+1];
    strncpy(s, source + startIndex, size); //you can read the documentation of strncpy online
    s[size]  = '\0'; //make it null-terminated
    return s;
}

// Reads every line of the file into a newly allocated array of C strings
char** readCSV(const char* csvFileName, int& csvLineCount)
{
    ifstream fin(csvFileName);
    if (!fin)
    {
        return nullptr;
    }
    // First pass: count the lines
    csvLineCount = 0;
    char line[1024];
    while(fin.getline(line, 1024))
    {
        csvLineCount++;
    }
    char **lines = new char*[csvLineCount];
    // Second pass: rewind and copy each line
    fin.clear();
    fin.seekg(0, ios::beg);
    for (int i=0; i<csvLineCount; i++)
    {
        fin.getline(line, 1024);
        lines[i] = new char[strlen(line)+1];
        strcpy(lines[i], line);
    }
    fin.close();
    return lines;
}

I have attached a few lines from the CSV file:-

Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20, ,Afghanistan,33.0,65.0,0,0,0,0,0,0,0, ,Albania,41.1533,20.1683,0,0,0,0

What I need is, in the 1st row, the number of dates after Long.

Tags: c++, csv

Solution


To answer your question:

I have attached a few lines from the CSV file:- Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20, ,Afghanistan,33.0,65.0,0,0,0,0,0,0,0, ,Albania,41.1533,20.1683,0,0,0,0

What I need is, in the 1st row, the number of dates after Long.

Yeah, not that difficult - that's how I would do it:

#include <iostream>
#include <string>
#include <fstream>
#include <regex>

#define FILENAME "test.csv" // Your filename as a macro
// (the preprocessor replaces FILENAME with "test.csv" before compilation)

void read()
{
    std::string n;

    // date format pattern m/dd/yy
    std::regex pattern1("\\b\\d{1}[/]\\d{2}[/]\\d{2}\\b");
    // date format pattern mm/dd/yy
    std::regex pattern2("\\b\\d{2}[/]\\d{2}[/]\\d{2}\\b");
    std::smatch result1, result2;

    std::ifstream file(FILENAME, std::ios::in);
    if (!file.is_open())
    {
        std::cout << "Could not open file!" << '\n';
        return; // nothing to read
    }

    // Read one comma-separated field at a time; the loop ends when nothing more can be read.
    // https://en.cppreference.com/w/cpp/string/basic_string/getline
    while (std::getline(file, n, ','))
    {
        // str() without an argument returns the whole match (the patterns have no capture groups)
        if (std::regex_search(n, result1, pattern1))
            std::cout << result1.str() << std::endl;
        if (std::regex_search(n, result2, pattern2))
            std::cout << result2.str() << std::endl;
    }
    file.close();
}

int main()
{
    read();
    return 0;
}

The file test.csv contains the following for testing:

    Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20, ,Afghanistan,33.0,65.0,0,0,0,0,0,0,0, ,Albania,41.1533,20.1683,0,0,0,0 
    Province/State,Country/Region,Lat,Long,1/25/20,12/26/20,1/27/20, ,Bfghanistan,33.0,65.0,0,0,0,0,0,0,0, ,Blbania,41.1533,20.1683,0,0,0,0 

It actually is pretty simple:

  1. getline takes the open file and stops reading at a so-called delimiter character, in your case a comma ','. (That is the best way I have found to read CSV. You can replace the delimiter with any single character you want, for example ';' or ' '. I guess you get the drill.)

  2. After this you have all the data nicely separated, one field after another, without the commas.

  3. Now you can "filter" out what you need. I use regex, but use whatever you want. (Just FYI: for questions tagged c++ you shouldn't use C-style functions like strncpy.)

  4. I gave you one pattern for dates like 1/23/20 (m/dd/yy) and a second one for dates with a two-digit month, for example November or December dates like 12/22/20 (mm/dd/yy). Splitting them over 2 lines makes the regex patterns easier to read and understand.

You may have to expand the regex patterns if other data in the file happens to match your date format. Regex syntax is well explained elsewhere and is not as complicated as it looks.

  5. From that point you can put all the printed values in, for example, a vector (a more convenient kind of array) to handle and/or pass/return the data. That's up to you.
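As a rough sketch of that last step, applied to the count the question asks for: the header row can be read once, split on commas, and everything after the first four columns (Province/State, Country/Region, Lat, Long) collected into a vector whose size is then the number of date columns. The helper names `split_csv_line` and `header_columns_after` are made up for this illustration, and the sketch assumes the header really is the first line of the file and that no field contains a quoted comma.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split one CSV line on commas.
// Assumption: fields are not quoted and never contain a comma themselves.
std::vector<std::string> split_csv_line(const std::string& line)
{
    std::vector<std::string> fields;
    std::istringstream row(line);
    std::string field;
    while (std::getline(row, field, ','))
        fields.push_back(field);
    return fields;
}

// Hypothetical helper: read the first line of `in` and return the fields
// that come after the first `skip` columns (4 = everything after "Long").
std::vector<std::string> header_columns_after(std::istream& in, std::size_t skip = 4)
{
    std::string header;
    if (!std::getline(in, header))
        return {};                      // empty input: nothing to count

    // Drop a trailing '\r' in case the file uses Windows line endings
    if (!header.empty() && header.back() == '\r')
        header.pop_back();

    std::vector<std::string> fields = split_csv_line(header);
    if (fields.size() <= skip)
        return {};                      // fewer columns than expected

    return std::vector<std::string>(
        fields.begin() + static_cast<std::ptrdiff_t>(skip), fields.end());
}
```

With a real file, one would pass a `std::ifstream` opened on the CSV; `header_columns_after(file).size()` is then the number of columns from the 5th to the last in the 1st row.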

If you need more explaining I am happy to help you out and/or expand this example, just leave a comment.
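On expanding the regex pattern: one possible variation, not part of the code above, is to merge the m/dd/yy and mm/dd/yy patterns into a single pattern that accepts one or two digits for both month and day, and to use `std::regex_match` so that the whole field has to be a date. The function name `looks_like_date` is hypothetical, and the sketch only checks the shape of the field, not whether the month and day are in a valid range.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if the whole field has the shape m/d/yy,
// with one or two digits for month and day and exactly two for the year.
// It does not validate ranges, so "99/99/99" is also accepted.
bool looks_like_date(const std::string& field)
{
    static const std::regex date_pattern(R"(\d{1,2}/\d{1,2}/\d{2})");
    // regex_match succeeds only if the entire string matches the pattern
    return std::regex_match(field, date_pattern);
}
```

Because `std::regex_match` must consume the entire string, a field with surrounding whitespace or a trailing newline would be rejected; such fields would need trimming first.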

