c++ - Extracting a particular data from a CSV file in c++
问题描述
I have written a program to read a CSV file but I'm having some trouble in extracting data from that CSV file in c++. I want to count the no. of columns starting from the 5th column in the 1st row until the last column of the 1st row of the CSV file. I have written the following code to read a CVS file, but I am not sure how shall I count the no. of columns as I have mentioned before. Will appreciate it if anyone could please tell me how shall I go about it?
char* substring(char* source, int startIndex, int endIndex)
{
int size = endIndex - startIndex + 1;
char* s = new char[size+1];
strncpy(s, source + startIndex, size); //you can read the documentation of strncpy online
s[size] = '\0'; //make it null-terminated
return s;
}
char** readCSV(const char* csvFileName, int& csvLineCount)
{
ifstream fin(csvFileName);
if (!fin)
{
return nullptr;
}
csvLineCount = 0;
char line[1024];
while(fin.getline(line, 1024))
{
csvLineCount++;
};
char **lines = new char*[csvLineCount];
fin.clear();
fin.seekg(0, ios::beg);
for (int i=0; i<csvLineCount; i++)
{
fin.getline(line, 1024);
lines[i] = new char[strlen(line)+1];
strcpy(lines[i], line);
};
fin.close();
return lines;
}
I have attached a few lines from the CSV file:-
Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20, ,Afghanistan,33.0,65.0,0,0,0,0,0,0,0, ,Albania,41.1533,20.1683,0,0,0,0
What I need is, in the 1st row, the number of dates after Long.
解决方案
To answer your question:
I have attached a few lines from the CSV file:- Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20, ,Afghanistan,33.0,65.0,0,0,0,0,0,0,0, ,Albania,41.1533,20.1683,0,0,0,0
What I need is, in the 1st row, the number of dates after Long.
Yeah, not that difficult - that's how I would do it:
#include <iostream>
#include <string>
#include <fstream>
#include <regex>
#define FILENAME "test.csv" //Your filename as Macro
//(The compiler just sees text.csv instead of FILENAME)
void read(){
std::string n;
//date format pattern %m/%dd/%YY
std::regex pattern1("\\b\\d{1}[/]\\d{2}[/]\\d{2}\\b");
//date format pattern %mm/%dd/%YY
std::regex pattern2("\\b\\d{2}[/]\\d{2}[/]\\d{2}\\b");
std::smatch result1, result2;
std::ifstream file(FILENAME, std::ios::in);
if ( ! file.is_open() )
{
std::cout << "Could not open file!" << '\n';
}
do{
getline(file,n,',');
//https://en.cppreference.com/w/cpp/string/basic_string/getline
if(std::regex_search(n,result1,pattern1))
std::cout << result1.str(1) << n << std::endl;
if(std::regex_search(n,result2,pattern2))
std::cout << result2.str(1) << n << std::endl;
}
while(!file.eof());
file.close();
}
int main ()
{
read();
return 0;
}
The file test.csv contains the following for testing:
Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20, ,Afghanistan,33.0,65.0,0,0,0,0,0,0,0, ,Albania,41.1533,20.1683,0,0,0,0
Province/State,Country/Region,Lat,Long,1/25/20,12/26/20,1/27/20, ,Bfghanistan,33.0,65.0,0,0,0,0,0,0,0, ,Blbania,41.1533,20.1683,0,0,0,0
It actually is pretty simple:
getline
takes the open file and "escapes" at a so called escape-charachter, in your case a comma ','. (That is the very best way I found in reading csv - you can replace it with whatever you want, for example: ';' or ' ' or '...' - guess you get the drill)After this you got all data nicely separated underneath one another without a comma.
Now you can "filter" out what you need. I use regex - but use what ever you want. (Just fyi: For c++ tagged questions you shouldn't use c-style like strncpy..)
I gave you an example for 1.23.20 (m/dd/yy) and to make it simple if your file contains a november or december like 12.22.20 (mm/dd/yy) to make the regex pattern more easy to read/understand in 2 lines.
you can/may have to expand the regex pattern if the data somehow matches your date format in the file, really good explained here and not as complicated as it looks.
- From that point you can put all the printed stuff f.e. in a vector (some more convenient array) to handle and/or pass/return data - that's up to you.
If you need more explaining I am happy to help you out and/or expand this example, just leave a comment.
推荐阅读
- javascript - jQuery nth-child 相对于前一个选择器
- docker - Kubernetes (Docker for Mac) 上的 Apache Spark 无法解析我的 VPN 主机
- powershell - 在 Win32_Service StartName 报告中重复服务器名称
- jquery - 使用 jquery attr 设置 css 这个 attr 值?
- java - 在 for 循环内部或外部添加条件以获得更清晰的代码更好吗?
- html - 如何覆盖标签 - 引导 css 脚手架.less
- java - JavaFX 的媒体播放器不检索元数据
- python - Python 使用 webbroswer.open() 下载文件
- python - 通过布尔索引将 Pandas DataFrame 条目设置为元组值
- javascript - 尽管有绑定,但异步函数未定义