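c++ - Can the elements of a dynamic array be stored in separate columns in HDF5?
Problem description
I am trying to store simulation data from C++ in HDF5 (I will analyze the data later in Python + Pandas). My goal is to organize all the data properly in C++, so that later I only need to read it.
My problem is storing a dynamic array in separate columns of HDF5: I am using H5::VarLenType to store the array. That works, but the whole array ends up in a single column, which is not convenient for me: I need each value in its own column.
I can do this if I use a fixed-size array, without the temporary buffer that holds the hvl_t data type. If I use the same approach with the variable-length array (iterating in a loop, computing the offsets by hand and adding the data types), I get garbage data.
Here is my proof of concept, which I will add to my project later.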
#include <stddef.h>
#include <cstring>
#include <string>
#include <sstream>
#include <iostream>
#include "H5Cpp.h"

const int MAX_NAME_LENGTH = 32;
const int N_PLACES = 3;
const int N_ROWS = 3;
const std::string FileName("SimulationResults-test.h5");
const std::string DatasetName("SimulationData");
const std::string member_simulation("Simulation");
const std::string member_iteration("Iteration");
const std::string member_time_elapsed("Time_elapsed");
const std::string member_place_states("States");
const std::string member_fired_transition("Fired_transition");

// Row as the simulation produces it: the states live in a dynamic array.
typedef struct {
    int simulation;
    int iteration;
    double time_elapsed;
    char fired_transition[MAX_NAME_LENGTH];
    int * place_states;
} SimulationData;

// Row as it is handed to HDF5: the dynamic array is wrapped in an hvl_t.
typedef struct {
    int simulation;
    int iteration;
    double time_elapsed;
    char fired_transition[MAX_NAME_LENGTH]; // MAX_NAME_LENGTH
    hvl_t place_states; // N_PLACES
} SimulationData_buffer;

int main(void) {
    // Data to write
    SimulationData states_simulation[N_ROWS];
    SimulationData_buffer states_simulation_buffer[N_ROWS];
    // {
    //   { 1, 0, 0.0, {0, 0, 0}, "T1" },
    //   { 1, 1, 1.0, {0, 1, 0}, "T2" },
    //   { 1, 2, 5.0, {0, 0, 1}, "T1" }
    // };
    for (int i = 0; i < N_ROWS; i++) {
        states_simulation[i].simulation = 1;
        states_simulation[i].iteration = 0;
        states_simulation[i].time_elapsed = 0.0;
        // states_simulation[i].fired_transition = "T1";
        strncpy(states_simulation[i].fired_transition, "T1",
                sizeof(states_simulation[i].fired_transition) - 1);
        states_simulation[i].fired_transition[sizeof(states_simulation[i].fired_transition) - 1] = 0;
        states_simulation[i].place_states = new int[N_PLACES];
        states_simulation[i].place_states[0] = 0;
        states_simulation[i].place_states[1] = 10;
        states_simulation[i].place_states[2] = 20;
    }

    // Number of rows
    hsize_t dim[] = {sizeof(states_simulation) / sizeof(SimulationData)};
    // Rank (number of dimensions) of the dataspace
    int rank = sizeof(dim) / sizeof(hsize_t);

    // Defining the datatype to pass to HDF5. The offsets are taken from
    // SimulationData_buffer, because that is the struct that gets written.
    H5::CompType mtype(sizeof(SimulationData_buffer));
    mtype.insertMember(member_simulation,
                       HOFFSET(SimulationData_buffer, simulation),
                       H5::PredType::NATIVE_INT);
    mtype.insertMember(member_iteration,
                       HOFFSET(SimulationData_buffer, iteration),
                       H5::PredType::NATIVE_INT);
    mtype.insertMember(member_time_elapsed,
                       HOFFSET(SimulationData_buffer, time_elapsed),
                       H5::PredType::NATIVE_DOUBLE);
    mtype.insertMember(member_fired_transition,
                       HOFFSET(SimulationData_buffer, fired_transition),
                       H5::StrType(H5::PredType::C_S1, MAX_NAME_LENGTH));
    auto vlen_id_places = H5::VarLenType(H5::PredType::NATIVE_INT);

    // Set different columns for the array <-------------------------
    // auto offset = HOFFSET(SimulationData_buffer, place_states);
    // for (int i = 0; i < N_PLACES; i++) {
    //     std::stringstream ss;
    //     ss << "Place_" << i+1;
    //     auto new_offset = offset + i*sizeof(int);
    //     std::cout << offset << " -> " << new_offset << std::endl;
    //     mtype.insertMember(ss.str(),
    //                        new_offset,
    //                        H5::PredType::NATIVE_INT);
    // }

    // Set the column as an array <-----------------------------------
    mtype.insertMember("Places", HOFFSET(SimulationData_buffer, place_states), vlen_id_places);

    // Filling buffer
    for (int i = 0; i < N_ROWS; ++i) {
        states_simulation_buffer[i].simulation = states_simulation[i].simulation;
        states_simulation_buffer[i].iteration = states_simulation[i].iteration;
        states_simulation_buffer[i].time_elapsed = states_simulation[i].time_elapsed;
        strncpy(states_simulation_buffer[i].fired_transition,
                states_simulation[i].fired_transition,
                MAX_NAME_LENGTH);
        states_simulation_buffer[i].place_states.len = N_PLACES;
        states_simulation_buffer[i].place_states.p = states_simulation[i].place_states;
    }

    // Preparation of a dataset and a file.
    H5::DataSpace space(rank, dim);
    H5::H5File *file = new H5::H5File(FileName, H5F_ACC_TRUNC);
    H5::DataSet *dataset = new H5::DataSet(file->createDataSet(DatasetName,
                                                               mtype,
                                                               space));
    H5::DataSet *dataset2 = new H5::DataSet(file->createDataSet("Prueba2",
                                                                mtype,
                                                                space));

    // Write
    dataset->write(states_simulation_buffer, mtype);
    dataset2->write(states_simulation_buffer, mtype);

    // Release the datasets, the file and the dynamic arrays
    delete dataset;
    delete dataset2;
    delete file;
    for (int i = 0; i < N_ROWS; ++i) {
        delete[] states_simulation[i].place_states;
    }
    return 0;
}
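It can be compiled with g++ h5-test-dynamic.cpp -lhdf5 -lhdf5_cpp -o h5-test-dynamic.
As mentioned, I need one column per value rather than an array in a single column. I do not understand why it does not work, since I have set the pointer and the offsets of the hvl_t variable correctly. If I uncomment the block that handles the offsets and data types manually and comment out the block right after it, I get garbage values.
This is what I get: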
[(1, 0, 0., b'T1', 3, 0, -971058832),
(1, 0, 0., b'T1', 3, 0, -971058800),
(1, 0, 0., b'T1', 3, 0, -971058768)]
This is the best I can get:
[(1, 0, 0., b'T1', array([ 0, 10, 20], dtype=int32)),
(1, 0, 0., b'T1', array([ 0, 10, 20], dtype=int32)),
(1, 0, 0., b'T1', array([ 0, 10, 20], dtype=int32))]
Solution