How do I iterate over all keys and values in an HDF5 file and determine which ones contain data?

Problem description

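I have stored the results of a model simulation in an HDF5 file (.hdf).

I know how to open the file with the h5py module and browse the data.

The problem is that there are so many nested keys and datasets that finding all of them, and working out which ones actually hold data, is a real pain.

Here is what I am working with so far: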

import h5py
f = h5py.File('results.hdf', 'r')  # open the file read-only

k1 = f.keys()  # keys at the first level

k1
<KeysViewHDF5 ['Event Conditions', 'Geometry', 'Plan Data', 'Results']>

Now, to see all the stored data, I can do the following:

for k1 in f:
    for k2 in f[k1].keys():
        for k3 in f[k1][k2].keys():
            print(f[k1][k2][k3])  

<HDF5 group "/Event Conditions/Unsteady/Boundary Conditions" (2 members)>
<HDF5 group "/Event Conditions/Unsteady/Initial Conditions" (0 members)>
<HDF5 dataset "Attributes": shape (350,), type "|V45">
<HDF5 dataset "Polyline Info": shape (350, 4), type "<i4">
<HDF5 dataset "Polyline Parts": shape (350, 2), type "<i4">
<HDF5 dataset "Polyline Points": shape (3598, 2), type "<f8">
<HDF5 dataset "Attributes": shape (3,), type "|V37">
<HDF5 dataset "Polygon Info": shape (3, 4), type "<i4">
<HDF5 dataset "Polygon Parts": shape (3, 2), type "<i4">
<HDF5 dataset "Polygon Points": shape (344, 2), type "<f8">
<HDF5 dataset "Attributes": shape (1,), type "|V64">
<HDF5 dataset "Cell Info": shape (1, 2), type "<i4">
<HDF5 dataset "Cell Points": shape (586635, 2), type "<f8">
<HDF5 group "/Geometry/2D Flow Areas/Delta" (0 members)>
<HDF5 group "/Geometry/2D Flow Areas/Perimeter 1" (25 members)>
<HDF5 dataset "Polygon Info": shape (1, 4), type "<i4">
<HDF5 dataset "Polygon Parts": shape (1, 2), type "<i4">
<HDF5 dataset "Polygon Points": shape (610, 2), type "<f8">
<HDF5 dataset "Attributes": shape (1,), type "|V60">
<HDF5 dataset "External Faces": shape (177,), type "|V24">
<HDF5 dataset "Polyline Info": shape (1, 4), type "<i4">
<HDF5 dataset "Polyline Parts": shape (1, 2), type "<i4">
<HDF5 dataset "Polyline Points": shape (5, 2), type "<f8">
<HDF5 dataset "TIN Info": shape (347, 4), type "<i4">
<HDF5 dataset "TIN Points": shape (13591, 4), type "<f8">
<HDF5 dataset "TIN Triangles": shape (20008, 3), type "<i4">
<HDF5 dataset "XSIDs": shape (347, 2), type "<i4">
<HDF5 dataset "Attributes": shape (348,), type "|V676">
<HDF5 group "/Geometry/Cross Sections/Flow Distribution" (5 members)>
<HDF5 dataset "Manning's n Info": shape (348, 2), type "<i4">
<HDF5 dataset "Manning's n Values": shape (1044, 2), type "<f4">
<HDF5 dataset "Polyline Info": shape (348, 4), type "<i4">
<HDF5 dataset "Polyline Parts": shape (348, 2), type "<i4">
<HDF5 dataset "Polyline Points": shape (696, 2), type "<f8">
<HDF5 dataset "Station Elevation Info": shape (348, 2), type "<i4">
<HDF5 dataset "Station Elevation Values": shape (151973, 2), type "<f4">
<HDF5 dataset "Attributes": shape (41,), type "|V32">
<HDF5 dataset "Calibration Table": shape (2,), type "|V200">
<HDF5 dataset "Polygon Info": shape (41, 4), type "<i4">
<HDF5 dataset "Polygon Parts": shape (41, 2), type "<i4">
<HDF5 dataset "Polygon Points": shape (45442, 2), type "<f8">
<HDF5 dataset "Polyline Info": shape (2, 4), type "<i4">
<HDF5 dataset "Polyline Parts": shape (2, 2), type "<i4">
<HDF5 dataset "Polyline Points": shape (1768, 2), type "<f8">
<HDF5 dataset "Attributes": shape (1,), type "|V96">
<HDF5 dataset "Polyline Info": shape (1, 4), type "<i4">
<HDF5 dataset "Polyline Parts": shape (1, 2), type "<i4">
<HDF5 dataset "Polyline Points": shape (2042, 2), type "<f8">
<HDF5 dataset "Polyline Info": shape (2, 4), type "<i4">
<HDF5 dataset "Polyline Parts": shape (2, 2), type "<i4">
<HDF5 dataset "Polyline Points": shape (1152, 2), type "<f8">
<HDF5 dataset "Attributes": shape (1,), type "|V253">
<HDF5 dataset "Centerline Info": shape (1, 4), type "<i4">
<HDF5 dataset "Centerline Parts": shape (1, 2), type "<i4">
<HDF5 dataset "Centerline Points": shape (48, 2), type "<f8">
<HDF5 dataset "Profiles": shape (500,), type "|V28">
<HDF5 dataset "Compute Messages (rtf)": shape (1,), type "|S293107">
<HDF5 dataset "Compute Messages (text)": shape (1,), type "|S215682">
<HDF5 dataset "Compute Processes": shape (6,), type "|V332">
<HDF5 group "/Results/Unsteady/Geometry Info" (3 members)>
<HDF5 group "/Results/Unsteady/Output" (1 members)>
<HDF5 group "/Results/Unsteady/Summary" (0 members)>

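But if I keep going like this, first, it gets ridiculous and there is obviously a cleaner way; second, it starts to fail, because some keys only go down a certain number of levels.

I want to know every possible key/path to data in the HDF file, and whether each one contains data (some do not).

Perhaps some kind of loop with try/except to handle the end of a path?

If anyone knows how to do this, please help!

Thanks.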

Tags: python, hdf5, h5py

Solution


The h5py group traversal methods handle this. The documentation link is http://docs.h5py.org/en/latest/high/group.html#Group.visit

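import h5py

def print_attrs(name, obj):
    # name is the object's path relative to the group being visited
    print(name)
    # print every HDF5 attribute attached to this group or dataset
    for key, val in obj.attrs.items():
        print("    %s: %s" % (key, val))

f = h5py.File('foo.hdf5', 'r')
f.visititems(print_attrs)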

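It uses the delegate pattern. You pass in a callable, and h5py calls it with the name and the object for every item in the file. Inside your callable, you can inspect the object and decide what to do with it.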
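To answer the "which ones contain data" part of the question, the callable can check the type of each object it receives. The following is only a minimal sketch: the helper name `summarize` is hypothetical, and it assumes that "contains data" means a dataset with at least one element or a group with at least one member.

```python
import h5py


def summarize(h5file):
    """Return a list of (path, kind, has_data) for every object in h5file.

    `summarize` is an illustrative helper, not part of h5py.
    """
    rows = []

    def visitor(name, obj):
        if isinstance(obj, h5py.Dataset):
            # Assumption: a dataset "has data" if its element count is non-zero.
            # obj.size may be 0 (or None for an empty dataspace).
            rows.append((name, "dataset", bool(obj.size)))
        elif isinstance(obj, h5py.Group):
            # Assumption: a group "has data" if it has at least one member.
            rows.append((name, "group", len(obj) > 0))
        # Returning None tells visititems to keep walking the tree.

    h5file.visititems(visitor)
    return rows


# Usage with the file from the question (left commented out on purpose):
# with h5py.File('results.hdf', 'r') as f:
#     for path, kind, has_data in summarize(f):
#         print(path, kind, "has data" if has_data else "EMPTY")
```

Because visititems walks the tree to whatever depth each branch has, there is no need for nested loops or try/except at the end of a path.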

