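netcdf4 - dask.array.compute() fails with RuntimeError: NetCDF: HDF error
Problem description
I set up a 100-core Dask cluster across 5 nodes through PBS. I then used Xarray's open_mfdataset to read roughly 1000 MODIS (hdf5) tiles. After concatenating the arrays so that all time steps (92 per tile) are joined together, I try to compute the Euclidean distance from one data point q to every other point, and use argtopk to get the 500 smallest. When I call compute on this array of 500 results, I get: RuntimeError: NetCDF: HDF error
I have tried different cluster sizes, and reading the files from both NFS and Lustre.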
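To make the computation described above concrete, here is a toy, pure-Python sketch of the same idea: project each pixel's time series onto random +1/-1 vectors, then find the points whose sketches are closest to a query point. It is only an illustration under assumed names (make_sign_vectors, sketch, k_nearest and the toy sizes are hypothetical); it is not the question's code, which follows, and it does not touch Dask, Xarray, or any files.

```python
import heapq
import math
import random


def make_sign_vectors(n_time, sketch_len, rng):
    """Random projection vectors whose elements are -1 or +1."""
    # Same trick as rv + (rv - 1) in the question: maps 0 -> -1 and 1 -> +1.
    return [[2 * rng.randint(0, 1) - 1 for _ in range(sketch_len)]
            for _ in range(n_time)]


def sketch(series, sign_vectors):
    """Project one time series (length n_time) onto the sign vectors."""
    sketch_len = len(sign_vectors[0])
    return [sum(series[t] * sign_vectors[t][j] for t in range(len(series)))
            for j in range(sketch_len)]


def k_nearest(sketches, query_idx, k):
    """Indices of the k sketches closest to sketches[query_idx].

    The query itself is included (its distance to itself is 0).
    """
    q = sketches[query_idx]
    dists = [math.dist(s, q) for s in sketches]
    return heapq.nsmallest(k, range(len(sketches)), key=dists.__getitem__)


# Toy sizes: 92 time steps and 10 sketch elements as in the question,
# but only 1000 made-up "pixels" instead of tens of millions.
rng = random.Random(0)
n_time, sketch_len, n_pixels = 92, 10, 1000
signs = make_sign_vectors(n_time, sketch_len, rng)
pixels = [[rng.random() for _ in range(n_time)] for _ in range(n_pixels)]
sketches = [sketch(p, signs) for p in pixels]

nearest = k_nearest(sketches, query_idx=123, k=6)
print(nearest)
```

Because the query point is always its own nearest neighbour, asking for k results yields the query plus k - 1 others, which is presumably why the question requests 501 values rather than 500.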
import numpy as np
import xarray as xr
import dask.array as da

# client (a dask.distributed Client connected to the PBS cluster),
# conus_tiles and tiles are defined elsewhere and not shown here.

# create random sketch vectors with elements either + or - one
sketch_len = 10
rv = np.random.randint(2, size=(92, sketch_len))
rv = rv + (rv - 1)  # map 0 -> -1 and 1 -> +1
rv_da = xr.DataArray(rv, dims=['time', 'rv'])

conus_tile_sketches = []
for ct in conus_tiles:
    tile_ts = xr.open_mfdataset(tiles, concat_dim='time', mask_and_scale=False,
                                combine='nested', parallel=True)['500m 16 days NDVI']
    tile_ts = tile_ts.transpose('y', 'x', 'time')
    tile_ts = tile_ts.chunk((100,100,92))
    tile_sketch = tile_ts.dot(rv_da)
    tile_sketch = client.persist(tile_sketch)
    conus_tile_sketches.append(tile_sketch)

flat_sketches = da.concatenate(conus_tile_sketches, axis=1)
flat_sketches = client.persist(flat_sketches)

q = flat_sketches[:, 30123456]
q = q.reshape(10, 1)
dist = da.linalg.norm(flat_sketches - q, axis=0)
dist = client.persist(dist)

# a negative k asks argtopk for the indices of the smallest values
closest_idx = dist.argtopk(-501)
closest_idx = closest_idx.compute()
This should return the values of the closest_idx array. Instead, I get the stack trace below.
Note that my dist dask array is large:
>>> dist
dask.array<pow, shape=(63360000,), dtype=float64, chunksize=(19200,), chunktype=numpy.ndarray>
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home7/jcbecker/.conda/envs/geo/lib/python3.7/site-packages/dask/base.py", line 175, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/home7/jcbecker/.conda/envs/geo/lib/python3.7/site-packages/dask/base.py", line 446, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/home7/jcbecker/.conda/envs/geo/lib/python3.7/site-packages/distributed/client.py", line 2520, in get
    results = self.gather(packed, asynchronous=asynchronous, direct=direct)
  File "/home7/jcbecker/.conda/envs/geo/lib/python3.7/site-packages/distributed/client.py", line 1820, in gather
    asynchronous=asynchronous,
  File "/home7/jcbecker/.conda/envs/geo/lib/python3.7/site-packages/distributed/client.py", line 754, in sync
    self.loop, func, *args, callback_timeout=callback_timeout, **kwargs
  File "/home7/jcbecker/.conda/envs/geo/lib/python3.7/site-packages/distributed/utils.py", line 337, in sync
    raise exc.with_traceback(tb)
  File "/home7/jcbecker/.conda/envs/geo/lib/python3.7/site-packages/distributed/utils.py", line 321, in f
    result[0] = yield future
  File "/home7/jcbecker/.conda/envs/geo/lib/python3.7/site-packages/tornado/gen.py", line 735, in run
    value = future.result()
  File "/home7/jcbecker/.conda/envs/geo/lib/python3.7/site-packages/distributed/client.py", line 1676, in _gather
    raise exception.with_traceback(traceback)
  File "/home7/jcbecker/.conda/envs/geo/lib/python3.7/site-packages/dask/array/core.py", line 108, in getter
    c = np.asarray(c)
  File "/home7/jcbecker/.conda/envs/geo/lib/python3.7/site-packages/numpy/core/_asarray.py", line 85, in asarray
    return array(a, dtype, copy=False, order=order)
  File "/home7/jcbecker/.conda/envs/geo/lib/python3.7/site-packages/xarray/core/indexing.py", line 452, in __array__
    return np.asarray(self.array, dtype=dtype)
  File "/home7/jcbecker/.conda/envs/geo/lib/python3.7/site-packages/numpy/core/_asarray.py", line 85, in asarray
    return array(a, dtype, copy=False, order=order)
  File "/home7/jcbecker/.conda/envs/geo/lib/python3.7/site-packages/xarray/core/indexing.py", line 610, in __array__
    return np.asarray(self.array, dtype=dtype)
  File "/home7/jcbecker/.conda/envs/geo/lib/python3.7/site-packages/numpy/core/_asarray.py", line 85, in asarray
    return array(a, dtype, copy=False, order=order)
  File "/home7/jcbecker/.conda/envs/geo/lib/python3.7/site-packages/xarray/core/indexing.py", line 516, in __array__
    return np.asarray(array[self.key], dtype=None)
  File "/home7/jcbecker/.conda/envs/geo/lib/python3.7/site-packages/xarray/conventions.py", line 42, in __getitem__
    return np.asarray(self.array[key], dtype=self.dtype)
  File "/home7/jcbecker/.conda/envs/geo/lib/python3.7/site-packages/numpy/core/_asarray.py", line 85, in asarray
    return array(a, dtype, copy=False, order=order)
  File "/home7/jcbecker/.conda/envs/geo/lib/python3.7/site-packages/xarray/core/indexing.py", line 516, in __array__
    return np.asarray(array[self.key], dtype=None)
  File "/home7/jcbecker/.conda/envs/geo/lib/python3.7/site-packages/xarray/backends/netCDF4_.py", line 70, in __getitem__
    self._getitem)
  File "/home7/jcbecker/.conda/envs/geo/lib/python3.7/site-packages/xarray/core/indexing.py", line 784, in explicit_indexing_adapter
    result = raw_indexing_method(raw_key.tuple)
  File "/home7/jcbecker/.conda/envs/geo/lib/python3.7/site-packages/xarray/backends/netCDF4_.py", line 81, in _getitem
    array = getitem(original_array, key)
  File "netCDF4/_netCDF4.pyx", line 4351, in netCDF4._netCDF4.Variable.__getitem__
  File "netCDF4/_netCDF4.pyx", line 5296, in netCDF4._netCDF4.Variable._get
  File "netCDF4/_netCDF4.pyx", line 1857, in netCDF4._netCDF4._ensure_nc_success
RuntimeError: NetCDF: HDF error
Solution
Fixed by using a larger chunk size. In the tile-reading loop, replace:
tile_ts = tile_ts.chunk((100,100,92))
with
tile_ts = tile_ts.chunk((1200,1200,92))
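To get a feel for how much this change reduces the number of chunks, the arithmetic can be sketched in plain Python. The tile shape of 2400 x 2400 pixels is an assumption (it is the usual grid for 500 m MODIS tiles), and the count of 11 tiles is inferred from the dist shape in the question (63,360,000 = 11 x 2400 x 2400); neither is stated outright in the post, and chunk_grid is a hypothetical helper, not a Dask function.

```python
import math


def chunk_grid(shape, chunks):
    """Return (chunks per axis, total chunk count) for an array shape."""
    per_axis = tuple(math.ceil(s / c) for s, c in zip(shape, chunks))
    return per_axis, math.prod(per_axis)


TILE_SHAPE = (2400, 2400, 92)  # assumed (y, x, time) shape of one tile
N_TILES = 11                   # inferred from the dist shape, not stated

for chunks in [(100, 100, 92), (1200, 1200, 92)]:
    per_axis, per_tile = chunk_grid(TILE_SHAPE, chunks)
    elements = math.prod(chunks)
    print(f"chunks={chunks}: grid per tile={per_axis}, "
          f"chunks per tile={per_tile}, "
          f"chunks over all tiles={per_tile * N_TILES}, "
          f"elements per chunk={elements}")
```

Under these assumptions the original setting produces 576 chunks per tile (6336 over 11 tiles), while the larger setting produces 4 per tile (44 in total). One plausible reading, which the answer itself does not confirm, is that far fewer and larger reads against each HDF5 file make the error less likely to be triggered.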