Xarray drop_sel with a MultiIndex

Problem description

I want to compute anomalies for climate data. The code looks like this:

import pandas as pd
import numpy as np
import xarray as xr
date = pd.date_range('2000-01-01','2010-12-31') #4018 days
data = np.random.rand(len(date))
da = xr.DataArray(data=data,
                  dims='date',
                  coords=dict(date=date))
monthday = pd.MultiIndex.from_arrays([da['date.month'].values, da['date.day'].values])
da = da.assign_coords(monthday=('date',monthday)).groupby('monthday').mean(dim='date')
print(da)


<xarray.DataArray (monthday: 366)>
array([0.38151556, 0.46306277, 0.46148326, 0.35894069, 0.48318011,
       0.44736969, 0.46828286, 0.44927365, 0.59294693, 0.61940206,
       0.54264219, 0.51797117, 0.46200014, 0.50356122, 0.49371135,
       ...
       0.44668478, 0.32583885, 0.36537256, 0.64087588, 0.56546472,
       0.5021695 , 0.42450777, 0.49071572, 0.39639316, 0.53538823,
       0.48345995, 0.46290486, 0.75160507, 0.4945804 , 0.52283262,
       0.45320128])
Coordinates:
  * monthday          (monthday) MultiIndex
  - monthday_level_0  (monthday) int64 1 1 1 1 1 1 1 1 ... 12 12 12 12 12 12 12
  - monthday_level_1  (monthday) int64 1 2 3 4 5 6 7 8 ... 25 26 27 28 29 30 31

The monthday index contains (2, 29), i.e. the leap day. How can I drop the leap day? I tried the following, but it raises an error:

da.drop_sel(monthday=(2,29))
Traceback (most recent call last):
  File "/Users/osamuyuubu/anaconda3/envs/xesmf_env/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3441, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-65-caf7267f29a4>", line 11, in <module>
    da.drop_sel(monthday=(2,29))
  File "/Users/osamuyuubu/anaconda3/envs/xesmf_env/lib/python3.7/site-packages/xarray/core/dataarray.py", line 2374, in drop_sel
    ds = self._to_temp_dataset().drop_sel(labels, errors=errors)
  File "/Users/osamuyuubu/anaconda3/envs/xesmf_env/lib/python3.7/site-packages/xarray/core/dataset.py", line 4457, in drop_sel
    new_index = index.drop(labels_for_dim, errors=errors)
  File "/Users/osamuyuubu/anaconda3/envs/xesmf_env/lib/python3.7/site-packages/pandas/core/indexes/multi.py", line 2201, in drop
    loc = self.get_loc(level_codes)
  File "/Users/osamuyuubu/anaconda3/envs/xesmf_env/lib/python3.7/site-packages/pandas/core/indexes/multi.py", line 2922, in get_loc
    loc = self._get_level_indexer(key, level=0)
  File "/Users/osamuyuubu/anaconda3/envs/xesmf_env/lib/python3.7/site-packages/pandas/core/indexes/multi.py", line 3204, in _get_level_indexer
    idx = self._get_loc_single_level_index(level_index, key)
  File "/Users/osamuyuubu/anaconda3/envs/xesmf_env/lib/python3.7/site-packages/pandas/core/indexes/multi.py", line 2855, in _get_loc_single_level_index
    return level_index.get_loc(key)
  File "/Users/osamuyuubu/anaconda3/envs/xesmf_env/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 3363, in get_loc
    raise KeyError(key) from err
KeyError: 29

So how can I do this with xr.drop_sel()?

Thanks in advance!

Tags: python, pandas, pandas-groupby, python-xarray

Solution


With drop_sel you need to give the exact value as it appears in the index:

da.drop_sel(dayofyear=60)

But in non-leap years, this drops March 1.
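A quick way to see why day-of-year 60 is ambiguous is to compare a leap year with a non-leap year. The two dates below are chosen only for illustration:

```python
import pandas as pd

# 2000 is a leap year, 2001 is not (dates picked for illustration only)
leap_day = pd.Timestamp("2000-02-29")
march_first = pd.Timestamp("2001-03-01")

# Both dates share the same day-of-year, so dropping
# dayofyear == 60 cannot tell them apart
print(leap_day.dayofyear, march_first.dayofyear)
```

Both values are 60, which is why a plain day-of-year filter removes March 1 from every non-leap year.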

To drop February 29 safely, I would probably use something like the following:

mask = np.logical_and(da.time.dt.is_leap_year, da.time.dt.dayofyear==60)
result = da.where(~mask, drop=True)
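The snippet above assumes a datetime coordinate named time. As a minimal sketch adapted to the question's data, where the coordinate is called date, the same masking idea could be applied to the daily series before grouping. The names is_leap_day and no_leap are introduced here for illustration, and the mask tests month and day directly instead of day-of-year, which is a variation on the answer's approach:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Rebuild the daily series from the question (coordinate named "date")
date = pd.date_range("2000-01-01", "2010-12-31")
da = xr.DataArray(np.random.rand(len(date)), dims="date", coords={"date": date})

# True only on 29 February; checking month and day avoids
# the day-of-year ambiguity in non-leap years
is_leap_day = (da["date"].dt.month == 2) & (da["date"].dt.day == 29)

# Keep everything except the leap days
no_leap = da.where(~is_leap_day, drop=True)
print(no_leap.sizes["date"])
```

The period 2000 to 2010 contains three leap days (2000, 2004, 2008), so the result should have three fewer entries than the original series. The groupby from the question can then be run on no_leap, in which case (2, 29) never enters the monthday index.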
