Slicing the index of a Dask DataFrame

Question

Is there a simple way to slice the index of a Dask DataFrame, the way pandas allows with something like this?

index_element = df.index[-1]

Tags: python, dask, dask-dataframe

Solution


What exactly are you after?

Calling .index[i] on a Dask DataFrame raises a NotImplementedError:

import dask.dataframe as dd
df = dd.demo.make_timeseries(
    start="2000-01-01",
    end="2000-01-03",
    dtypes={"id": int, "z": int},
    freq="1h",
    partition_freq="24h",
)

df.index[-1]

---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
<ipython-input-7-d70d3c1197c1> in <module>
----> 1 df.index[-1]

~/miniconda/envs/main/lib/python3.8/site-packages/dask/dataframe/core.py in __getitem__(self, key)
   3172             graph = HighLevelGraph.from_collections(name, dsk, dependencies=[self, key])
   3173             return Series(graph, name, self._meta, self.divisions)
-> 3174         raise NotImplementedError(
   3175             "Series getitem in only supported for other series objects "
   3176             "with matching partition structure"

NotImplementedError: Series getitem in only supported for other series objects with matching partition structure

If you are after the index of the last row, you can do this:

df.tail(1).index

DatetimeIndex(['2000-01-02 23:00:00'], dtype='datetime64[ns]', name='timestamp', freq='H')
