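python - Filtering a pandas MultiIndex across all first-level columns
Problem description
I am trying to find an efficient way to filter all entries under both top-level columns using a filter that is defined on only one of them. This is best explained with the example and desired output below.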
Example DataFrame
import pandas as pd
import numpy as np

info = ['price', 'year']
months = ['month0', 'month1', 'month2']
settlement_dates = ['2020-12-31', '2021-01-01']
# One entry per settlement date: three prices followed by three years
data = [[[2, 4, 5], [2020, 2021, 2022]], [[1, 4, 2], [2021, 2022, 2023]]]
data = np.array(data).reshape(len(settlement_dates), len(months) * len(info))
# Two-level columns: info (top level) x months (second level)
midx = pd.MultiIndex.from_product([info, months])
df = pd.DataFrame(data, index=settlement_dates, columns=midx)
df
            price                 year
           month0 month1 month2 month0 month1 month2
2020-12-31      2      4      5   2020   2021   2022
2021-01-01      1      4      2   2021   2022   2023
Creating a filter for the MultiIndex DataFrame
idx_cols = pd.IndexSlice
df_filter = df.loc[:, idx_cols['year', :]]==2021
df[df_filter]
            price                 year
           month0 month1 month2 month0 month1 month2
2020-12-31    NaN    NaN    NaN    NaN 2021.0    NaN
2021-01-01    NaN    NaN    NaN 2021.0    NaN    NaN
Desired output:
            price                 year
           month0 month1 month2 month0 month1 month2
2020-12-31    NaN      4    NaN    NaN 2021.0    NaN
2021-01-01      1    NaN    NaN 2021.0    NaN    NaN
Solution
You can simplify the solution by reshaping the DataFrame with DataFrame.stack and then filtering with DataFrame.where:
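# Move the month level from the columns into the index: rows become (date, month)
df1 = df.stack()
# Boolean Series: True where the year equals 2021
df_filter = df1['year'] == 2021
# Blank out non-matching rows (price and year together), then restore the wide layout
df_filter = df1.where(df_filter).unstack()
print(df_filter)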
            price                 year
           month0 month1 month2 month0 month1 month2
2020-12-31    NaN    4.0    NaN    NaN 2021.0    NaN
2021-01-01    1.0    NaN    NaN 2021.0    NaN    NaN
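To see why the reshape helps, it can be useful to look at the intermediate frame. The following is a minimal sketch that rebuilds the example data (the literal values are copied from the question; the names stacked and row_mask are illustrative) and prints the stacked frame, where each (date, month) pair is one row with price and year side by side. A single row-wise condition on year then applies to both values at once.

```python
import pandas as pd

columns = pd.MultiIndex.from_product(
    [['price', 'year'], ['month0', 'month1', 'month2']]
)
df = pd.DataFrame(
    [[2, 4, 5, 2020, 2021, 2022], [1, 4, 2, 2021, 2022, 2023]],
    index=['2020-12-31', '2021-01-01'],
    columns=columns,
)

# Stack the month level: the index becomes (date, month),
# and the columns are just 'price' and 'year'
stacked = df.stack()
print(stacked)

# Row-wise condition on year; rows that fail it become NaN in both columns
row_mask = stacked['year'] == 2021
filtered = stacked.where(row_mask)
print(filtered)
```

Calling unstack() on filtered would bring the months back into the columns, as in the solution above.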
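Your approach is also possible, but more complicated: reindex the mask to all columns, then back-fill and forward-fill the resulting missing values so that the mask covers both top-level columns: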
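idx_cols = pd.IndexSlice
df_filter = df.loc[:, idx_cols['year', :]] == 2021
# Extend the mask to every column (price becomes NaN), stack the months into rows,
# copy the year mask across each row, then restore the wide layout
df_filter = df_filter.reindex(df.columns, axis=1).stack(dropna=False).bfill(axis=1).ffill(axis=1).unstack()
print(df_filter)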
            price                 year
           month0 month1 month2 month0 month1 month2
2020-12-31  False   True  False  False   True  False
2021-01-01   True  False  False   True  False  False
print (df[df_filter])
            price                 year
           month0 month1 month2 month0 month1 month2
2020-12-31    NaN    4.0    NaN    NaN 2021.0    NaN
2021-01-01    1.0    NaN    NaN 2021.0    NaN    NaN
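The mask has to cover both top-level columns because boolean indexing aligns the mask with the frame and treats columns that are missing from the mask as False. As an alternative sketch (not taken from the answer above), the year mask can be replicated under every top-level label with pd.concat and then applied with DataFrame.where. The names year_mask and full_mask are illustrative, and the data literal is copied from the question.

```python
import pandas as pd

columns = pd.MultiIndex.from_product(
    [['price', 'year'], ['month0', 'month1', 'month2']]
)
df = pd.DataFrame(
    [[2, 4, 5, 2020, 2021, 2022], [1, 4, 2, 2021, 2022, 2023]],
    index=['2020-12-31', '2021-01-01'],
    columns=columns,
)

# Selecting one top-level label drops that level: columns are month0..month2
year_mask = df['year'] == 2021

# Repeat the same month-level mask under every top-level label
full_mask = pd.concat(
    {label: year_mask for label in df.columns.levels[0]}, axis=1
)

# where() aligns the mask with df and keeps values only where the mask is True
result = df.where(full_mask)
print(result)
```

This avoids the stack/unstack round trip, at the cost of building one copy of the mask per top-level label.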