Reindex Bins-index DataFrame

Problem description

After applying groupby, I want to add zero-filled rows and columns for the bins that do not appear in the index/columns:

import numpy as np
import pandas as pd

bins = np.arange(-0.1, 2, 0.1)

names = np.random.randint(0, 101, 1000)  # random_integers is deprecated; randint excludes the upper bound
a = np.random.random(1000)
b = np.random.random(1000)

matrix = pd.DataFrame([names, pd.cut(a, bins), pd.cut(b, bins)]).T
matrix.columns = ['names', 'a', 'b']

matrix = matrix.groupby(['a', 'b']).count()
matrix.reset_index(inplace=True)

matrix = matrix.pivot(index='a', columns='b', values='names').fillna(0)

Tags: python, pandas, pandas-groupby

Solution


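Assign the output of pandas.cut to a variable so that you can access its categories attribute, which lists every bin, including those that received no values: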

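import numpy as np
import pandas as pd

bins = np.arange(-0.1, 2, 0.1)

# np.random.random_integers is deprecated; randint excludes the upper bound
names = np.random.randint(0, 101, 1000)
a = np.random.random(1000)
b = np.random.random(1000)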


##############################
# Use pd.cut like this
a_bins = pd.cut(a, bins)
b_bins = pd.cut(b, bins)
##############################


matrix = pd.DataFrame([names, a_bins, b_bins]).T


matrix.columns = ['names', 'a', 'b']

matrix = matrix.groupby(['a', 'b']).count()
matrix.reset_index(inplace=True)

matrix = matrix.pivot(index='a', columns='b', values='names').fillna(0)

##################################################
# Reindex with this
matrix = matrix.reindex(index=a_bins.categories,
                        columns=b_bins.categories,
                        fill_value=0)
##################################################
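
To check the effect of the reindex step by hand, here is a minimal sketch on a tiny deterministic data set. The three-bin edges, the sample values, and the variable names (counts, pivoted, result) are illustrative assumptions and not part of the original answer. It also builds the DataFrame from a dict and passes observed=True to groupby so that only the occupied bins survive the grouping, which is the situation the question describes.

```python
import numpy as np
import pandas as pd

# Illustrative data: three bins, but no value falls into (2.0, 3.0]
bins = np.array([0.0, 1.0, 2.0, 3.0])
a = np.array([0.5, 0.6, 1.5])
b = np.array([0.2, 1.2, 0.3])
names = np.array([10, 11, 12])

# Keep the pd.cut results so their categories stay accessible
a_bins = pd.cut(a, bins)
b_bins = pd.cut(b, bins)

df = pd.DataFrame({'names': names, 'a': a_bins, 'b': b_bins})

# observed=True keeps only the bin combinations that actually occur
counts = df.groupby(['a', 'b'], observed=True)['names'].count().reset_index()

pivoted = counts.pivot(index='a', columns='b', values='names').fillna(0)

# Reindex against the full list of bins to restore the empty ones
result = pivoted.reindex(index=a_bins.categories,
                         columns=b_bins.categories,
                         fill_value=0)

print(result)
```

The reindexed frame has one row and one column per bin, with the bins that received no data filled with zeros.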
