Resetting the index when both the index and the columns are MultiIndexes

Question

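I have a df with a MultiIndex on both the rows and the columns, and I want to reset the index on both axes so that every level of both MultiIndexes becomes an ordinary column. Below is an example of what I have and what I want.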

What I have:

import numpy as np
import pandas as pd

# Row MultiIndex with two named levels
indexArrays = [['bar', 'bar', 'baz', 'baz'],
               ['one', 'two', 'one', 'two']]
indexTuples = list(zip(*indexArrays))
index = pd.MultiIndex.from_tuples(indexTuples, names=['firstIndex', 'secondIndex'])

# Column MultiIndex with two named levels
colArrays = [['c1', 'c1', 'c2', 'c2'],
             ['d1', 'd2', 'd1', 'd2']]
colTuples = list(zip(*colArrays))
col = pd.MultiIndex.from_tuples(colTuples, names=['firstCol', 'secondCol'])

df = pd.DataFrame(data=np.random.random_sample((len(index), len(col))),
                  index=index, columns=col)
df

The above gives the DataFrame I have:

firstCol                      c1                  c2          
secondCol                     d1        d2        d1        d2
firstIndex secondIndex                                        
bar        one          0.231221  0.846196  0.037493  0.516474
           two          0.810847  0.204095  0.423766  0.513262
baz        one          0.433040  0.118018  0.267039  0.356261
           two          0.529042  0.181886  0.093488  0.643357

What I want:

wantedCols = list(index.names) + list(col.names) + ['Value']
# One output row per cell of df (rows * columns)
dfWanted = pd.DataFrame(index=range(df.size),
                        columns=wantedCols)

idxCounter = 0
for idx1 in df.index.get_level_values(0).unique():
    for idx2 in df.index.get_level_values(1).unique():
        for c1 in df.columns.get_level_values(0).unique():
            for c2 in df.columns.get_level_values(1).unique():
                dfWanted.loc[idxCounter, 'firstIndex'] = idx1
                dfWanted.loc[idxCounter, 'secondIndex'] = idx2
                dfWanted.loc[idxCounter, 'firstCol'] = c1
                dfWanted.loc[idxCounter, 'secondCol'] = c2
                dfWanted.loc[idxCounter, 'Value'] = df.loc[(idx1, idx2), (c1, c2)]
                idxCounter += 1

dfWanted     

The above gives the DataFrame I want:

   firstIndex secondIndex firstCol secondCol      Value
0         bar         one       c1        d1   0.231221
1         bar         one       c1        d2   0.846196
2         bar         one       c2        d1   0.037493
3         bar         one       c2        d2   0.516474
4         bar         two       c1        d1   0.810847
5         bar         two       c1        d2   0.204095
6         bar         two       c2        d1   0.423766
7         bar         two       c2        d2   0.513262
8         baz         one       c1        d1    0.43304
9         baz         one       c1        d2   0.118018
10        baz         one       c2        d1   0.267039
11        baz         one       c2        d2   0.356261
12        baz         two       c1        d1   0.529042
13        baz         two       c1        d2   0.181886
14        baz         two       c2        d1  0.0934878
15        baz         two       c2        d2   0.643357    

Does anyone know a simpler way to reset the index than the approach I used above?

Tags: pandas, multi-index

Answer


Use DataFrame.stack on both column levels, which returns a Series with a four-level MultiIndex, then convert those index levels to columns with Series.reset_index:

df = df.stack([0, 1]).reset_index(name='Value')
print(df)
   firstIndex secondIndex firstCol secondCol     Value
0         bar         one       c1        d1  0.746027
1         bar         one       c1        d2  0.622784
2         bar         one       c2        d1  0.613197
3         bar         one       c2        d2  0.449560
4         bar         two       c1        d1  0.560810
5         bar         two       c1        d2  0.125046
6         bar         two       c2        d1  0.147148
7         bar         two       c2        d2  0.622862
8         baz         one       c1        d1  0.537280
9         baz         one       c1        d2  0.801410
10        baz         one       c2        d1  0.889445
11        baz         one       c2        d2  0.226477
12        baz         two       c1        d1  0.100759
13        baz         two       c1        d2  0.279383
14        baz         two       c2        d1  0.041767
15        baz         two       c2        d2  0.739942
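
The same idea can be wrapped so it works for any number of column levels. The following is only a minimal sketch: the helper name `flatten_both_axes`, the variable names `wide` and `long_df`, and the fixed random seed are assumptions made for illustration and are not part of the original question or answer.

```python
import numpy as np
import pandas as pd


def flatten_both_axes(frame, value_name="Value"):
    """Hypothetical helper: turn all row and column levels into ordinary columns."""
    # Stack every column level so the column labels join the row MultiIndex.
    # Stacking all levels leaves no columns, so the result is a Series.
    all_col_levels = list(range(frame.columns.nlevels))
    stacked = frame.stack(all_col_levels)
    # Move every index level into a column and name the values column.
    return stacked.reset_index(name=value_name)


index = pd.MultiIndex.from_product(
    [["bar", "baz"], ["one", "two"]], names=["firstIndex", "secondIndex"]
)
col = pd.MultiIndex.from_product(
    [["c1", "c2"], ["d1", "d2"]], names=["firstCol", "secondCol"]
)
rng = np.random.default_rng(0)  # fixed seed (an assumption) so runs are repeatable
wide = pd.DataFrame(rng.random((len(index), len(col))), index=index, columns=col)

long_df = flatten_both_axes(wide)
print(long_df)
```

Each row of `long_df` corresponds to one cell of `wide`, so a value can be checked against `wide.loc[(firstIndex, secondIndex), (firstCol, secondCol)]`.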
