Reshaping a tidy pandas DataFrame into a pivot with MultiIndex columns

Question

Given the following tidy pd.DataFrame:

import numpy as np
import pandas as pd

countries = np.tile(["A", "A", "A", "B", "B", "B", "C", "C", "C"], 2)
years = np.tile([2018, 2019, 2020], 6)
indicators = np.repeat(["ind_1", "ind_2"], 9)
vals = list(range(1, 19))

df = pd.DataFrame(data=[countries, years, indicators, vals]).T
df.columns = ["countries", "years", "indicators", "value"]
print(df)

   countries years indicators value
0          A  2018      ind_1     1
1          A  2019      ind_1     2
2          A  2020      ind_1     3
3          B  2018      ind_1     4
4          B  2019      ind_1     5
5          B  2020      ind_1     6
6          C  2018      ind_1     7
7          C  2019      ind_1     8
8          C  2020      ind_1     9
9          A  2018      ind_2    10
10         A  2019      ind_2    11
11         A  2020      ind_2    12
12         B  2018      ind_2    13
13         B  2019      ind_2    14
14         B  2020      ind_2    15
15         C  2018      ind_2    16
16         C  2019      ind_2    17
17         C  2020      ind_2    18

I am looking for a Pythonic way to pivot df. The column index should have two levels: ["indicators", "years"]. That is, the result should look like this:

indicators ind_1           ind_2          
years       2018 2019 2020  2018 2019 2020
countries                                 
A              1    2    3    10   11   12
B              4    5    6    13   14   15
C              7    8    9    16   17   18

The code below does this. However, it feels a bit awkward. Is there a more elegant way to do it?

df = df.set_index(["indicators", "years"]).pivot(columns="countries").T
df = df.droplevel(level=0, axis=0)

Tags: python, pandas, pivot, multi-index

Solution
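One possible approach, offered as a minimal sketch rather than a definitive answer: `DataFrame.pivot` can take a list for `columns`, which builds the two-level column index directly. This assumes a pandas version where `pivot` accepts list-likes for `columns` (1.1 or later, as far as I know). The frame below is rebuilt from a dict instead of a transposed list, which is my own change so that `years` and `value` keep an integer dtype; the names `wide` and `alt` are illustrative only.

```python
import numpy as np
import pandas as pd

# Rebuild the tidy frame from a dict so numeric columns stay numeric
# (an assumption of this sketch; the question builds it via a transpose).
df = pd.DataFrame(
    {
        "countries": np.tile(np.repeat(["A", "B", "C"], 3), 2),
        "years": np.tile([2018, 2019, 2020], 6),
        "indicators": np.repeat(["ind_1", "ind_2"], 9),
        "value": np.arange(1, 19),
    }
)

# Passing a list to `columns` yields MultiIndex columns with the
# levels ["indicators", "years"]; `values` avoids an extra outer level.
wide = df.pivot(index="countries", columns=["indicators", "years"], values="value")
print(wide)

# Alternative that does not rely on list support in pivot:
# move all three keys into the index, then unstack two of them.
alt = df.set_index(["countries", "indicators", "years"])["value"].unstack(
    ["indicators", "years"]
)
```

Either form avoids the transpose and the `droplevel` call from the question. Individual cells can then be read with a tuple key, for example `wide.loc["A", ("ind_2", 2019)]`.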

