Split a column and merge rows that have multiple data measures

Problem description

I am trying to use Python to solve a data analysis problem. I have a table like this:

+----------+-----+------+--------+-------------+--------------+
|       ID | QTR | Year | MEF_ID | Qtr_Measure | Value_column |
+----------+-----+------+--------+-------------+--------------+
|       11 |   1 | 2020 | Name1  | QTRAVG      |            5 |
|       11 |   2 | 2020 | Name1  | QTRAVG      |            8 |
|       11 |   3 | 2020 | Name1  | QTRAVG      |            6 |
|       11 |   4 | 2020 | Name1  | QTRAVG      |            9 |
|       15 |   1 | 2020 | Name2  | QTRAVG      |           67 |
|       15 |   2 | 2020 | Name2  | QTRAVG      |           89 |
|       15 |   3 | 2020 | Name2  | QTRAVG      |          100 |
|       15 |   4 | 2020 | Name2  | QTRAVG      |          121 |
|       11 |   1 | 2020 | Name1  | QTRMAX      |            6 |
|       11 |   2 | 2020 | Name1  | QTRMAX      |            9 |
|       11 |   3 | 2020 | Name1  | QTRMAX      |            7 |
|       11 |   4 | 2020 | Name1  | QTRMAX      |           10 |
+----------+-----+------+--------+-------------+--------------+

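I want to rearrange Value_column so that, for each unique combination of ID and MEF_ID, the different Qtr_Measure values each get their own column. Doing this reduces the overall size of the table, since rows that differ only in Qtr_Measure are merged into one. I would like the Qtr_Measure values to become columns, like this: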

+----------+-----+------+--------+-------------+--------+--------+
|       ID | QTR | Year | MEF_ID | Qtr_Measure | QTRAVG | QTRMAX |
+----------+-----+------+--------+-------------+--------+--------+
|       11 |   1 | 2020 | Name1  | QTRAVG      |      5 |      6 |
|       11 |   2 | 2020 | Name1  | QTRAVG      |      8 |      9 |
|       11 |   3 | 2020 | Name1  | QTRAVG      |      6 |      7 |
|       11 |   4 | 2020 | Name1  | QTRAVG      |      9 |     10 |
|       15 |   1 | 2020 | Name2  | QTRAVG      |     67 |        |
|       15 |   2 | 2020 | Name2  | QTRAVG      |     89 |        |
|       15 |   3 | 2020 | Name2  | QTRAVG      |    100 |        |
|       15 |   4 | 2020 | Name2  | QTRAVG      |    121 |        |
+----------+-----+------+--------+-------------+--------+--------+

How can I do this with Python?

Thanks

Tags: python, pandas

Solution


Use pivot_table together with reset_index and rename_axis:

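# df is assumed to hold the table from the question.
# pivot_table spreads each Qtr_Measure value into its own column;
# its default aggfunc is the mean, so the result columns are floats.
piv = (df.pivot_table(index=['ID', 'QTR', 'Year', 'MEF_ID'],
                      values='Value_column',
                      columns='Qtr_Measure')
       .reset_index()               # turn the index levels back into columns
       .rename_axis(None, axis=1)   # drop the leftover 'Qtr_Measure' columns name
      )

print(piv)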
   ID  QTR  Year MEF_ID  QTRAVG  QTRMAX
0  11    1  2020  Name1     5.0     6.0
1  11    2  2020  Name1     8.0     9.0
2  11    3  2020  Name1     6.0     7.0
3  11    4  2020  Name1     9.0    10.0
4  15    1  2020  Name2    67.0     NaN
5  15    2  2020  Name2    89.0     NaN
6  15    3  2020  Name2   100.0     NaN
7  15    4  2020  Name2   121.0     NaN
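
To try this end to end, here is a minimal self-contained sketch. It rebuilds the sample table from the question by hand and then applies the same pivot. The `keys` and `measure_cols` variables and the final cast to pandas' nullable `Int64` dtype are illustrative additions, not part of the original answer; the cast is only an option if you would rather see whole numbers with `<NA>` than floats with `NaN`.

```python
import pandas as pd

# Rebuild the sample table from the question (values copied from it).
df = pd.DataFrame({
    "ID": [11] * 4 + [15] * 4 + [11] * 4,
    "QTR": [1, 2, 3, 4] * 3,
    "Year": [2020] * 12,
    "MEF_ID": ["Name1"] * 4 + ["Name2"] * 4 + ["Name1"] * 4,
    "Qtr_Measure": ["QTRAVG"] * 8 + ["QTRMAX"] * 4,
    "Value_column": [5, 8, 6, 9, 67, 89, 100, 121, 6, 9, 7, 10],
})

# Columns that identify one output row (hypothetical helper name).
keys = ["ID", "QTR", "Year", "MEF_ID"]

# Same pivot as in the answer: one column per Qtr_Measure value.
piv = (
    df.pivot_table(index=keys, columns="Qtr_Measure", values="Value_column")
      .reset_index()
      .rename_axis(None, axis=1)
)

# Optional, an assumption about what you may want: cast the measure
# columns to the nullable integer dtype so missing values show as <NA>.
measure_cols = ["QTRAVG", "QTRMAX"]
piv_int = piv.astype({col: "Int64" for col in measure_cols})

print(piv_int)
```

Note that pivot_table silently averages rows that share the same keys and Qtr_Measure. The sample data has no such duplicates, so each value passes through unchanged here.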
