Pivoting a column into rows based on a date column

Problem description

I have a DataFrame that looks like this:

df:

+-------------+-------------------+------------+
| date        | status            | counts     |
+-------------+-------------------+------------+
| 2020-03-02  |  death            |  0         |
| nan.        |  positive         |  5         |
| nan.        |  recovery         |  0         |
| nan.        |  positive cum     |  5         |
| nan.        |  recovery cum     |  0         |
| 2020-03-03  |  death            |  0         |
| nan.        |  positive         |  10        |
| nan.        |  recovery         |  0         |
| nan.        |  positive cum     |  15        |
| nan.        |  recovery cum     |  0         |
+-------------+-------------------+------------+

I want to pivot the DataFrame like this so I can view it as a table:

+-------------+---------+------------+----------+---------------+---------------+
| date        | death   | positive   | recovery |  positive cum | recovery cum  |
+-------------+---------+------------+----------+---------------+---------------+
| 2020-03-02  |  0      |  5         | 0        | 5             | 0             |
| 2020-03-03  |  0      |  10        | 0        | 15            | 0             |
+-------------+---------+------------+----------+---------------+---------------+

I tried:

pd.pivot_table(df, index=['date'], columns=['status'], values=['counts'], aggfunc='sum')

But the result only contains the rows whose date is not NaN. Any advice would be appreciated.

Tags: python, pandas, dataframe

Solution


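Because the date appears only on the first row of each group, first ffill the NaN values in the date column, then use pivot_table with aggfunc='first' to reshape the DataFrame: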

pvt = df.assign(date=df['date'].ffill())\
        .pivot_table(index='date', columns='status', values='counts', aggfunc='first')
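
For readers who want to try this end to end, here is a minimal sketch that rebuilds the sample data from the question and applies the same ffill + pivot_table step. It assumes the missing dates are real NaN values rather than the literal string 'nan.'; the `statuses` helper list is only there for illustration.

```python
import numpy as np
import pandas as pd

# Sample data mirroring the question: the date is present only on the
# first row of each group (assumption: missing dates are real NaN).
statuses = ['death', 'positive', 'recovery', 'positive cum', 'recovery cum']
df = pd.DataFrame({
    'date': ['2020-03-02'] + [np.nan] * 4 + ['2020-03-03'] + [np.nan] * 4,
    'status': statuses * 2,
    'counts': [0, 5, 0, 5, 0, 0, 10, 0, 15, 0],
})

# Forward-fill so every row carries the date of its group, then reshape.
pvt = (
    df.assign(date=df['date'].ffill())
      .pivot_table(index='date', columns='status', values='counts', aggfunc='first')
)
print(pvt)
```

Note that pivot_table sorts the resulting columns alphabetically, so the column order differs from the order in which the statuses appear in the input.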

Alternatively, if no status value is repeated for a given date, you can use pivot instead:

pvt = df.assign(date=df['date'].ffill()).pivot(index='date', columns='status', values='counts')

status      death  positive  positive cum  recovery  recovery cum
date                                                             
2020-03-02      0         5             5         0             0
2020-03-03      0        10            15         0             0
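
The choice between the two approaches comes down to how duplicates are handled. The sketch below uses a small hypothetical frame (`dup`, not from the question) in which one date/status pair occurs twice: pivot raises a ValueError in that case, while pivot_table with aggfunc='first' keeps the first value it sees.

```python
import pandas as pd

# Hypothetical data: the ('2020-03-02', 'positive') pair occurs twice.
dup = pd.DataFrame({
    'date': ['2020-03-02', '2020-03-02', '2020-03-02'],
    'status': ['death', 'positive', 'positive'],
    'counts': [0, 5, 7],
})

# pivot refuses to reshape when an index/column pair is duplicated.
try:
    dup.pivot(index='date', columns='status', values='counts')
    pivot_failed = False
except ValueError:
    pivot_failed = True

# pivot_table aggregates the duplicates instead; 'first' keeps the first value.
first = dup.pivot_table(index='date', columns='status', values='counts', aggfunc='first')
print(pivot_failed)
print(first)
```

If silently keeping the first value would hide a data problem, pivot is the safer choice, since it fails loudly on duplicates.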
