How to use value_counts to derive a new column from one column and another

Problem description

I have a DataFrame with many columns.

df2



   TargetDescription                               Output_media_duration
0   VMN 4.0 16x9 25 - 1920x1080, 1280x720, 960x540...    NaN
1   VMN 4.0 16x9 25 - 1920x1080, 1280x720, 960x540...    NaN
2   XDCAM HD NTSC 1920x1080 MXF 8CA                      661.120000
3   VMN 4.0 16x9 29.97 - 1920x1080, 1280x720, 960x...   285.647686
4   VMN 4.0 16x9 29.97 - 1920x1080, 1280x720, 960x...   402.697303
5   VMN 4.0 16x9 29.97 - 1920x1080, 1280x720, 960x...   269.597070
6   VMN 4.0 16x9 29.97 - 1920x1080, 1280x720, 960x...   307.059607
7   Caption QC HD MOV 2CA                               2516.096917
8   QT Proxy 640x360 2997 12CA                          NaN
9   XDCAM HD NTSC 1920x1080 MXF 8CA                     1414.785215
10  Caption QC HD MOV 2CA                               1295.027067
11  QT Proxy 640x360 2398 4CA                           2524.980792
12  Caption QC HD MOV 2CA                               120.820700
13  Caption QC HD MOV 2CA                               2516.096917

Now I want to get a new DataFrame that looks like this:

TargetDescription                                                     format_duration
1   VMN 4.0 16x9 25 - 1920x1080, 1280x720, 960x540...                       NaN
2   XDCAM HD NTSC 1920x1080 MXF 8CA                                         661.120000
3   VMN 4.0 16x9 29.97 - 1920x1080, 1280x720, 960x...                       1656.561906 
4   Caption QC HD MOV 2CA                                                   2516.096917
5   QT Proxy 640x360 2997 12CA                                              NaN
6   Caption QC HD MOV 2CA                                                   2636.917

How can I achieve this in pandas? Thanks in advance.

Tags: python, pandas, data-science

Solution


df2.groupby('TargetDescription')['Output_media_duration'].sum().reset_index(name='format_duration')
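
For anyone who wants to try this, here is a minimal, self-contained sketch of the same groupby-and-sum approach. The sample rows are hypothetical (shortened descriptions and made-up durations, not the asker's real data). Two details are my own assumptions rather than part of the answer above: `min_count=1` is passed so that a group containing only NaN values stays NaN instead of becoming 0, which appears to be what the desired output wants, and `sort=False` keeps groups in order of first appearance.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data loosely modeled on the question (values are made up).
df2 = pd.DataFrame({
    "TargetDescription": [
        "VMN 4.0 16x9 25",
        "VMN 4.0 16x9 25",
        "XDCAM HD NTSC 1920x1080 MXF 8CA",
        "Caption QC HD MOV 2CA",
        "Caption QC HD MOV 2CA",
    ],
    "Output_media_duration": [np.nan, np.nan, 661.12, 100.0, 20.5],
})

result = (
    # sort=False keeps groups in order of first appearance (assumption).
    df2.groupby("TargetDescription", sort=False)["Output_media_duration"]
       # min_count=1: a group with no non-NaN values yields NaN rather than 0.
       .sum(min_count=1)
       # Turn the grouped Series back into a two-column DataFrame.
       .reset_index(name="format_duration")
)
print(result)
```

Without `min_count=1`, the all-NaN group would be reported as 0.0, because `sum` skips NaN values by default. Note also that this produces one row per distinct `TargetDescription`, so a description cannot appear twice in the result.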
