Filling NaN values in a df based on different conditions

Problem description

I have a df like this:

Timestamp                                 Time  Power    Total Energy              ID     Energy
2020-04-09 06:45:00 2020-04-09 04:44:40.559719   7500       5636690.0               1      140.0    
2020-04-09 06:46:00 2020-04-09 04:44:40.559719   7500       5636710.0               1      160.0    
2020-04-09 06:47:00                        NaT    NaN             NaN             NaN        NaN    
2020-04-09 06:48:00 2020-04-09 04:44:40.559719   7500       5636960.0               1      410.0
2020-04-09 06:49:00                        NaT    NaN             NaN             NaN        NaN
2020-04-09 06:50:00                        NaT    NaN             NaN             NaN        NaN
2020-04-09 06:51:00                        NaT    NaN             NaN             NaN        NaN
...                                        ...    ...             ...             ...        ...
2020-04-30 23:55:00 2020-04-29 16:30:38.559871   7500      18569270.0               5      100.0
2020-04-30 23:54:00                        NaT    NaN             NaN             NaN        NaN
2020-04-30 23:55:00 2020-04-29 16:30:38.559871   7500      18569370.0               5      180.0

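The different cycles are marked by different IDs (df['ID']). Within a cycle (the same ID appears before and after the NaN values), Power should be the average of the two surrounding rows, ID and Time should be carried forward, and the Energy column should take the last existing Energy value. Outside a cycle (the ID before != the next ID), Power and Energy should be set to 0, and the ID and Time columns should be set to "-". For the Total Energy column, the values should simply be carried forward.

Expected result: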

Timestamp                                 Time  Power    Total Energy              ID     Energy
2020-04-09 06:45:00 2020-04-09 04:44:40.559719   7500       5636690.0               1      140.0    
2020-04-09 06:46:00 2020-04-09 04:44:40.559719   7500       5636710.0               1      160.0    
2020-04-09 06:47:00 2020-04-09 04:44:40.559719   7500       5636710.0               1      160.0
2020-04-09 06:48:00 2020-04-09 04:44:40.559719   7500       5636960.0               1      410.0
2020-04-09 06:49:00                          -      0       5636960.0               -          0
2020-04-09 06:50:00                          -      0       5636960.0               -          0
2020-04-09 06:51:00                          -      0       5636960.0               -          0
...                                        ...    ...             ...             ...        ...
2020-04-30 23:55:00 2020-04-29 16:30:38.559871   7500      18569270.0               5      100.0
2020-04-30 23:54:00 2020-04-29 16:30:38.559871   7500      18569270.0               5      100.0
2020-04-30 23:55:00 2020-04-29 16:30:38.559871   7500      18569370.0               5      180.0

Tags: python, pandas, nan

Solution
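
One possible approach, offered as a sketch rather than a definitive answer: forward-fill and back-fill the ID column, and treat a NaN row as "inside a cycle" when both fills agree. The function name `fill_gaps`, the helper `_with_dash`, and the last row of the demo frame (ID 2 at 06:52) are made up for illustration. The sketch also assumes that the gap rows are exactly the rows where ID is NaN, and that ID and Power are numeric columns.

```python
import numpy as np
import pandas as pd


def _with_dash(filled: pd.Series, outside: pd.Series) -> pd.Series:
    """Hypothetical helper: cast to object dtype and write "-" on outside rows."""
    col = filled.astype(object)  # object dtype can hold both values and "-"
    col.loc[outside] = "-"
    return col


def fill_gaps(df: pd.DataFrame) -> pd.DataFrame:
    """Fill NaN rows according to whether they lie inside or outside a cycle."""
    out = df.copy()
    missing = df["ID"].isna()

    # A gap row is inside a cycle when the nearest valid IDs on both sides match.
    # NaN == NaN is False, so leading and trailing gaps count as outside.
    inside = missing & (df["ID"].ffill() == df["ID"].bfill())
    outside = missing & ~inside

    # Total Energy simply continues with the last known value.
    out["Total Energy"] = df["Total Energy"].ffill()

    # Power: mean of the two surrounding valid rows inside a cycle, 0 outside.
    surrounding_mean = (df["Power"].ffill() + df["Power"].bfill()) / 2
    out["Power"] = df["Power"].fillna(surrounding_mean).mask(outside, 0)

    # Energy: last existing value inside a cycle, 0 outside.
    out["Energy"] = df["Energy"].ffill().mask(outside, 0)

    # ID and Time: carried forward inside a cycle, "-" outside.
    out["ID"] = _with_dash(df["ID"].ffill(), outside)
    out["Time"] = _with_dash(df["Time"].ffill(), outside)
    return out


t1 = pd.Timestamp("2020-04-09 04:44:40.559719")
t2 = pd.Timestamp("2020-04-09 06:51:30")  # made-up start time for the next cycle

demo = pd.DataFrame({
    "Timestamp": pd.date_range("2020-04-09 06:45", periods=8, freq="min"),
    "Time": [t1, t1, pd.NaT, t1, pd.NaT, pd.NaT, pd.NaT, t2],
    "Power": [7500, 7500, np.nan, 7500, np.nan, np.nan, np.nan, 7500],
    "Total Energy": [5636690.0, 5636710.0, np.nan, 5636960.0,
                     np.nan, np.nan, np.nan, 5637000.0],
    "ID": [1, 1, np.nan, 1, np.nan, np.nan, np.nan, 2],
    "Energy": [140.0, 160.0, np.nan, 410.0, np.nan, np.nan, np.nan, 40.0],
})

result = fill_gaps(demo)
print(result)
```

Because "-" is a string, the ID and Time columns end up with object dtype. If numeric or datetime dtypes matter downstream, keeping NaN/NaT for the outside rows and only rendering "-" at display time may be the better choice.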

