How do I label values in a Series according to a specific rule?

Problem description

I want to find the positive and negative waves (runs of same-sign values) in my Series. How can I label the data?
Example
My data:

| date                | value |
|---------------------|-------|
| 2018-09-06 00:00:03 | 0     |
| 2018-09-06 00:00:04 | 0     |
| 2018-09-06 00:00:05 | 1     |
| 2018-09-06 00:00:06 | 1     |
| 2018-09-06 00:00:07 | 2     |
| 2018-09-06 00:00:08 | -1    |
| 2018-09-06 00:00:09 | -5    |
| 2018-09-06 00:00:10 | 0     |
| 2018-09-06 00:00:11 | -6    |
| 2018-09-06 00:00:12 | 2     |
| 2018-09-06 00:00:13 | 0     |
| 2018-09-06 00:00:14 | 4     |

The result I want:

| date                | value | sign |
|---------------------|-------|------|
| 2018-09-06 00:00:03 | 0     | 1    |
| 2018-09-06 00:00:04 | 0     | 1    |
| 2018-09-06 00:00:05 | 1     | 1    |
| 2018-09-06 00:00:06 | 1     | 1    |
| 2018-09-06 00:00:07 | 2     | 1    |
| 2018-09-06 00:00:08 | -1    | 2    |
| 2018-09-06 00:00:09 | -5    | 2    |
| 2018-09-06 00:00:10 | 0     | 2    |
| 2018-09-06 00:00:11 | -6    | 2    |
| 2018-09-06 00:00:12 | 2     | 3    |
| 2018-09-06 00:00:13 | 0     | 3    |
| 2018-09-06 00:00:14 | 4     | 3    |

Then:

mydata.groupby(['sign']).transform('sum')

Tags: python, pandas, dataframe

Solution


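Your sample data does not include the case where a positive wave and a negative wave are separated by zeros, for example 1 0 0 -1. Here is a solution that covers that case: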

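import numpy as np

# flag the zeros, which have no sign of their own
s = mydata['value'].eq(0)

# blank out the sign of the zeros, then back-fill so each zero joins the wave after it
m = np.sign(mydata['value']).mask(s).bfill()

# a new wave starts wherever the sign changes; cumsum numbers the waves 1, 2, 3, ...
mydata['sign'] = m.diff().ne(0).cumsum()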
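
To check the approach end to end, here is a minimal, self-contained sketch that rebuilds the sample data from the question, applies the same mask/back-fill/diff steps, and then runs the per-wave sum the question asks for. The helper name `label_waves` and the `wave_sum` column are hypothetical, introduced only for illustration. The extra `ffill()` is also an assumption that is not part of the answer above: it makes zeros at the very end of the data (which have no later wave to join) fall into the preceding wave instead of each starting a new label.

```python
import numpy as np
import pandas as pd


def label_waves(values: pd.Series) -> pd.Series:
    """Number consecutive same-sign runs 1, 2, 3, ...; zeros join the next run."""
    is_zero = values.eq(0)
    # Sign of each value, with zeros blanked out and filled from the next non-zero sign.
    sign = np.sign(values).mask(is_zero).bfill()
    # Assumption (not in the original answer): trailing zeros join the previous wave.
    sign = sign.ffill()
    # The first row and every sign change compare as "not equal to 0", so cumsum
    # increments the label exactly at the start of each wave.
    return sign.diff().ne(0).cumsum()


# Sample data from the question: one row per second starting at 00:00:03.
mydata = pd.DataFrame({
    "date": pd.date_range("2018-09-06 00:00:03", periods=12,
                          freq=pd.Timedelta(seconds=1)),
    "value": [0, 0, 1, 1, 2, -1, -5, 0, -6, 2, 0, 4],
})

mydata["sign"] = label_waves(mydata["value"])

# Per-wave total, broadcast back onto every row of the wave.
mydata["wave_sum"] = mydata.groupby("sign")["value"].transform("sum")

print(mydata)
```

Selecting the `value` column before `transform("sum")` keeps the `date` column out of the aggregation, which matters when `date` is a regular column rather than the index.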
