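python - How do I label values in a Series according to a specific rule?
Problem description
I want to find the positive and negative waves in my Series. How can I label the data?
Example
My data: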
| date | value |
|---------------------|-------|
| 2018-09-06 00:00:03 | 0 |
| 2018-09-06 00:00:04 | 0 |
| 2018-09-06 00:00:05 | 1 |
| 2018-09-06 00:00:06 | 1 |
| 2018-09-06 00:00:07 | 2 |
| 2018-09-06 00:00:08 | -1 |
| 2018-09-06 00:00:09 | -5 |
| 2018-09-06 00:00:10 | 0 |
| 2018-09-06 00:00:11 | -6 |
| 2018-09-06 00:00:12 | 2 |
| 2018-09-06 00:00:13 | 0 |
| 2018-09-06 00:00:14 | 4 |
The result I want:
| date | value | sign |
|---------------------|-------|------|
| 2018-09-06 00:00:03 | 0 | 1 |
| 2018-09-06 00:00:04 | 0 | 1 |
| 2018-09-06 00:00:05 | 1 | 1 |
| 2018-09-06 00:00:06 | 1 | 1 |
| 2018-09-06 00:00:07 | 2 | 1 |
| 2018-09-06 00:00:08 | -1 | 2 |
| 2018-09-06 00:00:09 | -5 | 2 |
| 2018-09-06 00:00:10 | 0 | 2 |
| 2018-09-06 00:00:11 | -6 | 2 |
| 2018-09-06 00:00:12 | 2 | 3 |
| 2018-09-06 00:00:13 | 0 | 3 |
| 2018-09-06 00:00:14 | 4 | 3 |
Then I would run:
mydata.groupby(['sign']).transform('sum')
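Solution
Your sample data does not include the case where a positive wave and a negative wave are separated by zeros, for example 1 0 0 -1. Here is a solution that covers that case as well: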
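import numpy as np

# boolean mask that is True where the value is zero
s = mydata['value'].eq(0)
# take the sign of each value, blank out the zeros, and back-fill them
# so that each zero joins the wave that follows it
m = np.sign(mydata['value']).mask(s).bfill()
# a new wave starts wherever the sign changes; cumsum numbers the waves from 1
mydata['sign'] = m.diff().ne(0).cumsum()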
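
To check the approach end to end, here is a minimal, self-contained sketch that rebuilds the sample data from the question and applies the same steps. The helper name `label_waves` and the column name `wave_sum` are made up for this illustration, and the per-wave sum is taken on the `value` column only, on the assumption that this is what the `groupby` in the question is meant to compute.

```python
import numpy as np
import pandas as pd


def label_waves(values: pd.Series) -> pd.Series:
    """Number consecutive same-sign runs; zeros join the run that follows them.

    `label_waves` is a hypothetical helper name used only in this example.
    """
    is_zero = values.eq(0)
    # Sign of each value, with zeros blanked out and back-filled
    # from the next non-zero value
    signs = np.sign(values).mask(is_zero).bfill()
    # A new wave starts wherever the sign differs from the previous row
    return signs.diff().ne(0).cumsum()


# Rebuild the sample data from the question
start = pd.Timestamp("2018-09-06 00:00:03")
mydata = pd.DataFrame({
    "date": start + pd.to_timedelta(np.arange(12), unit="s"),
    "value": [0, 0, 1, 1, 2, -1, -5, 0, -6, 2, 0, 4],
})

mydata["sign"] = label_waves(mydata["value"])
# Per-wave total of `value`, broadcast back to every row of the wave
mydata["wave_sum"] = mydata.groupby("sign")["value"].transform("sum")
print(mydata)

# The case missing from the sample data: waves separated by zeros
separated = label_waves(pd.Series([1, 0, 0, -1]))
print(separated.tolist())  # [1, 2, 2, 2]
```

One caveat: zeros at the very end of the Series have nothing after them to back-fill from, so they stay NaN and each one gets its own label. If they should belong to the preceding wave instead, one possible adjustment is to chain `.ffill()` after `.bfill()`.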