python - python在数据框中移动数据
问题描述
使用以下数据框,我想转移数据。按“ID”和“系列”分组,Q、R 和 T 列中的数据应下移到“状态”为“结束”的行。
data = pd.DataFrame({
'ID': ['A','A','A','B','B','B','B','C','C','C','C','C','D','D'],
'Series': [1,1,1,1,1,2,2,1,1,2,2,2,1,1],
'Status': ['Begin','Begin','End','Begin','End','Begin','End','Begin','End','Begin','Begin','End','Begin','End'],
'Q':[9,'','',30,'',14,'',3,'',17,'','',1,''],
'R': ['',8,'','','','','','','','',7,'','',''],
'T': ['','',12,'',38,'',21,'',6,'','',35,'',5]
})
结果应如下所示:
result = pd.DataFrame({
'ID': ['A','A','A','B','B','B','B','C','C','C','C','C','D','D'],
'Series': [1,1,1,1,1,2,2,1,1,2,2,2,1,1],
'Status': ['Begin','Begin','End','Begin','End','Begin','End','Begin','End','Begin','Begin','End','Begin','End'],
'Q':['','',9,'',30,'',14,'',3,'','',17,'',1],
'R': ['','',8,'','','','','','','','',7,'',''],
'T': ['','',12,'',38,'',21,'',6,'','',35,'',5]
})
解决方案
使用GroupBy.transform
+GroupBy.first
查找第一个非NaN
s 值,然后通过mask
and删除重复值duplicated
:
cols = ['Q', 'R', 'T']
#repalce emty strings to NaNs
data[cols] = data[cols].astype(str).replace('', np.nan)
print (data)
ID Series Status Q R T
0 A 1 Begin 9 NaN NaN
1 A 1 Begin NaN 8 NaN
2 A 1 End NaN NaN 12
3 B 1 Begin 30 NaN NaN
4 B 1 End NaN NaN 38
5 B 2 Begin 14 NaN NaN
6 B 2 End NaN NaN 21
7 C 1 Begin 3 NaN NaN
8 C 1 End NaN NaN 6
9 C 2 Begin 17 NaN NaN
10 C 2 Begin NaN 7 NaN
11 C 2 End NaN NaN 35
12 D 1 Begin 1 NaN NaN
13 D 1 End NaN NaN 5
g = data.groupby(['ID', 'Series'])
for c in cols:
data[c] = g[c].transform('first')
print (data)
ID Series Status Q R T
0 A 1 Begin 9 8 12
1 A 1 Begin 9 8 12
2 A 1 End 9 8 12
3 B 1 Begin 30 NaN 38
4 B 1 End 30 NaN 38
5 B 2 Begin 14 NaN 21
6 B 2 End 14 NaN 21
7 C 1 Begin 3 NaN 6
8 C 1 End 3 NaN 6
9 C 2 Begin 17 7 35
10 C 2 Begin 17 7 35
11 C 2 End 17 7 35
12 D 1 Begin 1 NaN 5
13 D 1 End 1 NaN 5
data[cols] = data[cols].mask(data.duplicated(['ID','Series'], keep='last'), '').fillna('')
print (data)
ID Series Status Q R T
0 A 1 Begin
1 A 1 Begin
2 A 1 End 9 8 12
3 B 1 Begin
4 B 1 End 30 38
5 B 2 Begin
6 B 2 End 14 21
7 C 1 Begin
8 C 1 End 3 6
9 C 2 Begin
10 C 2 Begin
11 C 2 End 17 7 35
12 D 1 Begin
13 D 1 End 1 5
推荐阅读
- nginx - 如何在 nginx 服务器中只监听 443
- reporting-services - SSRS - 分隔字符串的一部分并返回
- php - array_key_exists 和 ?? 在 php 7.x 中
- javascript - 如果它们有索引,如何登录控制台javascript数组项目属性
- azure - 一对一处理服务总线队列消息
- kubernetes - Helm chart 在 configmap 更改时重新启动 pod
- c - C、MPI:程序未终止且未打印数字
- mysql - Laravel 更新获取更新的 id
- entity-framework-core - SQLite 与 SQLServer 之间的实体框架核心数据类型转换差异
- tensorflow - 如何从 TFRecordDataset parse_single_example 生成的张量板上删除/重命名函数