An elegant way to replace multiple columns by condition

Problem description

Currently, I replace column values with the lines below whenever certain conditions are met.

df.loc[df['event'] == 1, 'pre'] = 0
df.loc[df['event'] == 1, 'post'] = 1

df.loc[df['event'] == 2, 'pre'] = 1
df.loc[df['event'] == 2, 'post'] = 0

df.loc[df['event'] == 4, 'pre'] = 1
df.loc[df['event'] == 4, 'post'] = 0

However, this does not scale.

Is there a more efficient way to do this?

import numpy as np
import pandas as pd
nfreq=500
arr=np.array([[11850,0,1],
[12310,0,3],
[13924,0,4],
[16690,0,1],
[17082,0,3],
[18746,0,4],
[21956,0,2],
[22324,0,3],
[23694,0,4],
[25382,0,1],
[25776,0,3],
[28592,0,4],
[31676,0,2],
[32028,0,3],
[33498,0,4]])
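# select rows whose event code (column 2) is 3; comparing the whole array would also match a 3 in other columns
trange = np.where(arr[:, 2] == 3)[0]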
val=np.array([arr[trange, 0],arr[trange, 2],
          (arr[trange, 0]-arr[trange-1, 0])/nfreq,
          (arr[trange+1, 0]-arr[trange, 0])/nfreq]).T


trange= np.where(arr[:,2] != 3)[0]
val_oth=np.array([arr[trange, 0],arr[trange, 2],arr[trange, 2],arr[trange, 2]]).T
val_oth[:,2]=1
val_oth[:,-1]=1
df = pd.DataFrame(np.vstack((val,val_oth)),columns=['timepoint','event','pre','post'])
df.loc[df['event'] == 1, 'pre'] = 0
df.loc[df['event'] == 1, 'post'] = 1

df.loc[df['event'] == 2, 'pre'] = 1
df.loc[df['event'] == 2, 'post'] = 0

df.loc[df['event'] == 4, 'pre'] = 1
df.loc[df['event'] == 4, 'post'] = 0
df.sort_values(by='timepoint', ascending=True,inplace=True)
df.reset_index(drop=True,inplace=True)

Tags: pandas, conditional-statements

Solution


You can set both columns at once, and combine the selections for events that are assigned the same values:

df.loc[df.event==1, ['pre','post']] = [0,1]
df.loc[df.event.isin([2,4]), ['pre','post']] = [1,0]
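
If the number of event codes keeps growing, the same idea can be driven by a lookup table instead of one line per group of codes. The following is only a minimal sketch, not part of the original answer: the rules dictionary and the four-row sample frame are assumptions made up for illustration, with values shaped like the question's data.

```python
import pandas as pd

# Small sample shaped like the question's frame (values are illustrative).
df = pd.DataFrame({
    "timepoint": [11850, 12310, 13924, 21956],
    "event": [1, 3, 4, 2],
    "pre": [1.0, 0.92, 1.0, 1.0],
    "post": [1.0, 3.228, 1.0, 1.0],
})

# Hypothetical rule table: event code -> values for pre and post.
rules = {1: (0, 1), 2: (1, 0), 4: (1, 0)}
lookup = pd.DataFrame.from_dict(rules, orient="index", columns=["pre", "post"])

for col in ["pre", "post"]:
    # map() looks each event code up in the rule table; codes without a rule
    # become NaN, and fillna() restores the original value for those rows.
    df[col] = df["event"].map(lookup[col]).fillna(df[col])
```

Rows with event 3 have no entry in the rule table, so they keep their computed pre and post values. Adding a new event code then only requires a new entry in the dictionary.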

If event is a computed column, I recommend converting it to int to avoid floating-point errors:

df['event'] = df.event.astype('int')
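
One caveat worth keeping in mind: astype('int') truncates toward zero, so a computed code that lands slightly below an integer would be cut down to the next lower one. The sketch below illustrates this with a made-up column (the values are an assumption, not data from the question) and shows rounding before the conversion as one possible safeguard.

```python
import numpy as np
import pandas as pd

# Hypothetical computed event codes; the first one ends up just below 1
# because of floating-point rounding error.
df = pd.DataFrame({"event": np.array([(1 - 0.9) * 10, 2.0, 4.0])})

# A direct comparison misses the first row, since its value is not exactly 1.
matches_one = (df["event"] == 1).tolist()

truncated = df["event"].astype("int")        # truncates toward zero
rounded = df["event"].round().astype("int")  # rounds to the nearest integer first
```

Here truncated starts with 0 while rounded starts with 1, so rounding first is the safer choice when the codes come out of arithmetic. In the question's data the event codes are copied from an integer array, so a plain astype('int') is sufficient there.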

