首页 > 解决方案 > Pandas np.where 或 np.select 在 1 行代码中生成多列

问题描述

我有一个df:

contract    RB  BeginDate   ValIssueDate    EndDate Valindex0
1   A00118  46  19000100    19880901    19841231    50
2   A00118  46  19850100    19880901    99999999    50
3   A00118  47  19000100    19880901    19831231    47
4   A00118  47  19840100    19880901    19841299    47

我需要根据条件添加新列:

condition = True

df['Valindex0']=np.where(condtion,df['Valindex0'],None)
df['RB']=np.where(condtion,df['RB'],None)

无论如何我可以像这样通过1行来做到这一点:

conditions=[True,True]

df[['Valindex0','RB']]=np.select[conditions,[df['Valindex0'],df['RB']],default=None]

或者

df['Valindex0'],df['RB']=np.select[conditions,[df['Valindex0'],df['RB']],default=None]

标签: pythonpandasdataframenumpy

解决方案


np.where如果所有参数都是兼容的形状,则将起作用。一种选择是tile条件与感兴趣的列的形状相同:

condition = df['RB'].eq(46)  # Some more interesting condition than True
df[['RB', 'Valindex0']] = np.where(
    np.tile(condition.values[:, None], 2), # Make condition match DataFrame columns
    df[['RB', 'Valindex0']],
    None
)

df

  contract    RB  BeginDate  ValIssueDate   EndDate Valindex0
0   A00118    46   19000100      19880901  19841231        50
1   A00118    46   19850100      19880901  99999999        50
2   A00118  None   19000100      19880901  19831231      None
3   A00118  None   19840100      19880901  19841299      None

通过创建具有正确形状的结构(如另一个 DataFrame),也可以完成多个条件:

df[['RB', 'Valindex0']] = np.where(
    pd.DataFrame({
        'cond1': df['RB'].eq(46),
        'cond2': df['Valindex0'].eq(47)
    }),
    df[['RB', 'Valindex0']],
    None
)

df

  contract    RB  BeginDate  ValIssueDate   EndDate Valindex0
0   A00118    46   19000100      19880901  19841231      None
1   A00118    46   19850100      19880901  99999999      None
2   A00118  None   19000100      19880901  19831231        47
3   A00118  None   19840100      19880901  19841299        47

数据框和导入:

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'contract': ['A00118', 'A00118', 'A00118', 'A00118'],
    'RB': [46, 46, 47, 47],
    'BeginDate': [19000100, 19850100, 19000100, 19840100],
    'ValIssueDate': [19880901, 19880901, 19880901, 19880901],
    'EndDate': [19841231, 99999999, 19831231, 19841299],
    'Valindex0': [50, 50, 47, 47]
})

推荐阅读