Multiple boolean conditions in a pandas vectorized operation

Problem description

I like vectorization, and I have the following df:

df = pd.DataFrame({'p1':['apple','orange'],
                   'p1_dog':['True', 'False'],
                   'p2':['quick','start'],
                   'p2_dog':['True', 'True'],
                   'p3':['ash','sword'],
                   'p3_dog':['False','False']})

I am trying to create a new column whose value is p1, p2, or p3, depending on the values in p1_dog, p2_dog, and p3_dog.

Using this code:

df['final'] = 0
df['final'] = [[(p1 if p1_dog == p2_dog == p3_dog == True)\
                     | (p2 if (p1_dog == False) &  (p2_dog == p3_dog == True)\
                        |(p3 if (p1_dog == p2_dog == False) & (p3_dog == True))) for x in df['final']]]  

It does not work, though... please help. Where is my mistake?

Tags: python, pandas, vectorization

Solution


A working version of the answer given by mortysporty... thanks again, comrade! I only slightly enhanced the boolean checks.

import pandas as pd

def evaluate(p1, p2, p3, p1_dog, p2_dog, p3_dog):
    # Every clause below requires p1_dog, so this test is equivalent to `if p1_dog:`
    if (p1_dog and p2_dog and p3_dog) or (p1_dog and p2_dog) or (p1_dog and p3_dog) or (p1_dog):
        return p1
    elif (p2_dog and p3_dog) or (p2_dog):
        # If you are getting here... p1_dog must be False
        return p2
    elif p3_dog:
        # ...same here. p1_dog and p2_dog must be False
        return p3
    else:
        return "I dont know what you want to happen here"

a = pd.DataFrame({'p1': ['apple', 'orange', 'ball'],
                  'p1_dog': [True, False, False],
                  'p2': ['quick', 'start', 'heck'],
                  'p2_dog': [False, True, True],
                  'p3': ['ash', 'sword', 'soop'],
                  'p3_dog': [True, False, True]})

# Apply evaluate() row by row; this is a Python-level loop, not a vectorized operation
a['final'] = [evaluate(*p) for p in zip(a['p1'], a['p2'], a['p3'],
                                        a['p1_dog'], a['p2_dog'], a['p3_dog'])]
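
The list comprehension above still loops over the rows in Python. Since the question asked for a vectorized approach, here is a minimal sketch of one possible alternative using numpy.select. It is not part of mortysporty's answer. It assumes the *_dog columns hold real booleans, and the 'unknown' default is a placeholder chosen only for illustration.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame({'p1': ['apple', 'orange', 'ball'],
                  'p1_dog': [True, False, False],
                  'p2': ['quick', 'start', 'heck'],
                  'p2_dog': [False, True, True],
                  'p3': ['ash', 'sword', 'soop'],
                  'p3_dog': [True, False, True]})

# Conditions are checked in order and the first True one wins,
# which mirrors the if / elif chain in evaluate().
conditions = [a[col].to_numpy() for col in ('p1_dog', 'p2_dog', 'p3_dog')]
choices = [a[col].to_numpy() for col in ('p1', 'p2', 'p3')]

# 'unknown' (an illustrative placeholder) fills rows where no flag is True.
a['final'] = np.select(conditions, choices, default='unknown')
```

If the flags are stored as the strings 'True' and 'False', as in the question's df, they would need converting to booleans first, for example with df['p1_dog'].eq('True'). Any non-empty string, including 'False', is truthy in Python.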
