Conditional assignment of column values in Python

Problem description

Hi, I have an Excel sheet with 3 columns (name, age, place) that I read into a dataframe df.

df prints correctly.

Now I need to add 2 new columns based on the 3 columns above. Below is the function I wrote for this, but it does not evaluate correctly:

def setData(FinalCombinedSet):
    if(df['age']==np.nan):
        df['Final Column']='age column blank'
        df['Mumbai category']='NO'
        return df
    elif(df['name']=='def' and df['place']=='Mumbai'):
        df['Final column'] = 'category Mumbai'
        df['Mumbai category']='YES'
        return df
    else:
        df['Final Column'] = 'Other values'
        df['Mumbai category']='NO'
        return  df
df=df.apply(lambda df:setData(df),axis=1)

Tags: python-3.x, pandas, dataframe

Solution
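
It may help to see first why the function in the question cannot work as written. The sketch below uses a small hypothetical frame in place of the Excel data (the values are assumptions) and illustrates two likely culprits: an `==` comparison against `np.nan` is never true, and a whole-column comparison inside an `if` has no single truth value. The function also reads the global `df` rather than the row passed in by `apply`, so every test runs against entire columns.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Excel data (values are assumptions).
df = pd.DataFrame({"name": ["abc", "def"],
                   "age": [np.nan, 20],
                   "place": ["Bangalore", "Mumbai"]})

# NaN never compares equal to anything, including itself.
nan_equals_nan = (np.nan == np.nan)
eq_mask = (df["age"] == np.nan).tolist()   # every entry is False
isna_mask = df["age"].isna().tolist()      # True only where age is missing

# A whole column used in an `if` raises ValueError (ambiguous truth value).
try:
    if df["age"] == np.nan:
        pass
    ambiguous = False
except ValueError:
    ambiguous = True

print(nan_equals_nan, eq_mask, isna_mask, ambiguous)
```

Missing values are therefore better detected with `isna()`, and conditions on several columns are better combined element-wise with `&` than with `and`.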


Use np.select for a vectorized approach:

import pandas as pd
import numpy as np

df = pd.DataFrame({"name":["abc","def","ghi"],
                   "age":["","20","22"],
                   "place":["Bangalore","Mumbai","Mumbai"]})

df["final column"] = np.select([df["age"]=="",(df["name"]=="def")&(df["place"]=="Mumbai")],
                               ['age column blank','category Mumbai'],
                               default="Other values")
df['Mumbai category'] = np.select([df["age"]=="",(df["name"]=="def")&(df["place"]=="Mumbai")],
                               ["NO","YES"],
                               default="NO")

print (df)

Output:
  name age      place      final column Mumbai category
0  abc      Bangalore  age column blank              NO
1  def  20     Mumbai   category Mumbai             YES
2  ghi  22     Mumbai      Other values              NO
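
The example above marks a blank age with an empty string. When a sheet is loaded with `pd.read_excel`, empty cells usually arrive as NaN instead, so the `== ""` test may not match. A minimal variant, assuming the blanks are NaN, swaps in `isna()` and builds the condition list once so both columns share it. The sample values and the column name `Final column` are assumptions chosen to mirror the question.

```python
import numpy as np
import pandas as pd

# Assumed data: the blank age is stored as NaN, as pd.read_excel typically produces.
df = pd.DataFrame({"name": ["abc", "def", "ghi"],
                   "age": [np.nan, 20, 22],
                   "place": ["Bangalore", "Mumbai", "Mumbai"]})

# Conditions are checked in order; the first one that matches wins.
conditions = [
    df["age"].isna(),
    (df["name"] == "def") & (df["place"] == "Mumbai"),
]

df["Final column"] = np.select(conditions,
                               ["age column blank", "category Mumbai"],
                               default="Other values")
df["Mumbai category"] = np.select(conditions, ["NO", "YES"], default="NO")

print(df)
```

Because `np.select` takes the first matching condition, a row with a missing age is labelled `age column blank` even if its name and place would also match the Mumbai rule, which mirrors the `if`/`elif` order of the original function.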
