Data Manipulation - Get data from another column if `nan`

Problem description

I have a Pandas DataFrame of 23 columns and 1119 rows.

Here is the issue: columns 13, 14, 20 and 21 are of float dtype.

If the data in columns 13 and 14 is NaN, then it is present in columns 20 and 21, and vice versa.

I want to create a new column that takes its value from one pair of columns and, if that value is missing, from the other pair.

Example: if columns 13 and 14 are NaN, then get the value from columns 20 and 21.

Here is what I came up with: I created a function that iterates with itertuples:

def AP_calc(df):
    for i in df.itertuples():
        if i[20]==np.nan & i[21]==np.nan:
           pool = i[13] + i[14]
        else:
            pool = i[20] + i[21]
        return pool

Then I called it with apply, but this does not work:

df["test"] = df[['AP in %','AP_M in %','FixP in €','FixP C in €']].apply(AP_calc,axis=1)

I have tried other methods too, but none of them work. Please help me out.

Tags: python, pandas, dataframe, data-manipulation

Solution


Use numpy.where with a mask created by Series.isna:

m = df['FixP in €'].isna() & df['FixP C in €'].isna()
df["test"] = np.where(m, df['AP in %'] + df['AP_M in %'], df['FixP in €'] + df['FixP C in €'])
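
One likely reason the loop in the question cannot work is that NaN never compares equal to itself, so a test such as `== np.nan` is always False; `isna` is the reliable check. The snippet above can be tried on a small frame as a minimal sketch. The column names come from the question, but the values are invented for illustration.

```python
import numpy as np
import pandas as pd

# NaN is never equal to itself, so an equality test cannot detect it.
nan_equals_nan = np.nan == np.nan  # False

# Toy frame: values are invented, column names are from the question.
df = pd.DataFrame({
    'AP in %': [1.0, np.nan],
    'AP_M in %': [2.0, np.nan],
    'FixP in €': [np.nan, 10.0],
    'FixP C in €': [np.nan, 20.0],
})

# True where both FixP columns are missing.
m = df['FixP in €'].isna() & df['FixP C in €'].isna()

# Take the AP pair where the FixP pair is missing, otherwise the FixP pair.
df['test'] = np.where(m,
                      df['AP in %'] + df['AP_M in %'],
                      df['FixP in €'] + df['FixP C in €'])
```

Because the mask and both candidate results are computed for whole columns at once, no row-by-row loop or apply is needed.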

Or:

c1 = ['FixP in €','FixP C in €']
c2 = ['AP in %','AP_M in %']

m = df[c2].isna().all(axis=1)
df["test"] = np.where(m, df[c1].sum(axis=1), df[c2].sum(axis=1))
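
Note that `+` and `sum(axis=1)` treat NaN differently, which matters if a row has only one value of a pair. The sketch below uses invented values to show the difference; `min_count=1` is an optional argument of `DataFrame.sum` that keeps an all-NaN row as NaN instead of 0.

```python
import numpy as np
import pandas as pd

# Invented values: row 0 has only one AP value, row 1 has none.
df = pd.DataFrame({
    'AP in %': [1.0, np.nan],
    'AP_M in %': [np.nan, np.nan],
})
c2 = ['AP in %', 'AP_M in %']

# Addition propagates NaN: both rows become NaN.
plus = df['AP in %'] + df['AP_M in %']

# sum skips NaN by default: row 0 is 1.0, the all-NaN row becomes 0.0.
summed = df[c2].sum(axis=1)

# With min_count=1 a row needs at least one value, so row 1 stays NaN.
summed_strict = df[c2].sum(axis=1, min_count=1)
```

If the two pairs are strictly complementary, as the question describes, both variants should give the same result.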

Alternatively, select by position with DataFrame.iloc:

c1 = [20,21]
c2 = [13,14]

m = df.iloc[:, c2].isna().all(axis=1)
df["test"] = np.where(m, df.iloc[:, c1].sum(axis=1), df.iloc[:, c2].sum(axis=1))
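
Keep in mind that `iloc` positions are 0-based, while the tuples from `itertuples()` carry the index at position 0, so the numbers used in the question's loop may be off by one relative to `iloc`. A hedged sketch that looks the positions up from the column names instead of hard-coding them; the frame, its values and the extra `other` column are invented for illustration.

```python
import numpy as np
import pandas as pd

# Toy frame: 'other' and all values are invented for illustration.
df = pd.DataFrame({
    'other': [0, 0],
    'AP in %': [1.0, np.nan],
    'AP_M in %': [2.0, np.nan],
    'FixP in €': [np.nan, 10.0],
    'FixP C in €': [np.nan, 20.0],
})

# Look up 0-based positions from the names rather than hard-coding them.
c1 = df.columns.get_indexer(['FixP in €', 'FixP C in €'])
c2 = df.columns.get_indexer(['AP in %', 'AP_M in %'])

# True where both AP columns are missing.
m = df.iloc[:, c2].isna().all(axis=1)
df['test'] = np.where(m, df.iloc[:, c1].sum(axis=1), df.iloc[:, c2].sum(axis=1))

# itertuples() puts the index at tuple position 0, so tuple position k
# corresponds to iloc position k - 1.
first = next(df.itertuples())
```

Here `first[2]` and `df.iloc[0, 1]` refer to the same cell, which is why it is worth checking `df.columns` before relying on the numbers 13, 14, 20 and 21.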
