python - Data Manipulation - Get data from another column if `nan`
Problem description
I have a Pandas DataFrame with 23 columns and 1119 rows.
The issue is this: columns 13, 14, 20 and 21 are of float dtype.
If the data in columns 13 and 14 is NaN, then it is present in columns 20 and 21, and vice versa.
I want to create a new column that, where a value is missing, takes it from the other pair of columns.
Example: if columns 13 and 14 are NaN, then get the value from columns 20 and 21.
Here is what I came up with. I created a function and iterated using itertuples:
def AP_calc(df):
    for i in df.itertuples():
        if i[20]==np.nan & i[21]==np.nan:
            pool = i[13] + i[14]
        else:
            pool = i[20] + i[21]
    return pool
Then I used apply, but this does not work:
df["test"] = df[['AP in %','AP_M in %','FixP in €','FixP C in €']].apply(AP_calc,axis=1)
I have tried other methods too, but none of them work. Please help me out.
Solution
Use numpy.where with a mask created by Series.isna:
m = df['FixP in €'].isna() & df['FixP C in €'].isna()
df["test"] = np.where(m, df['AP in %'] + df['AP_M in %'], df['FixP in €'] + df['FixP C in €'])
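As a rough illustration of the mask approach above, here is a minimal sketch on a small, made-up DataFrame. The column names come from the question; the values are hypothetical. It also shows one likely reason the `== np.nan` test in the question never matches: NaN does not compare equal to anything, itself included.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "AP in %": [1.0, np.nan, 3.0],
    "AP_M in %": [2.0, np.nan, 4.0],
    "FixP in €": [np.nan, 10.0, np.nan],
    "FixP C in €": [np.nan, 20.0, np.nan],
})

# NaN never compares equal, so `value == np.nan` is always False.
# isna() is the reliable way to test for missing values.
nan_equals_itself = (np.nan == np.nan)

# True for rows where both FixP columns are missing.
m = df["FixP in €"].isna() & df["FixP C in €"].isna()

# Where the mask is True take the AP pair, otherwise take the FixP pair.
df["test"] = np.where(
    m,
    df["AP in %"] + df["AP_M in %"],
    df["FixP in €"] + df["FixP C in €"],
)
print(df["test"].tolist())
```

Because np.where works on whole columns at once, no row loop or apply call is needed.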
Or:
c1 = ['FixP in €','FixP C in €']
c2 = ['AP in %','AP_M in %']
m = df[c2].isna().all(axis=1)
df["test"] = np.where(m, df[c1].sum(axis=1), df[c2].sum(axis=1))
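The `+` variant and the `sum(axis=1)` variant can give different results when only one column of a pair is missing. The sketch below uses hypothetical columns `a` and `b` to show the difference; which behavior is wanted depends on the data, so treat it as a comparison rather than a recommendation.

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 0 is complete, row 1 is half missing, row 2 is empty.
demo = pd.DataFrame({"a": [1.0, 2.0, np.nan], "b": [2.0, np.nan, np.nan]})

# `+` propagates NaN: the result is NaN if either operand is NaN.
plus = demo["a"] + demo["b"]

# sum(axis=1) skips NaN by default, so an all-NaN row sums to 0.0.
row_sum = demo.sum(axis=1)

# min_count=1 keeps an all-NaN row as NaN instead of 0.0.
strict_sum = demo.sum(axis=1, min_count=1)

print(pd.DataFrame({"plus": plus, "sum": row_sum, "sum_min_count": strict_sum}))
```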
Alternatively, select by position with DataFrame.iloc:
c1 = [20,21]
c2 = [13,14]
m = df.iloc[:, c2].isna().all(axis=1)
df["test"] = np.where(m, df.iloc[:, c1].sum(axis=1), df.iloc[:, c2].sum(axis=1))
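One thing to check with positional selection: `iloc` positions are 0-based and count only columns, whereas tuples from `itertuples()` hold the index label at position 0, so the same number can point at different columns. The sketch below, on a hypothetical five-column frame (the `id` column is invented for illustration), looks the positions up from the column names with `Index.get_indexer` instead of hard-coding them.

```python
import numpy as np
import pandas as pd

# Hypothetical frame; "id" is an invented extra column.
df = pd.DataFrame({
    "id": [1, 2],
    "AP in %": [1.0, np.nan],
    "AP_M in %": [2.0, np.nan],
    "FixP in €": [np.nan, 10.0],
    "FixP C in €": [np.nan, 20.0],
})

# Look up 0-based column positions from the names.
c1 = df.columns.get_indexer(["FixP in €", "FixP C in €"]).tolist()
c2 = df.columns.get_indexer(["AP in %", "AP_M in %"]).tolist()

# itertuples() puts the index label first, so tuple position k + 1
# corresponds to iloc column position k.
first = next(iter(df.itertuples()))
same_value = first[c2[0] + 1] == df.iloc[0, c2[0]]

# True where both AP columns are missing.
m = df.iloc[:, c2].isna().all(axis=1)
df["test"] = np.where(m, df.iloc[:, c1].sum(axis=1), df.iloc[:, c2].sum(axis=1))
print(c1, c2, df["test"].tolist())
```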