python - Error when replacing code that uses the numpy variant with code that uses the pandas variant
Problem description
Here is the code:
# from: https://stackoverflow.com/questions/60101845/compare-multiple-pandas-columns-1st-and-2nd-after-3rd-and-4rth-after-etc-wit
# from: https://stackoverflow.com/questions/27474921/compare-two-columns-using-pandas?answertab=oldest#tab-top
# from: https://stackoverflow.com/questions/60099141/negation-in-np-select-condition
import pandas as pd
import numpy as np
df = pd.DataFrame({
    'var1': ['a', 'b', 'c', np.nan, np.nan],
    'var2': [1, 2, np.nan, 4, np.nan],
    'var3': [np.nan, "x", np.nan, "y", "z"],
    'var4': [np.nan, 4, np.nan, 5, 6],
    'var5': ["a", np.nan, "b", np.nan, "c"],
    'var6': [1, np.nan, 2, np.nan, 3]
})
col1 = ["var1", "var3", "var5"]
col2 = ["var2", "var4", "var6"]
colR = ["Result1", "Result2", "Result3"]
s1 = df[col1].isnull().to_numpy()
s2 = df[col2].isnull().to_numpy()
conditions = [~s1 & ~s2, s1 & s2, ~s1 & s2, s1 & ~s2]
choices = ["Both values", np.nan, df[col1], df[col2]]
df = pd.concat([df, pd.DataFrame(np.select(conditions, choices), columns=colR, index=df.index)], axis=1)
The result (df) looks like this:
var1 var2 var3 var4 var5 var6 Result1 Result2 Result3
0 a 1.0 NaN NaN a 1.0 Both values NaN Both values
1 b 2.0 x 4.0 NaN NaN Both values Both values NaN
2 c NaN NaN NaN b 2.0 c NaN Both values
3 NaN 4.0 y 5.0 NaN NaN 4 Both values NaN
4 NaN NaN z 6.0 c 3.0 NaN Both values Both values
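It works, but there is a problem with missing values in choices (there is more to say about that, but it is not important for my current question). Now, instead of np.select(conditions, choices), I need to use code like the following (the idea is to use pandas rather than numpy to avoid the problem described in the links above):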
pd.DataFrame(choices).where(conditions).ffill().fillna(0).iloc[-1]
or this:
pd.DataFrame(choices).where(conditions,0).sum()
If I simply replace that part of the code, I get an error:
runfile('D:/del/untitlejhgd0.py', wdir='D:/del')
Traceback (most recent call last):
File "<ipython-input-16-10cdf307d77c>", line 1, in <module>
runfile('D:/del/untitlejhgd0.py', wdir='D:/del')
File "C:\ProgramData\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 827, in runfile
execfile(filename, namespace)
File "C:\ProgramData\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 110, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "D:/del/untitlejhgd0.py", line 29, in <module>
df = pd.concat([df, pd.DataFrame((pd.DataFrame(conditions).where(choices).ffill().fillna(0).iloc[-1]), columns=colR, index=df.index)], axis=1)
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\frame.py", line 488, in __init__
mgr = init_ndarray(data, index, columns, dtype=dtype, copy=copy)
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\internals\construction.py", line 169, in init_ndarray
values = prep_ndarray(values, copy=copy)
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\internals\construction.py", line 295, in prep_ndarray
raise ValueError("Must pass 2-d input")
ValueError: Must pass 2-d input
Question: how can I replace the code part above so that the code works?
Solution
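One possible approach, offered as a sketch rather than a definitive answer. The ValueError in the traceback appears to come from passing conditions to pd.DataFrame: conditions is a list of four 2-D boolean arrays, which stacks into a 3-D array, and a DataFrame can only be built from 2-D input. Instead of wrapping conditions or choices in a DataFrame, the same four cases can be expressed with DataFrame.where and DataFrame.mask on the column groups themselves. The names left, right, has_left, has_right, result and out are illustrative and do not appear in the question; the cast to object dtype is an assumption made so that strings and numbers can share a column.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "var1": ["a", "b", "c", np.nan, np.nan],
    "var2": [1, 2, np.nan, 4, np.nan],
    "var3": [np.nan, "x", np.nan, "y", "z"],
    "var4": [np.nan, 4, np.nan, 5, 6],
    "var5": ["a", np.nan, "b", np.nan, "c"],
    "var6": [1, np.nan, 2, np.nan, 3],
})
col1 = ["var1", "var3", "var5"]
col2 = ["var2", "var4", "var6"]
colR = ["Result1", "Result2", "Result3"]

# Copy both column groups as object dtype and give them the result labels,
# so that pandas aligns them cell by cell on identical index and columns.
left = df[col1].astype(object)
left.columns = colR
right = df[col2].astype(object)
right.columns = colR

has_left = left.notna()
has_right = right.notna()

# Keep the left value where it exists, otherwise fall back to the right value.
# Cells where both are missing stay NaN because the right value is NaN too.
result = left.where(has_left, right)

# Overwrite the cells where both sides have a value.
result = result.mask(has_left & has_right, "Both values")

out = pd.concat([df, result], axis=1)
print(out)
```

With this variant the missing cells remain real NaN values and the value taken from var2 in row 3 stays numeric, because nothing is routed through a single numpy array of mixed choices.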