python - Pandas: fill null values with groupby values
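Problem description
I am trying to fill the null values in every numeric column of a DataFrame.
The code below iterates over each numeric column, groups by a categorical feature, and computes the median of the target column.
We then create a new column that copies the value when it is present; when it is null, it should instead take the groupby value for the category of the row that contains the n/a.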
# fill in numeric nulls with median based on job
for i in dfint:
    print(i)
for i in dfint:
    if i in ["TARGET_BAD_FLAG", "TARGET_LOSS_AMT"]: continue
    print(i)
    group=df.groupby("JOB")[i].median()
    print(group)
    df["IMP_"+i]=df[i].fillna(group[group.index.get_loc(df.loc[df[i].isna(),"JOB"])])
    #the line below works but fills in all nulls with the median for the "Mgr" job category, the code above should find the job category for the null record and pull the groupby value
    #df["IMP_"+i]=df[i].fillna(group[group.index.get_loc("Mgr")])
I seem to have a problem with the expression passed to .get_loc; here is the output:
TARGET_BAD_FLAG
TARGET_LOSS_AMT
LOAN
MORTDUE
VALUE
YOJ
DEROG
DELINQ
CLAGE
NINQ
CLNO
DEBTINC
LOAN
JOB
Mgr 18100
Office 16200
Other 15200
ProfExe 17300
Sales 14300
Self 24000
Name: LOAN, dtype: int64
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-207-f8a76179c818> in <module>
8 group=df.groupby("JOB")[i].median()
9 print(group)
---> 10 df["IMP_"+i]=df[i].fillna(group[group.index.get_loc(df.loc[df[i].isna(),"JOB"])])
11 #the line below works but fills in all nulls with the median for the "Mgr" job category, the code above should find the job category for the null record and pull the groupby value
12 #df["IMP_"+i]=df[i].fillna(group[group.index.get_loc("Mgr")])
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2895 )
2896 try:
-> 2897 return self._engine.get_loc(key)
2898 except KeyError:
2899 return self._engine.get_loc(self._maybe_cast_indexer(key))
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
TypeError: 'Series([], Name: JOB, dtype: object)' is an invalid key
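Is there a way to modify that line so it works as intended?
Solution
You wrote this: df.loc[df[i].isna(),"JOB"]
That expression returns a pandas Series, not the single key (such as "Mgr") that pandas.Index.get_loc requires.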
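One possible way to get the intended per-row behavior is to avoid get_loc altogether and let pandas align the group medians with the rows. The sketch below is only an illustration under stated assumptions: the sample DataFrame is made up, and numeric_cols is a hypothetical stand-in for the dfint list from the question. It uses groupby(...).transform("median"), which returns a Series with the same index as df, so fillna can pick the median of each row's own JOB.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names mirror the question.
df = pd.DataFrame({
    "JOB": ["Mgr", "Mgr", "Mgr", "Office", "Office", "Office"],
    "LOAN": [100.0, np.nan, 300.0, 10.0, 20.0, np.nan],
    "TARGET_BAD_FLAG": [0, 1, 0, 0, 1, 0],
})

# Assumed equivalent of `dfint`: numeric columns, minus the target columns.
skip = ["TARGET_BAD_FLAG", "TARGET_LOSS_AMT"]
numeric_cols = [
    c for c in df.select_dtypes(include="number").columns if c not in skip
]

for col in numeric_cols:
    # transform("median") broadcasts each JOB's median back onto every row,
    # so the result is aligned with df's index rather than indexed by JOB.
    group_median = df.groupby("JOB")[col].transform("median")
    # fillna with an aligned Series fills each null from the same row.
    df["IMP_" + col] = df[col].fillna(group_median)

print(df)
```

With this sample data, the missing Mgr loan is filled with the Mgr median and the missing Office loan with the Office median. An equivalent alternative, keeping the group Series from the question, would be df[i].fillna(df["JOB"].map(group)).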