python - How to pd.fillna(mean()) according to a changing column value?
Problem description
I have the following DataFrame:
data/hora
2017-08-18 09:22:33 22162 NaN 65.9 NaN NaN
2017-10-03 11:08:26 22162 NaN 60.5 NaN NaN
2018-02-17 01:45:24 22162 NaN 69.7 NaN NaN
2018-02-17 01:45:55 74034 NaN 67.5 NaN NaN
2018-02-17 01:46:29 74034 NaN 65.4 NaN NaN
2018-02-17 01:47:20 74034 NaN 63.3 NaN NaN
2018-02-17 01:48:35 74034 NaN 61.3 NaN NaN
2018-02-17 01:49:08 17448 NaN 63.4 NaN NaN
2018-02-17 01:49:31 17448 NaN 65.5 NaN NaN
2018-02-17 01:49:55 17448 NaN 67.6 NaN NaN
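I want to fill the NaN values with a column mean. However, that mean changes with the "Machine" value, and there are three machine values. So I need the fillna value to change according to the Machine column.
I tried: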
for i in df:
    if i.isin(df.loc[df['Machine'] == '22162']):
        df.fillna(df.loc[df['Machine'] == '22162'].mean)
    elif i.isin(df.loc[df['Machine'] == '17448']):
        df.fillna(df.loc[df['Machine'] == '17448'].mean)
    elif i.isin(df.loc[df['Machine'] == '74034']):
        df.fillna(df.loc[df['Machine'] == '74034'].mean)
But it did not work.
Thanks!
Solution
This is hard-coded in places, but it should work. I named the NaN columns ['A', 'C', 'D'] and the populated column 'B':
data hora machine A B C D
0 2017-08-18 09:22:33 22162 NaN 65.9 NaN NaN
1 2017-10-03 11:08:26 22162 NaN 60.5 NaN NaN
2 2018-02-17 01:45:24 22162 NaN 69.7 NaN NaN
3 2018-02-17 01:45:55 74034 NaN 67.5 NaN NaN
4 2018-02-17 01:46:29 74034 NaN 65.4 NaN NaN
5 2018-02-17 01:47:20 74034 NaN 63.3 NaN NaN
6 2018-02-17 01:48:35 74034 NaN 61.3 NaN NaN
7 2018-02-17 01:49:08 17448 NaN 63.4 NaN NaN
8 2018-02-17 01:49:31 17448 NaN 65.5 NaN NaN
9 2018-02-17 01:49:55 17448 NaN 67.6 NaN NaN
columns = ['A', 'C', 'D']
# Per-machine mean of B, mapped back onto every row of df
machine_mean = df['machine'].map(df.groupby('machine')['B'].mean())
for clm in columns:
    df[clm] = df[clm].fillna(machine_mean)
The result is:
data hora machine A B C D
0 2017-08-18 09:22:33 22162 65.366667 65.9 65.366667 65.366667
1 2017-10-03 11:08:26 22162 65.366667 60.5 65.366667 65.366667
2 2018-02-17 01:45:24 22162 65.366667 69.7 65.366667 65.366667
3 2018-02-17 01:45:55 74034 64.375000 67.5 64.375000 64.375000
4 2018-02-17 01:46:29 74034 64.375000 65.4 64.375000 64.375000
5 2018-02-17 01:47:20 74034 64.375000 63.3 64.375000 64.375000
6 2018-02-17 01:48:35 74034 64.375000 61.3 64.375000 64.375000
7 2018-02-17 01:49:08 17448 65.500000 63.4 65.500000 65.500000
8 2018-02-17 01:49:31 17448 65.500000 65.5 65.500000 65.500000
9 2018-02-17 01:49:55 17448 65.500000 67.6 65.500000 65.500000
It may not be the best approach, but it gets the job done.
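A possible alternative, offered as a sketch rather than as the answer's own method: groupby(...).transform('mean') returns the per-machine mean already aligned with the rows of df, so no explicit mapping is needed. The DataFrame below is rebuilt from the table above, and the column names A to D follow the naming used in the solution; the variable names group_mean and cols are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample rebuilt from the table above (column names A-D follow the solution's naming).
df = pd.DataFrame({
    "machine": [22162] * 3 + [74034] * 4 + [17448] * 3,
    "A": np.nan,
    "B": [65.9, 60.5, 69.7, 67.5, 65.4, 63.3, 61.3, 63.4, 65.5, 67.6],
    "C": np.nan,
    "D": np.nan,
})

# transform("mean") returns a Series with the same index as df,
# holding each row's per-machine mean of B.
group_mean = df.groupby("machine")["B"].transform("mean")

cols = ["A", "C", "D"]
# apply passes each column to fillna, which aligns group_mean on the index.
df[cols] = df[cols].apply(lambda s: s.fillna(group_mean))

print(df)
```

Because fillna only touches missing entries, any values already present in A, C, or D would be left unchanged.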