
Problem description

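I'm new to Python and have been working on this classification dataset to predict fertilizer. I get the error "Input contains NaN" even though I dropped the rows containing any NaN values. I really hope someone can help me solve this. Thanks in advance.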

import pandas as pd
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
%matplotlib inline
    
features = pd.read_csv('Fertilizer Prediction.csv')
features.head(5)
    
features.dropna(how='any').shape
    
y = features['Name']
X = features.drop(columns=['Name'])
    
for col in X.dtypes[X.dtypes == 'object'].index:
    for_dummy = X.pop(col)
    X = pd.concat([X, pd.get_dummies(for_dummy, prefix=col)], axis=1)
    
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
    
y_train.values.ravel()
X_train.values.ravel()
    
from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier
    
model().fit(X_train, y_train)

[Screenshots of the errors][1]

The dataset I'm using is from Kaggle; I've linked it here: https://www.kaggle.com/gdabhishek/fertilizer-prediction?select=Fertilizer+Prediction.csv

Tags: python

Solution


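According to the documentation for dropna, the method returns a new DataFrame by default and leaves the original unchanged. You need inplace=True (or you need to assign the result back to features) for the NaN rows to actually be removed. So, in your code, replace the line:

features.dropna(how='any').shape

with:

features.dropna(how='any', inplace=True)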
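
To see the difference, here is a minimal sketch on a small made-up DataFrame. The column values are hypothetical stand-ins for the fertilizer CSV, not the real data. It shows that calling dropna without keeping the result leaves the NaN rows in place, while assigning the result back removes them.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for 'Fertilizer Prediction.csv'
features = pd.DataFrame({
    "Temperature": [26.0, np.nan, 34.0, 32.0],
    "Soil Type": ["Sandy", "Loamy", None, "Red"],
    "Name": ["Urea", "DAP", "Urea", "20-20"],
})

# dropna returns a new DataFrame; discarding it leaves `features` unchanged
dropped_shape = features.dropna(how="any").shape
rows_with_nan_before = int(features.isna().any(axis=1).sum())

# Assigning the result back has the same effect as inplace=True
features = features.dropna(how="any")
rows_with_nan_after = int(features.isna().any(axis=1).sum())

print(dropped_shape, rows_with_nan_before, rows_with_nan_after)
```

Checking features.isna().sum() right before calling fit is a quick way to confirm that no missing values remain.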
