Machine learning model keeps giving the same result even when the input is different

Problem description

my_dict = {'Stype': {'black': 1, 'clayey': 2, 'loamy': 3, 'red': 4, 'sandy': 5},
          'Ph': {'extremely acidic': 1, 'moderately acidic': 2, 'moderately alkaline': 3, 'neutral': 4, 'slightly acidic': 5, 'slightly alkaline': 6, 'strongly acidic': 7, 'strongly alkaline': 8, 'very strongly acidic': 9, 'very strongly alkaline': 10},
          }
    
labels = cropname['Stype'].astype('category').cat.categories.tolist()
replace1 = {'Stype': {k: v for k,v in zip(labels, list(range(1,len(labels) + 1)))}}
print(replace1)

labels1 = cropname['Ph'].astype('category').cat.categories.tolist()
replace2 = {'Ph': {k: v for k,v in zip(labels1, list(range(1,len(labels1) + 1)))}}
print(replace2)

cropname_replace = cropname.copy()

cropname_replace.replace(replace1, inplace=True)
cropname_replace.replace(replace2, inplace=True)
print(cropname_replace.head())

The output I get from the program above is:

   Temparature  Humidity   Moisture  Stype suitable-crop  Ph
0           26         52        38      5         Maize   5
1           32         62        34      4   Ground Nuts   4
2           29         52        45      3     Sugarcane   6
3           34         65        62      1        Cotton   2
4           26         14        35      5        Barley   4
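For readers following the encoding step, here is a minimal sketch of how the mapping is derived. It assumes the soil types are plain strings; the `soil` series is made-up toy data, not the real dataset. pandas sorts the inferred categories, so the codes come out in alphabetical order, which is why they line up with the hand-written `my_dict`.

```python
import pandas as pd

# Toy stand-in for cropname['Stype'] (values invented for illustration).
soil = pd.Series(["sandy", "red", "loamy", "black", "sandy", "clayey"])

# astype('category') infers the categories as the sorted unique values.
labels = soil.astype("category").cat.categories.tolist()

# Number the sorted labels from 1, as the question's code does.
mapping = {label: code for code, label in enumerate(labels, start=1)}
print(mapping)

# Apply the mapping to get the integer-encoded column.
encoded = soil.map(mapping)
print(encoded.tolist())
```

Reusing the derived `mapping` at prediction time, rather than a separately typed dictionary, would keep the two from drifting apart if the dataset's categories ever change.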

Then I fit a random forest model:

y = cropname_replace['suitable-crop']
X = cropname_replace.drop(columns=['suitable-crop'])

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier()

model = model.fit(X_train,y_train)

predictions = model.predict(X_test)

The model also reports an accuracy of 0.9. But when I pass input to it, it keeps returning "Barley" as the result.

pd.to_pickle(model,r'Desktop')
model = pd.read_pickle(r'Desktop')

Soiltype = input('Enter the soil type:').lower()
pH = input('Enter the pH type:').lower()
Temperature = input('Enter temperature:')
Humidity = input('Enter Humidity:')
Moisture = input('Enter moisture:')


Stype1 = (my_dict['Stype'][Soiltype])

pH1 = (my_dict['Ph'][pH])

result = model.predict([[Stype1, pH1, Temperature, Humidity, Moisture]])
print(result)

Here is a screenshot of my output: SS 2

The dataset is given here:

Tags: python, python-3.x, machine-learning

Solution


from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
X_train, y_train = make_classification(n_samples=1000, n_features=5,
                           n_informative=2, n_redundant=0,
                           random_state=0, shuffle=False)
model = RandomForestClassifier(max_depth=2, random_state=0)
model.fit(X_train, y_train)

See also: https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html

https://scikit-learn.org/stable/modules/generated/sklearn.datasets.make_classification.html
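One thing worth checking in the question's prediction code is the feature order. The printed table shows the training columns as Temparature, Humidity, Moisture, Stype, Ph once suitable-crop is dropped. The call to model.predict, however, passes Stype, Ph, Temperature, Humidity, Moisture, and the values read with input() are still strings. The sketch below illustrates building the row in the training order with numeric values. It is not a confirmed fix: the toy rows and the helper `build_row` are invented for this example.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Toy training frame in the column order printed in the question
# (the rows are invented for illustration only).
train = pd.DataFrame({
    "Temparature": [26, 32, 29, 34, 26, 30],
    "Humidity": [52, 62, 52, 65, 14, 60],
    "Moisture": [38, 34, 45, 62, 35, 40],
    "Stype": [5, 4, 3, 1, 5, 2],
    "Ph": [5, 4, 6, 2, 4, 3],
    "suitable-crop": ["Maize", "Ground Nuts", "Sugarcane",
                      "Cotton", "Barley", "Maize"],
})
X = train.drop(columns=["suitable-crop"])
y = train["suitable-crop"]
model = RandomForestClassifier(random_state=0).fit(X, y)


def build_row(feature_names, soil_code, ph_code,
              temperature, humidity, moisture):
    """Hypothetical helper: one-row frame in the training column order."""
    values = {
        "Temparature": float(temperature),  # input() returns str, so convert
        "Humidity": float(humidity),
        "Moisture": float(moisture),
        "Stype": int(soil_code),
        "Ph": int(ph_code),
    }
    # Reorder the columns to match what the model was fitted on.
    return pd.DataFrame([values])[list(feature_names)]


row = build_row(X.columns, soil_code=5, ph_code=5,
                temperature="26", humidity="52", moisture="38")
print(list(row.columns))
print(model.predict(row))
```

Passing a DataFrame with named columns, instead of a bare nested list, makes an ordering mistake easier to spot, because the columns can be compared directly against `X.columns`.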

