python - CategoricalNB in scikit-learn raises an IndexError
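Problem description
I am trying to use sklearn to compare different classification methods. I have string data with surname, name, and gender values, and I want to see how well each classifier predicts the gender value. However, I get an error with categorical naive Bayes: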
import os
import pandas as pd
import numpy as np
from sklearn.naive_bayes import CategoricalNB
from sklearn.model_selection import train_test_split
from sklearn import metrics

if __name__ == "__main__":
    # folder_path_one_level_up is defined in a part of the script that is not shown.
    csv_with_all_surnames = folder_path_one_level_up + os.sep + "Results" + os.sep + "surnames_labeled_all.csv"
    csv_naive_bayes_categorical_results = folder_path_one_level_up + os.sep + "Results" + os.sep + "Naive_Bayes_Categorical_results_names_only.csv"
    total_accuracy_scores = []

    data_to_be_tested = pd.read_csv(csv_with_all_surnames, header=None)
    column_names = ['Surname', 'Name', 'Gender']
    data_to_be_tested.columns = column_names

    # Use only the 'Name' column as the feature; 'Gender' is the target.
    names_only = data_to_be_tested.drop(['Surname', 'Gender'], axis=1)
    genders = data_to_be_tested['Gender']

    # Encode the strings as integer codes, numbered in order of first appearance.
    names_only = names_only.apply(lambda x: pd.factorize(x)[0])
    genders = genders.factorize()
    genders = genders[0].copy()

    accuracy_score = []
    # Try every test-set size from 99% down to 1%, without shuffling the rows.
    for test_percent in range(99, 0, -1):
        x_train, x_test, y_train, y_test = train_test_split(names_only, genders, test_size=(test_percent/100), shuffle=False)
        classifier_naive_bayes_categorical = CategoricalNB()
        classifier_naive_bayes_categorical = classifier_naive_bayes_categorical.fit(x_train, y_train)
        y_pred_naive_bayes_categorical = classifier_naive_bayes_categorical.predict(x_test)
        naive_bayes_categorical_accuracy_score = round(metrics.accuracy_score(y_test, y_pred_naive_bayes_categorical), 3)
        # text is built in a part of the script that is not shown.
        with open("logFile6.txt", "a+") as file_results_log:
            file_results_log.write(text + "\n")
        accuracy_score.append(naive_bayes_categorical_accuracy_score)
    total_accuracy_scores.append(accuracy_score)
"C:\My Files\Upper level folders\Test\Results\surnames_labeled_all.csv" exists.
Traceback (most recent call last):
  File "C:\My Files\Upper level folders\Test\python\naive_bayes_categorical_comp_no_duplicates_all_names_name.py", line 85, in <module>
    y_pred_naive_bayes_categorical = classifier_naive_bayes_categorical.predict(x_test)
  File "C:\Users\ulvi95\anaconda3\lib\site-packages\sklearn\naive_bayes.py", line 75, in predict
    jll = self._joint_log_likelihood(X)
  File "C:\Users\ulvi95\anaconda3\lib\site-packages\sklearn\naive_bayes.py", line 1303, in _joint_log_likelihood
    jll += self.feature_log_prob_[i][:, indices].T
IndexError: index 815 is out of bounds for axis 1 with size 815
How can this problem be solved?
Solution
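A likely explanation, judging from the traceback: `pd.factorize` numbers the names in order of first appearance, and `train_test_split(..., shuffle=False)` puts the later rows into the test set, so `x_test` contains codes that never occurred in `x_train`. `CategoricalNB` sizes its per-feature probability table from the largest code seen during `fit` (here 815 categories, codes 0 to 814), so looking up code 815 at prediction time goes out of bounds. A minimal sketch of one way around this, assuming scikit-learn 0.24 or later, is to pass the `min_categories` parameter so that the table covers every code in the full data set. The toy arrays below are made up for illustration and are not the asker's data.

```python
import numpy as np
from sklearn.naive_bayes import CategoricalNB

# Hypothetical toy data: one categorical feature with integer codes 0..5,
# numbered in order of first appearance (as pd.factorize does).
X = np.array([[0], [1], [2], [3], [4], [5]])
y = np.array([0, 1, 0, 1, 0, 1])

# Unshuffled split: the training rows only contain codes 0..2.
X_train, y_train = X[:3], y[:3]
X_test = X[3:]

# Without min_categories the fitted model only knows 3 categories, so
# predicting on codes 3..5 is expected to fail with an IndexError.
plain = CategoricalNB().fit(X_train, y_train)
try:
    plain.predict(X_test)
    plain_failed = False
except IndexError:
    plain_failed = True

# Tell the model how many categories each feature has in the full data set.
# Categories unseen during training then get the smoothed probability.
n_categories = X.max(axis=0) + 1
fixed = CategoricalNB(min_categories=n_categories).fit(X_train, y_train)
y_pred = fixed.predict(X_test)
```

In the question's script the equivalent change should be `CategoricalNB(min_categories=names_only.max(axis=0) + 1)`, because `names_only` is factorized before the split. Names that never appear in the training rows then receive only the smoothed (`alpha`) probability, so predictions for them carry little information about gender.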