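python - Unknown logic error when using Tkinter
Problem description
I am working on a spam classifier model. I have built a GUI with Tkinter and wired a button to a function that classifies the entered text. Running it raises no errors, but the model does not separate spam from ham: whatever text I enter, the prediction is always spam. Any help would be much appreciated.
Code for my model (including the button callback):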
# Load the dataset and encode the labels: ham -> 0, spam -> 1
data = pd.read_csv("C:\\Users\\user\\Desktop\\Python\\Spyder\\Email spam Classifier\\spam.csv",
                   encoding='latin-1')
data['Category'] = data['Category'].map({'ham': 0, 'spam': 1})

# Clean each message: keep letters only, lowercase, drop stop words, lemmatize
lemmatizer = WordNetLemmatizer()
stop_words = set(stopwords.words('english'))  # built once instead of once per word
corpus = []
for i in range(0, len(data)):
    # fixed from 'a-zA-z', whose range also matches some punctuation
    review = re.sub('[^a-zA-Z]', ' ', data['Message'][i])
    review = review.lower()
    review = review.split()
    review = [lemmatizer.lemmatize(word) for word in review if word not in stop_words]
    review = ' '.join(review)
    corpus.append(review)

# Bag-of-words features and 0/1 target
cv = CountVectorizer()
X = cv.fit_transform(corpus).toarray()
y = pd.get_dummies((data['Category']))
y = y.iloc[:, 1].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=0)
Spam_clasfcn_model = MultinomialNB(alpha=.01)
Spam_clasfcn_model.partial_fit(X_train, y_train, classes=np.unique(y_train))
y_pred = Spam_clasfcn_model.predict(X_test)

# Button callback: classify the text currently in the Entry widget
def classify():
    lab = ['not spam', 'spam']
    x = cv.transform([spam.get()]).toarray()
    p = Spam_clasfcn_model.predict(x)
    a = int(p[0])  # single prediction: 0 or 1
    res = "This is " + lab[a]
    # Red for spam, green otherwise
    colour = "red" if lab[a] == 'spam' else "green"
    classification = Label(root, text=res, font=('helvetica', 15, 'bold'), fg=colour)
    classification.pack()
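One thing visible in the code above is that the training messages are cleaned (letters only, lowercased, stop words removed, lemmatized) while the text typed into the GUI goes to `cv.transform` raw. Whether this explains the constant "spam" result is not established by the post; the sketch below only shows one way to apply the same cleaning in both places. The function name `preprocess`, its parameters, and the tiny demo stop-word set are assumptions made for illustration, chosen so the example runs without NLTK data.

```python
import re


def preprocess(text, stop_words=frozenset(), lemmatize=lambda word: word):
    """Clean one message the same way the training loop cleans the corpus.

    stop_words and lemmatize are hypothetical parameters; in the question's
    script they would be the NLTK stop-word set and lemmatizer.lemmatize.
    """
    text = re.sub(r'[^a-zA-Z]', ' ', text)   # keep letters only
    words = text.lower().split()             # lowercase and tokenize
    return ' '.join(lemmatize(w) for w in words if w not in stop_words)


# Stand-in stop words for the demo only (not the NLTK list).
demo_stop_words = {'you', 'a', 'have'}
cleaned = preprocess("You have WON a prize!!! Call 0800 now", demo_stop_words)
print(cleaned)
```

In the callback this would correspond to something like `cv.transform([preprocess(spam.get(), set(stopwords.words('english')), lemmatizer.lemmatize)])`, so that the vectorizer sees input in the same form it was fitted on.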
My GUI code:
root = tk.Tk()
root.geometry('500x200')
root.maxsize(500,200)
root.minsize(500,200)
root.title('SPAM EMAIL CLASSIFIER')
root['bg'] = "grey15"
title = Label(root, text="SPAM EMAIL CLASSIFIER",
              font=('agencyFB',15,'bold'),fg="black")
title.place(x=90,y=5)
Label(root, text="Enter Mail to Classify",
      font=('agencyFB',10,'bold'), fg="black").place(x=50,y=50)
spam = Entry(root, width=50, bg="white", relief=GROOVE,
             borderwidth=2, border=2)
spam.place(x=50,y=80)
button = Button(root,text="Predict",
                font=('agencyFB',8,'bold'),bg="grey",fg="white",
                command=classify)
button.place(x=360,y=78)
root.mainloop()
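Note that the `classify` callback creates and packs a new `Label` on every click, so earlier results stay on screen. In a fixed 500x200 window this could make it look as if the result never changes; that is a guess about the symptom, not a confirmed diagnosis. A minimal sketch of the alternative, one result label updated in place, is shown below. `result_style`, `build_gui`, and the `predict` callable are hypothetical names introduced for the example, and the layout coordinates are arbitrary.

```python
def result_style(predicted_class):
    """Map a predicted class (0 or 1) to the label text and colour."""
    labels = ['not spam', 'spam']
    colours = ['green', 'red']
    idx = int(predicted_class)
    return "This is " + labels[idx], colours[idx]


def build_gui(predict):
    """Build a window with a single, reusable result label.

    predict is a hypothetical callable that takes the entered text
    and returns 0 (not spam) or 1 (spam).
    """
    # Imported here so result_style can be used without a display.
    import tkinter as tk

    root = tk.Tk()
    root.geometry('500x200')
    entry = tk.Entry(root, width=50)
    entry.place(x=50, y=80)
    result = tk.Label(root, font=('helvetica', 15, 'bold'))
    result.place(x=50, y=120)

    def classify():
        text, colour = result_style(predict(entry.get()))
        result.config(text=text, fg=colour)  # update in place, no new widget

    tk.Button(root, text="Predict", command=classify).place(x=360, y=78)
    return root

# To try it with a real predictor: build_gui(my_predict).mainloop()
```

Keeping the text-and-colour decision in a plain function also makes it easy to check the mapping without opening a window.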
Modules and packages used:
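# tkinter imports required by the GUI code (not listed in the original post)
import tkinter as tk
from tkinter import Label, Entry, Button, GROOVE
import pandas as pd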
import nltk
import re
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
from sklearn.feature_extraction.text import CountVectorizer
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score
Thanks in advance!
Solution