Keras multi-class classifier for chatbot answers

Problem description

I implemented a chatbot in Python. It is trained on an "intents" dataset, a JSON file of this form:

{"intents": [
    {"tag": "greeting",
     "patterns": ["Hi there", "How are you", "Is anyone there?", "Hey", "Hola", "Hello", "Good day"],
     "responses": ["Hello, thanks for asking", "Good to see you again", "Hi there, how can I help?"]
    },
    {"tag": "goodbye",
     "patterns": ["Bye", "See you later", "Goodbye", "Nice chatting to you, bye", "Till next time"],
     "responses": ["See you!", "Have a nice day", "Bye! Come back again soon."]
    },
    {"tag": "thanks",
     "patterns": ["Thanks", "Thank you", "That's helpful", "Awesome, thanks", "Thanks for helping me"],
     "responses": ["Happy to help!", "Any time!", "My pleasure"]
    },
    {"tag": "noanswer",
     "patterns": [],
     "responses": ["Sorry, can't understand you", "Please give me more info", "Not sure I understand"],
     .
     .
     .

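Here each tag is the category of a group of user questions (the patterns), together with the possible responses for that category. Before training, the dataset is transformed by tokenizing each pattern into words and then lemmatizing them. The training set therefore consists of patterns with their associated labels (the tags), where each pattern is represented as a Bag of Words and each tag is one-hot encoded. The model is defined as follows: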

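# imports assumed by the snippet (not shown in the original question)
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import SGD

# x_train (Bag of Words matrix) and classes (list of tags) come from the preprocessing step
model = Sequential()
model.add(Dense(128, input_shape=(x_train.shape[1],), activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
# one output unit per tag; softmax turns the outputs into a probability distribution
model.add(Dense(len(classes), activation="softmax"))
# set the optimizer (recent Keras versions spell lr as learning_rate and no longer accept decay)
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
# compile the model
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])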

The model was trained for 500 epochs with a batch size of 16.
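The Bag of Words encoding described in the question can be sketched roughly as follows. This is a minimal illustration, not the asker's actual preprocessing: `tokenize`, `build_vocabulary`, and `bag_of_words` are hypothetical helpers, the tokenizer is a plain whitespace split with no lemmatization, and words outside the vocabulary are assumed to be silently dropped.

```python
def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation.
    Simplified stand-in for the tokenizer and lemmatizer in the question."""
    words = (w.strip("?!.,").lower() for w in text.split())
    return [w for w in words if w]


def build_vocabulary(patterns):
    """Sorted list of every distinct word seen in the training patterns."""
    return sorted({w for p in patterns for w in tokenize(p)})


def bag_of_words(text, vocabulary):
    """Binary vector: 1 where a vocabulary word occurs in the text, else 0.
    Words that are not in the vocabulary are ignored (assumption)."""
    tokens = set(tokenize(text))
    return [1 if w in tokens else 0 for w in vocabulary]


patterns = ["Hi there", "How are you", "Bye", "See you later", "Thank you"]
vocabulary = build_vocabulary(patterns)

known = bag_of_words("Hi, how are you?", vocabulary)
unknown = bag_of_words("fejfeajlflnk", vocabulary)
```

Under these assumptions, `known` has a 1 for each of its four vocabulary words, while `unknown` contains only zeros because the string shares no word with the training patterns.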

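Classification works well: the model assigns the correct tag to questions it has not seen before. If the predicted probability is above 0.75, the predicted tag is returned; otherwise the tag "noanswer" is returned.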
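The 0.75 fallback rule can be sketched as below. `pick_tag` is a hypothetical name and the class list is made up for illustration; the question does not show how the rule is actually implemented.

```python
import numpy as np


def pick_tag(probabilities, classes, threshold=0.75, fallback="noanswer"):
    """Return the most probable tag, or the fallback tag when the
    top probability does not exceed the threshold."""
    probabilities = np.asarray(probabilities, dtype=float)
    best = int(np.argmax(probabilities))
    if probabilities[best] > threshold:
        return classes[best]
    return fallback


classes = ["greeting", "goodbye", "thanks"]
confident = pick_tag([0.90, 0.05, 0.05], classes)  # top probability above 0.75
uncertain = pick_tag([0.50, 0.30, 0.20], classes)  # top probability below 0.75
```

In a real pipeline the probabilities would come from something like `model.predict(...)[0]` on the encoded input.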

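The problem appears when I ask the chatbot a deliberately meaningless question, typing a random string such as "fejfeajlflnk" to check when the "noanswer" tag is returned (that is, when the predicted probability falls below 0.75). The prediction is always the class associated with the "greeting" tag, with a high probability (from 0.8 to 0.99), and I cannot understand why. Can anyone help me understand why the classifier behaves this way?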
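One plausible explanation, assuming the Bag of Words step simply drops words that are not in the vocabulary (the question does not confirm this): every random string is encoded as the same all-zero vector, so the network returns the same output each time, determined only by its bias terms. Nothing in training forces that output to be low-confidence. The NumPy sketch below mirrors the 128-64-softmax architecture with random, untrained weights (an assumption made purely for illustration) and checks that the prediction for an all-zero input does not depend on the first weight matrix at all.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()


def forward(x, params):
    """Inference pass of a Dense-ReLU, Dense-ReLU, Dense-softmax network.
    Dropout is omitted because it is inactive at prediction time."""
    (w1, b1), (w2, b2), (w3, b3) = params
    h1 = relu(x @ w1 + b1)
    h2 = relu(h1 @ w2 + b2)
    return softmax(h2 @ w3 + b3)


rng = np.random.default_rng(0)
n_words, n_classes = 20, 4  # hypothetical vocabulary and tag counts
w1, b1 = rng.normal(size=(n_words, 128)), rng.normal(size=128)
w2, b2 = rng.normal(size=(128, 64)), rng.normal(size=64)
w3, b3 = rng.normal(size=(64, n_classes)), rng.normal(size=n_classes)
params = [(w1, b1), (w2, b2), (w3, b3)]

# Any string made only of unknown words is encoded as all zeros.
zero_input = np.zeros(n_words)
probs = forward(zero_input, params)

# Same result computed from the biases alone: w1 never contributes.
bias_only = softmax(relu(relu(b1) @ w2 + b2) @ w3 + b3)
```

If this is what happens in the chatbot, the "greeting" prediction is just the network's constant response to an empty input. A common remedy under that assumption is to return "noanswer" directly when the encoded vector contains no known word, before calling the model.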

Tags: python, machine-learning, keras, nlp, chatbot

Solution


If this has not been resolved yet:

Find the Error_Threshold field and change it to 0.01.
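The answer does not show where Error_Threshold is used. In chatbot code of this kind, a constant with that name often filters out classes whose predicted probability is too low before ranking the rest. The sketch below assumes that usage; `rank_predictions` and the sample probabilities are hypothetical.

```python
ERROR_THRESHOLD = 0.01  # value suggested in the answer


def rank_predictions(probabilities, classes, threshold=ERROR_THRESHOLD):
    """Keep classes whose probability exceeds the threshold,
    sorted from most to least probable."""
    kept = [(classes[i], float(p))
            for i, p in enumerate(probabilities) if p > threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept


classes = ["greeting", "goodbye", "thanks", "noanswer"]
probabilities = [0.90, 0.06, 0.035, 0.005]

loose = rank_predictions(probabilities, classes)                   # threshold 0.01
strict = rank_predictions(probabilities, classes, threshold=0.25)  # a stricter cutoff
```

Under this assumption, lowering the threshold only lets more candidate classes through; it does not change the probability assigned to the top class.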

