SVC Classifier to Keras CNN with probabilities or confidence to distinguish untrained classes

Problem Description

This question is pretty similar to this one and based on this post on GitHub, in the sense that I am trying to convert an SVM multiclass classification model (e.g., using sklearn) to a Keras model.

Specifically, I am looking for a way of retrieving probabilities (similar to SVC with probability=True) or a confidence value at the end, so that I can define some sort of threshold and distinguish between trained classes and untrained ones. That is, if I train my model with 3 or 4 classes but then present it with a 5th class it wasn't trained on, it will still output some prediction, even if totally wrong. I want to avoid that in some way.
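For illustration, here is a minimal sketch of the kind of thresholding I mean; the probability values and the 0.8 cutoff are hypothetical, and the cutoff would have to be tuned on held-out data:

  import numpy as np

  # hypothetical softmax outputs for 4 samples over 3 trained classes
  probs = np.array([[0.95, 0.03, 0.02],
                    [0.40, 0.35, 0.25],   # low confidence: possibly an unseen class
                    [0.10, 0.85, 0.05],
                    [0.34, 0.33, 0.33]])  # near-uniform: possibly an unseen class

  THRESHOLD = 0.8                           # hypothetical cutoff
  pred = probs.argmax(axis=1)               # most likely trained class per sample
  pred[probs.max(axis=1) < THRESHOLD] = -1  # -1 flags "none of the trained classes"
  print(pred)                               # -> [ 0 -1  1 -1]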

I got the following working reasonably well, but it relies on picking the maximum value at the end (argmax), which I would like to avoid:

  import keras
  from keras.models import Sequential
  from keras.layers import Dense, Activation
  from keras import regularizers

  model = Sequential()
  model.add(Dense(30, input_shape=(30,), activation='relu', kernel_initializer='he_uniform'))
  # output layer: one unit per class
  model.add(Dense(3, kernel_regularizer=regularizers.l2(0.1)))
  # the activation is linear by default, which works; with softmax the accuracy
  # gets stuck at 33% when targeting 3 classes, or 25% when targeting 4
  #model.add(Activation('softmax'))
  model.compile(loss='categorical_hinge', optimizer=keras.optimizers.Adam(lr=1e-3), metrics=['accuracy'])

Any ideas on how to tackle this untrained-class problem? Something like Platt scaling or temperature scaling would work, as long as I can still export the model to ONNX afterwards.
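For reference, a minimal sketch of the temperature-scaling variant I mean; the logits and the temperature T here are hypothetical, and in practice T would be fit on a held-out validation set by minimizing the negative log-likelihood:

  import numpy as np

  def softmax(z):
      z = z - z.max(axis=1, keepdims=True)  # shift for numerical stability
      e = np.exp(z)
      return e / e.sum(axis=1, keepdims=True)

  # logits: the raw linear outputs of the model, shape (n_samples, n_classes)
  logits = np.array([[4.0, 1.0, 0.5],
                     [2.0, 1.8, 1.9]])

  T = 2.0                        # hypothetical temperature; T > 1 softens overconfidence
  calibrated = softmax(logits / T)
  print(calibrated.max(axis=1))  # calibrated confidences to threshold against

Since the division by T is a single extra operation before the softmax, it could also be folded into the model itself before exporting to ONNX.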

Tags: python, tensorflow, machine-learning, keras, svm

Solution


As I suspected, the fix was to scale the model's features (inputs), which is what gets softmax to work. No stop-gradient tricks or anything like that are needed. I had specifically been using very large raw values which, even though the model trained well, prevented softmax (logistic regression) from working properly. The features can be scaled, for example, with the following code:

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
X_std = scaler.fit_transform(X)  # standardize features to zero mean, unit variance
# at inference time, reuse the same fitted scaler: scaler.transform(X_new)

By doing this (and re-enabling the softmax activation), the SVM-like Keras model outputs probabilities as originally intended.
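Putting it together, a minimal end-to-end sketch; the random data is synthetic and only illustrates the wiring, while the layer sizes, regularizer, and hinge loss are kept from the question, with softmax re-enabled now that the inputs are standardized:

import numpy as np
import keras
from keras.models import Sequential
from keras.layers import Dense
from keras import regularizers
from sklearn.preprocessing import StandardScaler

# synthetic stand-in data: 300 samples, 30 deliberately large-scale features, 3 classes
X = np.random.randn(300, 30) * 1000.0
y = keras.utils.to_categorical(np.random.randint(0, 3, 300), 3)

scaler = StandardScaler()
X_std = scaler.fit_transform(X)  # the scaling step that lets softmax train

model = Sequential()
model.add(Dense(30, input_shape=(30,), activation='relu', kernel_initializer='he_uniform'))
model.add(Dense(3, kernel_regularizer=regularizers.l2(0.1), activation='softmax'))
model.compile(loss='categorical_hinge', optimizer=keras.optimizers.Adam(lr=1e-3),
              metrics=['accuracy'])
model.fit(X_std, y, epochs=10, batch_size=32, verbose=0)

probs = model.predict(scaler.transform(X))  # reuse the fitted scaler at inference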

