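python - How do I use custom transformers in an sklearn pipeline with Flask?
Problem description
I can't use an sklearn pipeline that contains custom transformers in Flask after saving it with pickle.dump().
Suppose I want to serve the following pipeline in Flask from a pickle file. It selects variables, creates dummy variables, and then selects the variables that go into the model.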
import pickle
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Custom transformer that extracts the columns passed to its constructor
class FeatureSelector(BaseEstimator, TransformerMixin):
    # Class constructor
    def __init__(self, feature_names):
        self._feature_names = feature_names

    # Nothing to learn here, so just return self
    def fit(self, X, y=None):
        return self

    # Return only the requested columns
    def transform(self, X, y=None):
        return X[self._feature_names]

# Defining the steps in the categorical pipeline
categorical_features = ['marital', 'contact', 'job']

# Converts certain features to binary
class CategoricalBinary(TransformerMixin):
    # Class constructor
    #def __init__(self):

    # Nothing to learn here, so just return self
    def fit(self, X, y=None):
        return self

    # Apply the transformation with the get_dummies function
    def transform(self, X, y=None):
        X = pd.get_dummies(X, columns=X.columns.tolist())
        return X

# Custom transformer that extracts the columns passed to its constructor
class ModelFeatureSelector(BaseEstimator, TransformerMixin):
    # Class constructor
    def __init__(self, feature_names):
        self._feature_names = feature_names

    # Nothing to learn here, so just return self
    def fit(self, X, y=None):
        return self

    # Return only the columns that go into the model
    def transform(self, X, y=None):
        return X[self._feature_names]

model_features = ['marital_divorced', 'marital_married', 'marital_single',
                  'contact_cellular', 'job_admin.', 'job_blue-collar',
                  'job_entrepreneur', 'job_housemaid', 'job_management',
                  'job_retired', 'job_self-employed', 'job_services',
                  'job_student', 'job_technician', 'job_unemployed']

categorical_transform = Pipeline(steps=[
    ('feature_selector', FeatureSelector(categorical_features)),
    ('categorical_dummy', CategoricalBinary()),
    ('model_features', ModelFeatureSelector(model_features)),
    ('logreg', LogisticRegression(class_weight='balanced', solver='liblinear')),
])

# Fit the pipeline
categorical_transform.fit(X_train, y_train)

# Save it to a pickle file
with open('categorical_transform.pkl', 'wb') as f:
    pickle.dump(categorical_transform, f)
I wrote the following Flask app:
# Import the required classes from the `flask` package
from flask import Flask, request, jsonify
# Import the `os` package for interacting with the system, `pandas` for data manipulation,
# and `pickle` for loading our model.
import os
import pandas as pd
import pickle
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.impute import SimpleImputer
import outros

# Create the application
app = Flask(__name__)

# Model
model = pickle.load(open('categorical_transform.pkl', 'rb'))

# TASK: fill in the request method
@app.route('/predict', methods=['POST'])
def predictor():
    # Read the body of the POST request as JSON
    content = request.json
    features = pd.DataFrame([content])
    # Run the model prediction
    predito = model.predict(features)
    # Build and send a response to the caller of the API
    return jsonify(status='completed', predict=float(predito[1][1]))

# Bind the application to all network interfaces (IP "0.0.0.0") and use the port from
# the PORT environment variable or, if it is not set, port 8080.
if __name__ == '__main__':
    app.run(debug=True, host='0.0.0.0', port=int(os.environ.get('PORT', 8080)))
This raises the following exception:
AttributeError: Can't get attribute 'FeatureSelector'
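Solution
You defined the custom transformer FeatureSelector in some Python script. In the Flask script you import that class's parent classes, but not FeatureSelector itself. The Flask script therefore has no definition of the custom class, so pickle cannot reconstruct the object.
Suppose FeatureSelector is defined in custom_classes.py. Then your Flask script should contain the line: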
from custom_classes import FeatureSelector
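
To see why the import matters, here is a minimal sketch of the mechanism rather than the asker's actual setup. It assumes a throwaway custom_classes.py written to a temporary directory and a plain stand-in class instead of a real sklearn transformer; the names module_dir, payload and saved_class are only illustrative. It shows that a pickle stores a reference to the class (module name plus class name) together with the instance state, so loading fails when the class cannot be found under that name.

```python
import importlib
import pickle
import sys
import tempfile
from pathlib import Path

# Hypothetical shared module, written to a temp directory so the sketch is self-contained.
module_dir = Path(tempfile.mkdtemp())
(module_dir / "custom_classes.py").write_text(
    "class FeatureSelector:\n"
    "    def __init__(self, feature_names):\n"
    "        self._feature_names = feature_names\n"
)
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()
custom_classes = importlib.import_module("custom_classes")

payload = pickle.dumps(custom_classes.FeatureSelector(["marital", "contact", "job"]))

# The pickle holds the module name, the class name and the instance state,
# but not the body of the class.
has_module_name = b"custom_classes" in payload
has_class_name = b"FeatureSelector" in payload

# Simulate a process in which the class definition is missing.
saved_class = custom_classes.FeatureSelector
del custom_classes.FeatureSelector
try:
    pickle.loads(payload)
    load_error = None
except AttributeError as exc:
    load_error = exc  # same kind of error as in the question

# With the definition back under the same module and name, loading succeeds.
custom_classes.FeatureSelector = saved_class
restored = pickle.loads(payload)
```

The same reasoning applies to every custom class stored in the pipeline, so CategoricalBinary and ModelFeatureSelector need to be importable as well. If the training script defined the classes itself, the pickle most likely refers to them under the module name `__main__`, in which case importing them into the Flask script only helps while that script is itself run as the main program. Importing the classes from a shared module in both the training script and the Flask script avoids that dependency.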