Custom ensemble of scikit-learn models raises NotFittedError after fitting with multiprocessing

Problem description

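I am trying to build an ensemble model and train it with the multiprocessing module. Can someone explain why this raises a NotFittedError? Here is a reproducible example I put together to show what happens: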

import multiprocessing as mp
import numpy as np
from sklearn.ensemble import RandomForestClassifier

X = [np.random.normal(size=10) for _ in range(10)]
Y = [np.random.choice([0,1]) for _ in range(10)]

class ensemble(object):
    def __init__(self, num_models):
        self._models = [RandomForestClassifier() for _ in range(num_models)]

    def train(self, training_data, training_labels):
        _fit = lambda model, data, labels: model.fit(data, labels)
        processes = [mp.Process(target=_fit, args=(self._models[index],
                                                   training_data,
                                                   training_labels)) for index in range(len(self._models))]
        [p.start() for p in processes]
        [p.join() for p in processes]

        # try to predict with one of the models in the ensemble
        self._models[0].predict(training_data)

e = ensemble(num_models=2)
e.train(X, Y)

Tags: python, numpy, machine-learning, scikit-learn, multiprocessing

Solution
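
A likely explanation, offered as a sketch rather than a definitive answer: each mp.Process runs in a separate process and works on its own copy of its arguments. model.fit therefore updates the copy that lives in the child, and the RandomForestClassifier objects stored in self._models in the parent are never fitted, so calling predict on them in the parent raises NotFittedError. One way around this is to send the fitted models back to the parent, for example with multiprocessing.Pool.map, which pickles each worker's return value and hands it back. The names fit_model, Ensemble and main below are illustrative and do not come from the original post. The worker is a module-level function because a lambda cannot be pickled when the start method is spawn.

```python
import multiprocessing as mp

import numpy as np
from sklearn.ensemble import RandomForestClassifier


def fit_model(job):
    """Fit one model inside a worker process and return the fitted copy."""
    model, data, labels = job
    # fit() returns the estimator itself, so the fitted copy is what
    # gets pickled and sent back to the parent process.
    return model.fit(data, labels)


class Ensemble:
    def __init__(self, num_models):
        self._models = [RandomForestClassifier() for _ in range(num_models)]

    def train(self, training_data, training_labels):
        jobs = [(model, training_data, training_labels) for model in self._models]
        with mp.Pool(processes=len(self._models)) as pool:
            # Replace the unfitted models with the fitted copies
            # returned by the workers.
            self._models = pool.map(fit_model, jobs)

    def predict_first(self, data):
        # Mirrors the original example: predict with one model of the ensemble.
        return self._models[0].predict(data)


def main():
    X = [np.random.normal(size=10) for _ in range(10)]
    Y = [np.random.choice([0, 1]) for _ in range(10)]
    e = Ensemble(num_models=2)
    e.train(X, Y)
    print(e.predict_first(X))
```

Call main() under an `if __name__ == "__main__":` guard, which multiprocessing needs on platforms that start workers with spawn rather than fork. If the goal is only to speed up training, RandomForestClassifier also accepts an n_jobs parameter that parallelizes a single forest without any explicit process handling.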

