Using Keras with ProcessPoolExecutor gives "TypeError: can't pickle _thread.RLock objects" when retrieving result() from Future objects

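Problem description

I am trying to parallelize the k-fold validation example for the Boston housing price regression from Francois Chollet's book Deep Learning with Python. The example builds a small neural network for regression using Keras. The network is trained four times, each time on a different training set obtained by changing which samples are held out for validation. These are four independent training runs, which should be easy to parallelize, or so I thought.

I am running Python 3.7.3 under Anaconda in a Jupyter notebook, with a fairly recent Keras. I decided to try parallelizing with the concurrent.futures library.

I looked at other related StackOverflow questions, but they did not solve my problem: "Checkpointing keras model: TypeError: can't pickle _thread.lock objects", "can't pickle _thread.RLock objects using keras", etc.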

import numpy as np
import tensorflow as tf

boston_housing = tf.keras.datasets.boston_housing
models = tf.keras.models
layers = tf.keras.layers

# load data
(train_data, train_targets), (test_data, test_targets) = boston_housing.load_data()

# normalize the features (subtract mean and divide by standard deviation)
mean = train_data.mean(axis=0)
std = train_data.std(axis=0)
train_data -= mean
train_data /= std
test_data -= mean
test_data /= std

# model definition
def build_model():
    model = models.Sequential()
    model.add(layers.Dense(64, activation='relu', input_shape = (train_data.shape[1],)))
    model.add(layers.Dense(64, activation='relu'))
    model.add(layers.Dense(1))
    model.compile(optimizer='rmsprop', loss='mse', metrics=['mae'])
    return model

# build and train model on one fold of training/validation split
def build_and_train_model( fold, num_folds = 4, num_epochs = 100 ):
    num_val_samples = len(train_data) // num_folds
    start_idx = fold*num_val_samples
    end_idx = (fold+1)*num_val_samples
    # training data is everything outside the validation slice [start_idx, end_idx)
    partial_train_data = np.concatenate(
        [train_data[:start_idx], train_data[end_idx:]], axis = 0)
    partial_train_targets = np.concatenate(
        [train_targets[:start_idx], train_targets[end_idx:]], axis = 0)
    model = build_model()
    history = model.fit(partial_train_data, partial_train_targets, epochs=num_epochs, 
                        batch_size=1, verbose=0)
    return (model, history)

from concurrent.futures import ProcessPoolExecutor

k = 4
num_epochs = 10
futures = []
with ProcessPoolExecutor() as executor:
    for i in range(k):
        future = executor.submit(build_and_train_model, i, 
                                 num_folds = k, num_epochs = num_epochs )
        futures.append(future)
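    # collect results under a new name so the `models` module alias above is not shadowed
    trained_models = []
    histories = []
    for future in futures:
        model, history = future.result()
        trained_models.append(model)
        histories.append(history)

This is the error message: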

---------------------------------------------------------------------------
_RemoteTraceback                          Traceback (most recent call last)
_RemoteTraceback: 
"""
Traceback (most recent call last):
  File "//anaconda3/envs/tf/lib/python3.7/concurrent/futures/process.py", line 205, in _sendback_result
    exception=exception))
  File "//anaconda3/envs/tf/lib/python3.7/multiprocessing/queues.py", line 358, in put
    obj = _ForkingPickler.dumps(obj)
  File "//anaconda3/envs/tf/lib/python3.7/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
TypeError: can't pickle _thread.RLock objects
"""

The above exception was the direct cause of the following exception:

TypeError                                 Traceback (most recent call last)
<ipython-input-6-499a12f73b9e> in <module>
     13     histories = []
     14     for future in futures:
---> 15         model, history = future.result()
     16         models.append(models)
     17         histories.append(history)

//anaconda3/envs/tf/lib/python3.7/concurrent/futures/_base.py in result(self, timeout)
    433                 raise CancelledError()
    434             elif self._state == FINISHED:
--> 435                 return self.__get_result()
    436             else:
    437                 raise TimeoutError()

//anaconda3/envs/tf/lib/python3.7/concurrent/futures/_base.py in __get_result(self)
    382     def __get_result(self):
    383         if self._exception:
--> 384             raise self._exception
    385         else:
    386             return self._result

TypeError: can't pickle _thread.RLock objects

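Is there a way to fix this error? Or is there a different way to accomplish what I am trying to do, namely launching several training runs in parallel?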

Tags: python, keras, jupyter-notebook, concurrent.futures, tf.keras

Solution
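
One possible reading of the traceback: `_sendback_result` pickles whatever the worker function returns so that it can be sent back to the parent process, and the returned Keras model (and likely the `History` object, which keeps a reference to the model) holds lock objects that cannot be pickled. A workaround sketch, assuming the caller only needs the trained weights and the per-epoch metrics, is to return plain data from the worker, for example `model.get_weights()` and `history.history`. The snippet below illustrates the idea using only the standard library; `FakeModel`, `FakeHistory`, `can_pickle`, and `to_picklable_result` are hypothetical names, and the two stand-in classes are used in place of real Keras objects.

```python
import pickle
import threading


class FakeModel:
    """Hypothetical stand-in for a Keras model: it holds a lock, so it cannot be pickled."""

    def __init__(self):
        self._lock = threading.RLock()
        self._weights = [[0.1, 0.2], [0.3]]

    def get_weights(self):
        # A real Keras model returns a list of NumPy arrays here.
        return self._weights


class FakeHistory:
    """Hypothetical stand-in for the History object returned by model.fit()."""

    def __init__(self, model):
        self.model = model  # reference back to the unpicklable model
        self.history = {"loss": [4.0, 2.5], "mae": [1.6, 1.2]}


def can_pickle(obj):
    """Return True if obj can be pickled, which a worker's return value must be."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, AttributeError, pickle.PicklingError):
        return False


def to_picklable_result(model, history):
    """Reduce (model, history) to plain data that can cross a process boundary."""
    return model.get_weights(), history.history


model = FakeModel()
history = FakeHistory(model)

print(can_pickle((model, history)))                     # False: the lock blocks pickling
print(can_pickle(to_picklable_result(model, history)))  # True: only lists and dicts
```

Applied to the question's code, this would mean ending `build_and_train_model` with `return model.get_weights(), history.history`. If the parent process needs a model object, it could call `build_model()` and then `set_weights(...)` with the returned weights. This sketch has not been run against the asker's environment.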

