TensorFlow Keras error when I try to load and use a pretrained model

Problem description

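As the title says, I want to use a pretrained model, but only as a feature extractor. My idea is to build a new model and load the weights into it: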

    from tensorflow.keras.models import Model

    model = get_tiny_darknet_model()  # build the model structure
    transfer_layer = model.get_layer('global_average_pooling2d')
    feature_extractor = Model(inputs=model.input,
                              outputs=transfer_layer.output)
    feature_extractor.load_weights('checkpoints\\checkpoint.h5', by_name=True)

But when I try to run predictions with this feature extractor, I get an error.

    train_data = raw_preprocess(train_data)  # uses an ImageDataGenerator to preprocess the data
    train_samples = feature_extractor.predict(train_data)

After I run this code, the following error appears:

2020-06-12 16:22:09.409465: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-06-12 16:22:10.830022: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
2020-06-12 16:22:10.830519: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
2020-06-12 16:22:10.830794: W tensorflow/core/common_runtime/base_collective_executor.cc:216] BaseCollectiveExecutor::StartAbort Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[{{node conv2d/Conv2D}}]]
Traceback (most recent call last):
  File "E:/Studium/Thesis/loss_test.py", line 35, in <module>
    test_mmd()
  File "E:/Studium/Thesis/loss_test.py", line 24, in test_mmd
    c_train_samples = feature_extractor.predict(train_data)
  File "C:\ProgramData\Miniconda3\envs\TF_2G\lib\site-packages\tensorflow_core\python\keras\engine\training.py", line 909, in predict
    use_multiprocessing=use_multiprocessing)
  File "C:\ProgramData\Miniconda3\envs\TF_2G\lib\site-packages\tensorflow_core\python\keras\engine\training_generator.py", line 648, in predict
    use_multiprocessing=use_multiprocessing)
  File "C:\ProgramData\Miniconda3\envs\TF_2G\lib\site-packages\tensorflow_core\python\keras\engine\training_generator.py", line 265, in model_iteration
    batch_outs = batch_function(*batch_data)
  File "C:\ProgramData\Miniconda3\envs\TF_2G\lib\site-packages\tensorflow_core\python\keras\engine\training_generator.py", line 535, in predict_on_batch
    return model.predict_on_batch(x)
  File "C:\ProgramData\Miniconda3\envs\TF_2G\lib\site-packages\tensorflow_core\python\keras\engine\training.py", line 1157, in predict_on_batch
    outputs = self.predict_function(inputs)
  File "C:\ProgramData\Miniconda3\envs\TF_2G\lib\site-packages\tensorflow_core\python\keras\backend.py", line 3740, in __call__
    outputs = self._graph_fn(*converted_inputs)
  File "C:\ProgramData\Miniconda3\envs\TF_2G\lib\site-packages\tensorflow_core\python\eager\function.py", line 1081, in __call__
    return self._call_impl(args, kwargs)
  File "C:\ProgramData\Miniconda3\envs\TF_2G\lib\site-packages\tensorflow_core\python\eager\function.py", line 1121, in _call_impl
    return self._call_flat(args, self.captured_inputs, cancellation_manager)
  File "C:\ProgramData\Miniconda3\envs\TF_2G\lib\site-packages\tensorflow_core\python\eager\function.py", line 1224, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager)
  File "C:\ProgramData\Miniconda3\envs\TF_2G\lib\site-packages\tensorflow_core\python\eager\function.py", line 511, in call
    ctx=ctx)
  File "C:\ProgramData\Miniconda3\envs\TF_2G\lib\site-packages\tensorflow_core\python\eager\execute.py", line 67, in quick_execute
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.UnknownError:  Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[node conv2d/Conv2D (defined at C:\ProgramData\Miniconda3\envs\TF_2G\lib\site-packages\tensorflow_core\python\framework\ops.py:1751) ]] [Op:__inference_keras_scratch_graph_2495]

Function call stack:
keras_scratch_graph


Process finished with exit code 1

However, if I simply load the model and use it directly, for example:

    from tensorflow.keras.models import load_model

    model = load_model('checkpoints\\checkpoint.h5')
    train_data = raw_preprocess(train_data)
    pred = model.predict(train_data)

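then everything works and no error occurs.
What could be the cause, and how can I fix it?
I also checked the output at runtime and saw something that looks like a warning: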

2020-06-12 21:18:36.913051: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2

I tried loading the original model again, and a new problem appeared:

WARNING:tensorflow:Error in loading the saved optimizer state. As a result, your model is starting with a freshly initialized optimizer.
WARNING:tensorflow:sample_weight modes were coerced from
  ...
    to  
  ['...']

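This is a new problem that I have never seen before when loading a model. What could be causing it, and how can I fix it? I tried setting up a new conda environment, but it did not help.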

Tags: python, python-3.x, tensorflow, keras, tensorflow2.0

Solution


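If you are using IPython (Spyder or Jupyter), this often happens when you forget to properly shut down an old kernel that was running TensorFlow. A stale kernel typically keeps its GPU memory reserved, which would be consistent with the CUDNN_STATUS_ALLOC_FAILED lines in the log: the new process cannot allocate the memory cuDNN needs to initialize.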
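Beyond shutting down the old kernel, a commonly used mitigation for this kind of allocation failure is to let TensorFlow allocate GPU memory on demand instead of reserving almost all of it at startup. This is not part of the original answer; it is a minimal sketch that assumes TensorFlow 2.x, and the helper name `enable_gpu_memory_growth` is hypothetical. It has to run before any GPU operation is executed in the process.

```python
import os

# Assumption: setting this before TensorFlow is imported asks it to grow
# GPU memory on demand rather than grabbing nearly all of it up front.
os.environ.setdefault("TF_FORCE_GPU_ALLOW_GROWTH", "true")


def enable_gpu_memory_growth():
    """Turn on memory growth for every visible GPU.

    Returns the number of GPUs configured (0 if TensorFlow is not
    installed or no GPU is visible). Hypothetical helper, not from the post.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return 0

    # TF 2.0 only exposes the experimental name; newer releases also offer
    # tf.config.list_physical_devices, so prefer that when it exists.
    list_devices = getattr(tf.config, "list_physical_devices", None)
    if list_devices is None:
        list_devices = tf.config.experimental.list_physical_devices

    gpus = list_devices("GPU")
    for gpu in gpus:
        # Must be called before the GPU has been initialized by any op.
        tf.config.experimental.set_memory_growth(gpu, True)
    return len(gpus)


configured = enable_gpu_memory_growth()
print("GPUs configured for memory growth:", configured)
```

Call this at the very top of the script, before building `feature_extractor`. If another process (for example a forgotten Jupyter kernel) still holds most of the GPU memory, memory growth alone may not be enough and that process still needs to be closed.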

