How to fix the "Segmentation fault (core dumped)" error in Keras

Problem description

I am having a problem with Keras. Whenever I try to fit a model that contains a Conv2D layer, it crashes with "Segmentation fault (core dumped)".

My code works on the CPU. It also works without any Conv2D layers (even though that is not valid for my use case). I have CUDA, cuDNN and TensorFlow installed, and I have already tried reinstalling Keras and TensorFlow.
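One quick way to pin such a run to the CPU (a minimal sketch, not necessarily how the original poster did it) is to hide the CUDA devices from TensorFlow before it is imported; if the model then fits without crashing, the problem is in the GPU/cuDNN path:

import os

# Hide all CUDA devices so TensorFlow falls back to the CPU.
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

import tensorflow as tf  # must be imported after setting the variable

print(tf.test.is_gpu_available())  # prints False when the GPU is hidden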

Code:

import numpy as np
from keras.models import Sequential
from keras.layers import Conv2D, Flatten, Dense

# swisher (a custom activation), env() and env_size() are helpers defined
# elsewhere in the project and are not shown here.

def model_build():
    model = Sequential()
    model.add(Conv2D(input_shape=(env_size()[0], env_size()[1], 1), filters=4, kernel_size=(3,3), strides=1, activation=swisher))
    model.add(Conv2D(filters=4, kernel_size=(5,5), strides=1, activation=swisher))
    model.add(Conv2D(filters=4, kernel_size=(5,5), strides=1, activation=swisher))
    model.add(Conv2D(filters=4, kernel_size=(5,5), strides=1, activation=swisher))
    model.add(Flatten())
    model.add(Dense(128, activation='softmax'))
    model.add(Dense(4, activation='softmax'))
    return model

if __name__ == '__main__':
    y = model_build()
    y.compile(loss="mean_squared_error", optimizer='adam')
    y.fit(x=env(), y=np.array([[0,0,0,0]]))

Error:

Using TensorFlow backend.
Epoch 1/1
2019-03-27 05:52:27.687323: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-03-27 05:52:27.789975: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:964] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-03-27 05:52:27.790819: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1411] Found device 0 with properties:
name: GeForce RTX 2060 major: 7 minor: 5 memoryClockRate(GHz): 1.83
pciBusID: 0000:01:00.0
totalMemory: 5.73GiB freeMemory: 5.40GiB
2019-03-27 05:52:27.790834: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1490] Adding visible gpu devices: 0
2019-03-27 05:52:28.068080: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-03-27 05:52:28.068115: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977]      0
2019-03-27 05:52:28.068121: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0:   N
2019-03-27 05:52:28.068487: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1103] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5147 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2060, pci bus id: 0000:01:00.0, compute capability: 7.5)
2019-03-27 05:52:28.177752: W tensorflow/core/framework/allocator.cc:113] Allocation of 518619136 exceeds 10% of system memory.
2019-03-27 05:52:28.337277: W tensorflow/core/framework/allocator.cc:113] Allocation of 518619136 exceeds 10% of system memory.
2019-03-27 05:52:28.500486: W tensorflow/core/framework/allocator.cc:113] Allocation of 518619136 exceeds 10% of system memory.
2019-03-27 05:52:28.586280: W tensorflow/core/framework/allocator.cc:113] Allocation of 518619136 exceeds 10% of system memory.
2019-03-27 05:52:28.675738: W tensorflow/core/framework/allocator.cc:113] Allocation of 518619136 exceeds 10% of system memory.
Segmentation fault (core dumped)

Edit:

A self-contained example:

import numpy as np
import keras

model = keras.models.Sequential() #Sequential model type.
model.add(keras.layers.Conv2D(filters=1, kernel_size=(3,3), strides = 1, activation="sigmoid")) #Convolutional layer.
model.add(keras.layers.Flatten()) #Flatten layer.
model.add(keras.layers.Dense(4)) #Dense layer of 4 units.
model.compile(loss='mean_squared_error', optimizer='adam') #compile model.
y = np.random.rand(1,4) #Random expected output
x = np.random.rand(1, 38, 21, 1) # Random input.
model.fit(x, y) #And fit...

Edit 2:

Keras version: 'v2.1.6-tf'

TensorFlow-GPU version: 'v1.12'

Python version: 'v3.5.2'

CUDA version: 'v9.0.176'

cuDNN version: 'v7.2.1.38-1+cuda9.0'

Ubuntu version: 'v16.04'

Tags: python, tensorflow, keras

Solution


It looks like your GPU is running out of memory. Your model does not seem particularly large, so my guess is that the problem comes from this line:

y.fit(x=env(), y=np.array([[0,0,0,0]]))

The output of env() may simply be too large to fit in your GPU's memory.
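One way to test this (a sketch, not part of the original answer; the random array below merely stands in for whatever env() returns) is to check how large the input actually is, let TensorFlow claim GPU memory on demand instead of all at once, and feed the data in smaller batches:

import numpy as np
import tensorflow as tf
import keras.backend as K

# With TF 1.x, allow_growth makes TensorFlow allocate GPU memory gradually
# instead of pre-allocating almost all of it when the session is created.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
K.set_session(tf.Session(config=config))

# Placeholder standing in for the output of env(); the shape is made up here.
x = np.random.rand(256, 38, 21, 1).astype(np.float32)
print("input size: %.1f MiB" % (x.nbytes / 2**20))  # rough memory footprint

# Passing a smaller batch_size keeps each training step's allocation small:
# model.fit(x, y, batch_size=32)

If the footprint printed here comes anywhere near the ~5.4 GiB of free GPU memory shown in the log, reducing the batch size or downsampling the input returned by env() is the usual fix.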

