Does Keras deep-copy the data when fitting a model?

Problem description

When I run my model (a U-Net for image segmentation), RAM memory warnings pop up:

2020-11-19 11:25:18.027748: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 11998593024 exceeds 10% of free system memory.
2020-11-19 11:25:32.991088: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 11998593024 exceeds 10% of free system memory.
2020-11-19 11:25:46.109554: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 11998593024 exceeds 10% of free system memory.

Plot of allocated memory:


I would like to know whether TensorFlow is deep-copying the data, and if so, whether there is a way to avoid it (without using a DataGenerator).

Main script:

from data_preprocessing import data_utils, DataGenerator
from model import model_utils, loss_utils
from keras.callbacks import ModelCheckpoint, LearningRateScheduler
from sklearn.model_selection import train_test_split
import tensorflow as tf

if __name__ == "__main__":
    # Loads the whole dataset into memory as NumPy arrays
    X, Y = data_utils.load_all()
    print("Checkpoint 1")
    strategy = tf.distribute.MirroredStrategy()
    with strategy.scope():
        Xtrain, Xtest, Ytrain, Ytest = train_test_split(X, Y, test_size=1/5, shuffle=True)
        print("Checkpoint 2")
        unet = model_utils.unet(input_size=(256, 256, 1))
        print("Checkpoint 3")
        checkpointer = ModelCheckpoint('image_segm.hdf5', monitor='loss', verbose=1, save_best_only=True)
        # The allocation warnings appear during this call
        historic = unet.fit(Xtrain, Ytrain, epochs=1, callbacks=[checkpointer], batch_size=5)
        print("End")

Edit: using tensorflow-gpu 2.2.0 in a conda environment.

Tags: python, tensorflow, memory, keras, deep-learning

Solution


Take a look at this article; it will help you with the data-generator approach: https://stanford.edu/~shervine/blog/keras-how-to-generate-data-on-the-fly
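For context: each warning reports a single allocation of 11,998,593,024 bytes, which is exactly 45,771 × 256 × 256 × 4 bytes, i.e. one float32 copy of roughly 45.8k single-channel 256×256 arrays. That is consistent with fit() converting the full NumPy array into a tensor in one go, which is effectively the deep copy you are asking about. A Sequence-style generator, as described in the article, hands fit() one batch at a time, so only batch-sized slices ever get copied. Below is a minimal sketch of that pattern, assuming X and Y are the in-memory NumPy arrays from the question; the class name and parameters are illustrative, not taken from the article:

import numpy as np
import tensorflow as tf

class SegmentationSequence(tf.keras.utils.Sequence):
    """Yields (image, mask) batches by slicing the arrays on demand,
    so the full dataset is never converted into one giant tensor."""
    def __init__(self, images, masks, batch_size=5, shuffle=True):
        self.images = images
        self.masks = masks
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.indices = np.arange(len(images))
        self.on_epoch_end()

    def __len__(self):
        # Number of batches per epoch
        return int(np.ceil(len(self.images) / self.batch_size))

    def __getitem__(self, idx):
        batch = self.indices[idx * self.batch_size:(idx + 1) * self.batch_size]
        # Only this slice is copied and converted to a tensor
        return self.images[batch], self.masks[batch]

    def on_epoch_end(self):
        # Reshuffle the sample order between epochs
        if self.shuffle:
            np.random.shuffle(self.indices)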

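Hypothetical wiring into the original script (in TF 2.x, fit() accepts a Sequence directly; the batch size then comes from the generator, so batch_size is not passed to fit()):

train_gen = SegmentationSequence(Xtrain, Ytrain, batch_size=5)
historic = unet.fit(train_gen, epochs=1, callbacks=[checkpointer])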
