Unable to release GPU memory after training a CNN model using keras

Problem description

On a Google Colab notebook with keras (2.2.4) and tensorflow (1.13.1) as the backend, I am trying to tune a CNN. I use a simple, basic table of hyper-parameters and run my tests in a set of loops. My problem is that I can't free the GPU memory after each iteration, and Keras doesn't seem to release it automatically, so eventually I get a Resource Exhausted: Out Of Memory (OOM) error. I did some digging and ran into a function that assembles several solutions that have been suggested for this problem (it didn't work for me, though):

for params in hyper_parameters:
    Run_model(params)
    reset_keras()

my set of hyper-parameters being:

IMG_SIZE = 800, 1000, 3
BATCH_SIZEs = [4, 8, 16]
EPOCHSs = [5, 10, 50]
LRs = [0.008, 0.01]
MOMENTUMs = [0.04, 0.09]
DECAYs = [0.1]
VAL_SPLITs = [0.1]
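The original doesn't show how `hyper_parameters` itself is built from these lists; a minimal sketch, assuming a full grid search over every combination is intended, would use `itertools.product`:

```python
from itertools import product

# The same lists as above
BATCH_SIZEs = [4, 8, 16]
EPOCHSs = [5, 10, 50]
LRs = [0.008, 0.01]
MOMENTUMs = [0.04, 0.09]
DECAYs = [0.1]
VAL_SPLITs = [0.1]

# Cartesian product: 3 * 3 * 2 * 2 * 1 * 1 = 36 combinations
hyper_parameters = list(
    product(BATCH_SIZEs, EPOCHSs, LRs, MOMENTUMs, DECAYs, VAL_SPLITs)
)
```

Each element is then a tuple like `(4, 5, 0.008, 0.04, 0.1, 0.1)` that the loop can unpack.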

and the function I used to free the GPU memory:

import gc

import tensorflow as tf
from keras.backend import clear_session, get_session, set_session

def reset_keras():
    sess = get_session()
    clear_session()
    sess.close()
    sess = get_session()

    try:
        global model  # the model lives in the global namespace - change as needed
        del model
    except NameError:
        pass

    print(gc.collect())  # if it freed something you should see a number printed

    # use the same config as you used to create the session
    config = tf.ConfigProto()
    config.gpu_options.per_process_gpu_memory_fraction = 1
    config.gpu_options.visible_device_list = "0"
    set_session(tf.Session(config=config))

The only thing that I didn't fully grasp is the comment "use the same config as you used to create the session", since with Keras we don't explicitly choose a particular configuration. I get by for one iteration, sometimes two, but I can't go beyond that. I already tried changing the batch size, and for the moment I can't afford a machine with higher performance. I am using a custom image generator that inherits from keras.utils.Sequence. I monitor the state of the GPU memory using this piece of code:
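For context on that "same config" comment: when Keras creates its TF 1.x session implicitly, it uses a default `tf.ConfigProto`. A commonly suggested tweak (a sketch only, not guaranteed to fix this OOM) is to enable `allow_growth` so TensorFlow allocates GPU memory on demand instead of grabbing it all up front:

```python
import tensorflow as tf
from keras.backend import set_session

config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # allocate GPU memory on demand
set_session(tf.Session(config=config))
```

This makes the memory monitor below more informative, since usage then grows with the actual allocations instead of jumping straight to the full device.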

import os

import psutil
import GPUtil as GPU

def printmm():
    GPUs = GPU.getGPUs()
    gpu = GPUs[0]
    process = psutil.Process(os.getpid())  # current process, for CPU-side stats if needed
    return " Util {0:.2f}% ".format(gpu.memoryUtil * 100)

print(printmm())

Tags: performance, tensorflow, keras, gpu, conv-neural-network

Solution
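A pattern that reliably releases GPU memory between trials is to run each training in a child process: the CUDA context, and every byte TensorFlow allocated, is freed by the driver when that process exits, so no in-process session cleanup is needed. The following is a sketch under that approach; `run_trial` and `grid_search` are hypothetical names, and the training body is a placeholder to be replaced with the question's `Run_model(params)`:

```python
import multiprocessing as mp

def run_trial(params, queue):
    # All Keras/TensorFlow imports and training happen *inside* the child
    # process, so the CUDA context lives and dies with it.
    # (Placeholder: replace with the real Run_model(params) and its metric.)
    val_loss = 0.0
    queue.put((params, val_loss))

def grid_search(hyper_parameters):
    results = []
    for params in hyper_parameters:
        queue = mp.Queue()
        worker = mp.Process(target=run_trial, args=(params, queue))
        worker.start()
        results.append(queue.get())  # read before join to avoid a pipe deadlock
        worker.join()  # child exit is what actually frees the GPU memory
    return results
```

Each iteration then starts from a clean GPU, at the cost of re-importing TensorFlow per trial; that overhead is usually small next to training time.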

