python - Memory error while training my model: Unable to allocate 31.9 GiB for an array with shape (3094, 720, 1280, 3) and data type float32
Problem Description
So, I am providing labels to my images as "0" and "1" based on the presence of a human. When I pass all my images and try to train my model. I get a memory error.
import warnings
warnings.filterwarnings('ignore')
import tensorflow as tf
import tensorflow.keras
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.callbacks import ReduceLROnPlateau, CSVLogger, EarlyStopping
from tensorflow.keras.models import Model
from tensorflow.keras.layers import GlobalAveragePooling2D, Dense
from tensorflow.keras.applications.resnet50 import ResNet50
from PIL import Image
import os
import numpy as np
train_x = []
train_y = []
for path in os.listdir('C:\\Users\\maini_\\Desktop\\TestAndTrain\\in\\train'):
    img = Image.open('C:\\Users\\maini_\\Desktop\\TestAndTrain\\in\\train\\' + path)
    train_x.append(np.array(img))
    train_y.append(1)
    img.close()
for path in os.listdir('C:\\Users\\maini_\\Desktop\\TestAndTrain\\notin\\train'):
    img = Image.open('C:\\Users\\maini_\\Desktop\\TestAndTrain\\notin\\train\\' + path)
    train_x.append(np.array(img))
    train_y.append(0)
    img.close()
print("done")
train_x = np.array(train_x)
train_x = train_x.astype(np.float32)
train_x /= 255.0
train_y = np.array(train_y)
I am working with
- Jupyter Notebook version: 6.0.3
- Python version: 3.7
- Anaconda version: 4.8.3
Solution
You've tried to pass 3094 images of size 720x1280 into your model as one single batch, resulting in a total of 31.9 GiB worth of data. Your machine cannot allocate and hold that much data in memory at one time; you need to use batches.
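The 31.9 GiB figure follows directly from the array shape in the error message; a quick sanity check (pure arithmetic, nothing framework-specific):

```python
# One float32 value takes 4 bytes, so the full array
# (3094 images x 720 x 1280 pixels x 3 channels) needs:
n_bytes = 3094 * 720 * 1280 * 3 * 4
gib = n_bytes / 1024**3  # bytes -> GiB
print(f"{gib:.1f} GiB")  # -> 31.9 GiB
```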
Since you will run into trouble every time you try to process the data this way, I recommend using ImageDataGenerator() and flow_from_directory(), which will load the pictures for training automagically.
An ideal way to set this up is as follows:

train_datagen = ImageDataGenerator(
    rescale=1./255,
    validation_split=0.3)  # splits the data 70/30 for training and validation

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    color_mode='rgb',  # your images have 3 channels, so use 'rgb', not 'grayscale'
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='categorical',
    shuffle=True,
    subset='training')

validation_generator = train_datagen.flow_from_directory(
    train_data_dir,
    color_mode='rgb',
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='categorical',
    shuffle=True,
    subset='validation')
Then to fit the model you would call the model.fit_generator() method (in TensorFlow 2.1+ you can pass the generators straight to model.fit() instead):

model.fit_generator(train_generator, epochs=epochs, callbacks=callbacks, validation_data=validation_generator)
This is the best way to deal with large amounts of images when training models in Keras, as the data is generated (or flowed) from the directory one batch at a time rather than loaded into memory manually in Python. The only caveat is that the directory setup is slightly different from what you currently have. You will need to change the directory setup to
TestAndTrain
  -Train
    -in
    -notin
  -Test
    -in
    -notin
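If you'd rather not move the files by hand, a small script along these lines could do the reshuffle. This is a sketch, not part of the original answer: the restructure() helper and the base path are assumptions.

```python
import os
import shutil

def restructure(base):
    """Move files from the old <class>/<split> layout (in/train, notin/test, ...)
    to the <Split>/<class> layout that flow_from_directory() expects."""
    for cls in ('in', 'notin'):
        for old_split, new_split in (('train', 'Train'), ('test', 'Test')):
            src = os.path.join(base, cls, old_split)
            dst = os.path.join(base, new_split, cls)
            if not os.path.isdir(src):
                continue  # skip splits that don't exist
            os.makedirs(dst, exist_ok=True)
            for name in os.listdir(src):
                shutil.move(os.path.join(src, name), os.path.join(dst, name))

# restructure('C:\\Users\\maini_\\Desktop\\TestAndTrain')  # hypothetical call
```

After running it, train_data_dir would point at the Train folder, and flow_from_directory() would pick up in and notin as the two classes.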