tensorflow - Optimising GPU use for Keras model training
Problem description
I'm training a Keras model. During the training, I'm only utilising between 5 and 20% of my CUDA cores and an equally small proportion of my NVIDIA RTX 2070 memory. Model training is pretty slow currently and I would really like to take advantage of as many of my available CUDA cores as possible to speed this up!
nvidia-smi dmon  # (during model training)
# gpu   pwr gtemp mtemp    sm   mem   enc   dec  mclk  pclk
# Idx     W     C     C     %     %     %     %   MHz   MHz
    0    45    49     -     9     6     0     0  6801  1605
What parameters should I look to tune in order to increase CUDA core utilisation with the aim of training the same model faster?
Here's a simplified example of my current image generation and training steps (I can elaborate / edit, if required, but I currently believe these are the key steps for the purpose of the question):
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Scale pixel values from [0, 255] to [0, 1]
train_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    r'./input_training_examples',
    target_size=(150, 150),
    batch_size=32,
    class_mode='binary'
)
validation_generator = test_datagen.flow_from_directory(
    r'./input_validation_examples',
    target_size=(150, 150),
    batch_size=32,
    class_mode='binary'
)

# `model` is the compiled Keras model (definition omitted in this simplified example)
history = model.fit(
    train_generator,
    steps_per_epoch=128, epochs=30,
    validation_data=validation_generator, validation_steps=50,
)
Hardware: NVIDIA 2070 GPU
Platform: Linux 5.4.0-29-generic #33-Ubuntu x86_64, NVIDIA driver 440.64, CUDA 10.2, TensorFlow 2.2.0-rc3
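Solution
GPU utilisation is a tricky business; there are too many factors involved.
The obvious first thing to try: increase the batch size, so that each training step gives the GPU more work.
But that does not guarantee maximum utilisation. Your I/O may be slow, in which case the data generator is the bottleneck.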
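One way to check whether the data generator really is the bottleneck is to time it on its own, with no model involved. The sketch below is only an illustration: `batches_per_second` is a hypothetical helper (not part of Keras), and it assumes the generator can be consumed with `next()`, which the iterators returned by `flow_from_directory` support.

```python
import time
from itertools import islice


def batches_per_second(batch_iter, n_batches=50):
    """Pull up to n_batches from an iterator and return the measured rate.

    Hypothetical helper for illustration; it only times how fast batches
    can be produced, independent of any model or GPU work.
    """
    start = time.perf_counter()
    # islice stops after n_batches, so this also works on endless generators
    pulled = sum(1 for _ in islice(batch_iter, n_batches))
    elapsed = time.perf_counter() - start
    if pulled == 0:
        return 0.0
    return pulled / max(elapsed, 1e-9)


# Assumed usage with the generator from the question:
#   rate = batches_per_second(train_generator, n_batches=50)
#   print(f"generator alone: {rate:.1f} batches/s")
```

If the rate measured this way is close to the steps per second that `model.fit` reports, the GPU is most likely waiting on data, and a larger batch size alone will not raise utilisation much.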
If you have enough RAM, you can try loading the full dataset as a NumPy array.
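A minimal sketch of that idea, assuming the generator yields `(images, labels)` batches as `flow_from_directory` does with `class_mode='binary'`. The helper name `generator_to_arrays` is made up for this example. As a rough size check, a 150x150 RGB image stored as float32 takes about 270 KB, so 10,000 images need roughly 2.7 GB.

```python
import numpy as np


def generator_to_arrays(batch_iter, n_batches):
    """Collect n_batches of (images, labels) into two in-memory arrays.

    Hypothetical helper for illustration: it drains the generator once
    so that later epochs no longer touch the disk.
    """
    xs, ys = [], []
    for _ in range(n_batches):
        x, y = next(batch_iter)
        xs.append(x)
        ys.append(y)
    # Join the per-batch arrays along the sample axis
    return np.concatenate(xs), np.concatenate(ys)


# Assumed usage with the generator from the question; len(train_generator)
# is the number of batches in one pass over the directory:
#   x_train, y_train = generator_to_arrays(train_generator, len(train_generator))
#   model.fit(x_train, y_train, batch_size=128, epochs=30)
```

The one-off loading cost is paid before training starts. After that, `model.fit` reads batches straight from memory, which also makes it cheap to experiment with larger batch sizes.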
You can also try increasing the number of workers in a multiprocessing scheme.
model.fit(..., use_multiprocessing=True, workers=8)
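The value `workers=8` is only a starting point. As a sketch, the worker count could instead be derived from the machine's CPU count. The helper `loader_kwargs` and its heuristics (one reserved core, two queued batches per worker) are assumptions made for illustration. `workers`, `use_multiprocessing` and `max_queue_size` are arguments of `model.fit` in the TensorFlow 2.x release used in the question; other Keras versions may handle them differently.

```python
import os


def loader_kwargs(reserve_cores=1, queue_per_worker=2):
    """Build data-loading keyword arguments for model.fit (TF 2.x Keras).

    Hypothetical helper: the heuristics here are illustrative guesses,
    not recommended values.
    """
    # Leave some CPU capacity for the main training process
    workers = max(1, (os.cpu_count() or 1) - reserve_cores)
    return {
        "workers": workers,
        "use_multiprocessing": True,
        # Let each worker keep a couple of batches ready in the queue
        "max_queue_size": workers * queue_per_worker,
    }


# Assumed usage with the objects from the question:
#   history = model.fit(
#       train_generator,
#       steps_per_epoch=128, epochs=30,
#       validation_data=validation_generator, validation_steps=50,
#       **loader_kwargs(),
#   )
```

Watching the `sm` column of `nvidia-smi dmon` while varying the worker count is a simple way to see whether the extra workers actually keep the GPU busier.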
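Lastly, it depends on your model: if the model is too light and not deep enough, utilisation will be low, and there is no standard way to improve it further.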