Batch training in Keras LSTM

Question

If I use a batch_size of 32 in an LSTM made with Keras, is the loss function applied to each sequence and then averaged, or is it applied directly to all sequences without taking each sequence into account?

Thanks in advance.

Tags: keras, lstm, loss

Answer


With a batch_size of 1, the weights would be updated after every single sequence; with a batch_size of 32, they are updated once after each group of 32 sequences.

So the loss is computed per sequence, averaged over the 32 sequences in the batch, and the weights are updated once using that averaged loss. If instead the loss triggered a weight update for each sequence individually, that would just be plain SGD with batch_size = 1.
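This averaging can be checked directly. The sketch below (a minimal example assuming TensorFlow 2.x with `tf.keras`; the model size, sequence length, and data are arbitrary) compares the loss Keras reports for a full batch of 32 sequences against the mean of the 32 per-sequence losses:

```python
import numpy as np
import tensorflow as tf

np.random.seed(0)
tf.random.set_seed(0)

# Toy data: 32 sequences, each 10 timesteps of 4 features, one target each.
x = np.random.rand(32, 10, 4).astype("float32")
y = np.random.rand(32, 1).astype("float32")

# Small illustrative model; the exact architecture does not matter here.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(10, 4)),
    tf.keras.layers.LSTM(8),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="sgd", loss="mse")

# Loss reported for the whole batch of 32 sequences at once...
batch_loss = model.evaluate(x, y, batch_size=32, verbose=0)

# ...versus the mean of the losses computed one sequence at a time.
per_seq = [model.evaluate(x[i:i + 1], y[i:i + 1], verbose=0)
           for i in range(32)]

print(np.isclose(batch_loss, np.mean(per_seq), atol=1e-5))  # → True
```

The two values match, confirming that the batch loss is the average of the per-sequence losses, and that average is what the single gradient step for the batch is taken on.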

