Why does model.fit call __getitem__ (no_steps_in_epoch + 1) times in a single epoch?

Problem description

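I would like to understand why, when I call model.fit with a custom data generator, the method __getitem__ is called (DataGenerator.__len__() + 1) times instead of DataGenerator.__len__() times. As I understand it, DataGenerator.__len__() gives the number of steps in a single epoch.

In my real code I have a method (__data_generation()) that assumes __getitem__ is called no_steps_in_a_epoch times per epoch (it does not use the batch index, for several reasons...).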

import tensorflow.keras as keras
import numpy as np
class DataGenerator(keras.utils.Sequence):
    def __init__(self):
        self.counter = 0

    def __len__(self):
        "Denotes the number of batches per epoch"
        return 10

    def __getitem__(self, batch_index):
        "Generate one batch of data"
        self.counter += 1
        print("step:", self.counter, "batch index:", batch_index)
        return np.array([[1,2,3],[2,3,4]]), np.array([[1],[1]])
        # return self.__data_generation()

    # something else ...

# main
generator = DataGenerator()
model = keras.Sequential()
model.add(keras.layers.InputLayer(input_shape=(3,)))
model.add(keras.layers.Dense(2, activation="relu"))
model.add(keras.layers.Dense(1, activation="relu"))
model.compile(optimizer=keras.optimizers.Adam(), loss=keras.losses.MeanAbsoluteError())
model.fit(x=generator, epochs=1)

Console output

2021-06-04 15:51:53.799143: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
2021-06-04 15:51:53.799623: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2021-06-04 15:51:58.849246: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'nvcuda.dll'; dlerror: nvcuda.dll not found
2021-06-04 15:51:58.849511: W tensorflow/stream_executor/cuda/cuda_driver.cc:326] failed call to cuInit: UNKNOWN ERROR (303)
2021-06-04 15:51:58.868907: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: <FOOOOOOOOO>
2021-06-04 15:51:58.869366: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: <FOOOOOOOOO>
2021-06-04 15:51:58.870424: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
step: 1 batch index: 0
2021-06-04 15:51:59.016093: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
step: 2 batch index: 2
step: 3 batch index: 5
 1/10 [==>...........................] - ETA: 6s - loss: 0.9147step: 4 batch index: 7
step: 5 batch index: 3
step: 6 batch index: 0
step: 7 batch index: 1
step: 8 batch index: 8
step: 9 batch index: 4
step: 10 batch index: 6
step: 11 batch index: 9
10/10 [==============================] - 1s 2ms/step - loss: 0.9001

Tags: python, tensorflow, keras, deep-learning

Solution


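I think the "plus one" is related to an operation called tracing (https://www.tensorflow.org/guide/function#tracing), which TensorFlow uses to build graphs when necessary. Model.fit() builds the graph before training: the first pass is only used to build the graph, and after that the model starts looping over the whole dataset or data generator. The console output is consistent with one extra up-front call: batch index 0 is requested at step 1, before the progress bar appears, and the remaining ten calls (steps 2 to 11) cover indices 0 through 9 exactly once each, in shuffled order.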
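Whatever the internal cause of the extra call, a generator that derives its batch from batch_index rather than from a call counter is not affected by it. The following is a minimal sketch of that idea, not the asker's real __data_generation(). The class name IndexKeyedGenerator and the helper simulate_epoch are hypothetical, and the call pattern (one extra request for batch 0, then every index once in shuffled order) is an assumption taken from the console output, not from Keras's source. The class only mimics the __len__/__getitem__ protocol so the sketch runs without TensorFlow; real code would subclass keras.utils.Sequence and return NumPy arrays.

```python
import random


class IndexKeyedGenerator:
    """Hypothetical stand-in for a keras.utils.Sequence subclass.

    Only the __len__/__getitem__ protocol is mimicked, so this runs
    without TensorFlow installed.
    """

    def __init__(self, num_batches=10, batch_size=2):
        self.num_batches = num_batches
        self.batch_size = batch_size
        self.calls = 0  # kept only to show that a call counter overshoots

    def __len__(self):
        # Number of batches per epoch
        return self.num_batches

    def __getitem__(self, batch_index):
        self.calls += 1
        # Derive the sample range from batch_index, never from self.calls,
        # so an extra or repeated call cannot shift the data.
        start = batch_index * self.batch_size
        sample_ids = list(range(start, start + self.batch_size))
        features = [[i, i + 1, i + 2] for i in sample_ids]
        labels = [[1] for _ in sample_ids]
        return features, labels


def simulate_epoch(generator, seed=0):
    """Assumed call pattern, modelled on the console output above:
    one extra request for batch 0, then each index once, shuffled."""
    generator[0]  # the extra up-front call; its result is discarded
    order = list(range(len(generator)))
    random.Random(seed).shuffle(order)
    return [generator[i] for i in order]


gen = IndexKeyedGenerator()
batches = simulate_epoch(gen)
print("calls:", gen.calls, "batches used:", len(batches))
# prints: calls: 11 batches used: 10
```

The counter reaches 11, as in the question, but the ten batches used for the epoch still cover every sample exactly once, because each batch depends only on its index.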

