RuntimeError: __iter__() is only supported inside of tf.function or when eager execution is enabled. (Tensorflow 2.3.0)

Problem Description

I am trying to implement a character-based seq2seq approach to text normalization, similar to the approach used for machine translation.

I am new to TensorFlow and seq2seq learning. I have read various posts on Stack Overflow and other sites. I tried adding @tf.function and wrapping "next(iter(dataset))" in a function, but I am lost.
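
Roughly, that attempt looked like the following sketch (the wrapper name peek_first_batch is illustrative, not from the original code):

@tf.function
def peek_first_batch(ds):
    # Inside a tf.function, iterating over a tf.data.Dataset is allowed
    # even when eager execution is disabled.
    for inp, targ in ds.take(1):
        tf.print(tf.shape(inp), tf.shape(targ))

peek_first_batch(dataset)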

Given the user-permission restrictions on the university cluster, I can neither upgrade to TF 2.4 nor downgrade below 2.0.

Any help is greatly appreciated.

Here are the error and the offending code:

Traceback (most recent call last):
  File "/home/students/boye/EnAtt2.py", line 118, in <module>
    example_input_batch, example_target_batch = next(iter(dataset))
  File "/home/students/boye/ende/lib/python3.8/site-packages/tensorflow/python/data/ops/dataset_ops.py", line 417, in __iter__
    raise RuntimeError("__iter__() is only supported inside of tf.function "
RuntimeError: __iter__() is only supported inside of tf.function or when eager execution is enabled.

Here is my entire code:

import os
import time
import numpy as np
from sklearn.model_selection import train_test_split
import tensorflow as tf


config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True
config.gpu_options.allocator_type = 'BFC'
with tf.compat.v1.Session(config=config) as s:
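    # NOTE: in TF 2.x, entering a tf.compat.v1.Session context puts all of the
    # code indented below it into graph mode, i.e. eager execution is disabled.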

    path_to_file = "fra.txt"


    def create_dataset(path):

        input_texts = []
        target_texts = []

        with open(path, "r", encoding="utf-8") as f:
            lines = f.read().split("\n")

        for line in lines[:-1]:  # drop the trailing empty element left by the final newline
            input_text, target_text, _ = line.split("\t")
            # '\t' is the "start sequence" '\n' is the "end sequence" character
            input_text = '<\t>' + input_text + '<\n>'
            target_text = '<\t>' + target_text + '<\n>'
            input_texts.append(input_text)
            target_texts.append(target_text)

        return input_texts, target_texts

    # Tokenize sequences at the character level
    def tokenize(lang):
        lang_tokenizer = tf.keras.preprocessing.text.Tokenizer(
            filters='', char_level=True)

        # Convert sequences into internal vocab
        lang_tokenizer.fit_on_texts(lang)

        # Convert internal vocab to numbers
        tensor = lang_tokenizer.texts_to_sequences(lang)

        # Pad the tensors to assign equal length to all the sequences
        tensor = tf.keras.preprocessing.sequence.pad_sequences(tensor,
                                                               padding='post', dtype="int16")

        return tensor, lang_tokenizer


    # Load the dataset
    def load_dataset(path):
        # Create dataset (inp_lang = English, targ_lang = French)
        inp_lang, targ_lang = create_dataset(path)

        # Tokenize the sequences
        input_tensor, inp_lang_tokenizer = tokenize(inp_lang)
        print(input_tensor)
        target_tensor, targ_lang_tokenizer = tokenize(targ_lang)

        return input_tensor, target_tensor, inp_lang_tokenizer, targ_lang_tokenizer


    # Consider 50k examples (note: num_examples is never actually applied below)
    num_examples = 50000
    input_tensor, target_tensor, inp_lang, targ_lang = load_dataset(path_to_file)

    # Calculate max_length of the target and input tensors
    max_length_targ, max_length_inp = target_tensor.shape[1], input_tensor.shape[1]

    # Create training and validation sets using an 80/20 split
    input_tensor_train, input_tensor_val, target_tensor_train, target_tensor_val = train_test_split(input_tensor, target_tensor, test_size=0.2)


    # Show the mapping between token indices and characters
    def convert(lang, tensor):
        for t in tensor:
            if t != 0:
                print("%d ----> %s" % (t, lang.index_word[t]))


    print("Input Language; index to word mapping")
    convert(inp_lang, input_tensor_train[0])
    print()
    print("Target Language; index to word mapping")
    convert(targ_lang, target_tensor_train[0])

    # Essential model parameters
    BUFFER_SIZE = len(input_tensor_train)
    BATCH_SIZE = 1
    steps_per_epoch = len(input_tensor_train)//BATCH_SIZE
    embedding_dim = 256
    units = 256
    vocab_inp_size = len(inp_lang.word_index) + 1
    vocab_tar_size = len(targ_lang.word_index) + 1
    print(vocab_inp_size)
    dataset = tf.data.Dataset.from_tensor_slices((input_tensor_train, target_tensor_train)).shuffle(BUFFER_SIZE)
    dataset = dataset.batch(BATCH_SIZE, drop_remainder=True)

    example_input_batch, example_target_batch = next(iter(dataset))
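    # ^ this is the line that raises the RuntimeError quoted above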

Originally, I received this error:

Traceback (most recent call last):
  File "/home/students/boye/EnAtt2.py", line 320, in <module>
    batch_loss = train_step(inp, targ, enc_hidden)
  File "/home/students/boye/ende/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 580, in __call__
    result = self._call(*args, **kwds)
  File "/home/students/boye/ende/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 644, in _call
    return self._stateless_fn(*args, **kwds)
  File "/home/students/boye/ende/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 2420, in __call__
    return graph_function._filtered_call(args, kwargs)  # pylint: disable=protected-access
  File "/home/students/boye/ende/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 1661, in _filtered_call
    return self._call_flat(
  File "/home/students/boye/ende/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 1745, in _call_flat
    return self._build_call_outputs(self._inference_function.call(
  File "/home/students/boye/ende/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 593, in call
    outputs = execute.execute(
  File "/home/students/boye/ende/lib/python3.8/site-packages/tensorflow/python/eager/execute.py", line 59, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.ResourceExhaustedError:  [_Derived_]  OOM when allocating tensor with shape[7084032] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
     [[{{node concat_0}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

     [[decoder_303/gru_1/StatefulPartitionedCall]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

     [[gradient_tape/encoder/embedding/embedding_lookup/Reshape/_3324]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
 [Op:__inference_train_step_604161]

Function call stack:
train_step -> train_step -> train_step

Reducing the batch_size and the units (/dimensions) of the seq2seq model did not help. I tried to get rid of the OOM error by including these lines at the top:

config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True
config.gpu_options.allocator_type = 'BFC'
with tf.compat.v1.Session(config=config) as s:

That is why I do not want to run my code in eager mode or work with eager tensors.

Tags: python, tensorflow

Solution
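
The "with tf.compat.v1.Session(config=config)" block is TF1-style configuration. In TF 2.x, entering that context switches everything indented under it into graph mode, which disables eager execution; that is exactly the condition the RuntimeError complains about. The TF2 way to get the same allow-growth behavior, while keeping eager execution on, is tf.config.experimental.set_memory_growth. A minimal sketch of the rearrangement, assuming the rest of the script stays as in the question:

import tensorflow as tf

# TF2 replacement for the ConfigProto/Session block: request allow-growth
# for each GPU before anything initializes the devices, and keep eager on.
gpus = tf.config.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)

# ... the rest of the script follows at top level, with the
# "with tf.compat.v1.Session(config=config) as s:" line removed ...

# With eager execution intact, this no longer raises:
example_input_batch, example_target_batch = next(iter(dataset))

Alternatively, graph mode can be kept and the dataset iterated only inside a @tf.function, as attempted in the question. Either way, note that allow-growth only changes how GPU memory is allocated, not how much the model needs, so if the OOM persists it has to be attacked separately (smaller units, shorter sequences, or a smaller vocabulary).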

