python - AttributeError: 'tensorflow.python.framework.ops.EagerTensor' object has no attribute 'to_tensor'
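Problem description
I am fine-tuning a BERT model using the Hugging Face, Keras, and TensorFlow libraries.
Since yesterday, I have been getting this error when running my code in Google Colab. The odd thing is that the code used to run without any problems and suddenly started throwing this error. Even more suspicious, the same code runs fine in my Apple M1 TensorFlow setup. Again, I did not change anything in my code, yet it now fails in Google Colab, where it previously ran without issues.
Both environments have tensorflow 2.6.0.
I created the code below to reproduce the error. I hope you can shed some light on this.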
!pip install transformers
!pip install datasets
import pandas as pd
import numpy as np
import tensorflow as tf
from transformers import AutoTokenizer
from datasets import Dataset
# dummy sentences
sentences = ['the house is blue and big', 'this is fun stuff','what a horrible thing to say']
# create a pandas dataframe and convert it to a Hugging Face dataset
df = pd.DataFrame({'Text': sentences})
dataset = Dataset.from_pandas(df)
# download the BERT tokenizer
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
# tokenize each sentence in dataset
dataset_tok = dataset.map(lambda x: tokenizer(x['Text'], truncation=True, padding=True, max_length=10), batched=True)
# remove original text column and set format
dataset_tok = dataset_tok.remove_columns(['Text']).with_format('tensorflow')
# extract features
features = {x: dataset_tok[x].to_tensor() for x in tokenizer.model_input_names}
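Solution
Remove to_tensor().
to_tensor() is a method of tf.RaggedTensor. As the error message says, dataset_tok[x] here is a regular EagerTensor, which is already dense, so there is nothing to convert and the call can simply be dropped. One possible explanation for the sudden change, not confirmed in the question, is that `!pip install datasets` pulled a newer release in Colab that returns a dense tensor when all rows have the same length.
The given code works after following @Harold G's suggestion: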
!pip install transformers
!pip install datasets
import pandas as pd
import numpy as np
import tensorflow as tf
from transformers import AutoTokenizer
from datasets import Dataset
# dummy sentences
sentences = ['the house is blue and big', 'this is fun stuff','what a horrible thing to say']
# create a pandas dataframe and convert it to a Hugging Face dataset
df = pd.DataFrame({'Text': sentences})
dataset = Dataset.from_pandas(df)
# download the BERT tokenizer
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
# tokenize each sentence in dataset
dataset_tok = dataset.map(lambda x: tokenizer(x['Text'], truncation=True, padding=True, max_length=10), batched=True)
# remove original text column and set format
dataset_tok = dataset_tok.remove_columns(['Text']).with_format('tensorflow')
# extract features
features = {x: dataset_tok[x] for x in tokenizer.model_input_names}