python - Tensorflow 2.1 - make_csv_dataset - ValueError: Received a feature column from TensorFlow v1, but this is a TensorFlow v2 Estimator
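Problem description
I'm having a hard time figuring out what's going on here (and I see most people are still trying to figure out TF 2.1). Below is my problem and some of the solutions I've already tried.
I am trying to use AdaNet to start a TensorFlow Estimator training session by creating a tf.data.Dataset from an imported .csv file. I am running: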
Python 3.6
Windows 10
tensorflow==2.1.0
pandas==0.25.1
numpy==1.16.5
This error...:
ValueError: Received a feature column from TensorFlow v1, but this is a TensorFlow v2 Estimator. Please either use v2 feature columns (accessible via tf.feature_column.* in TF 2.x) with this Estimator, or switch to a v1 Estimator for use with v1 feature columns (accessible via tf.compat.v1.estimator.* and tf.compat.v1.feature_column.*, respectively.
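...is generated by this code (it is well commented. I'm posting all of it because I really don't know which part is giving me the error. Yes, rebuilding the list of column names I want to use in each step is annoying, but that's how I'm keeping it for now):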
import numpy as np
import pandas as pd
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
import warnings
warnings.filterwarnings("once")
import adanet
import tensorflow as tf
from tensorflow.estimator import BinaryClassHead, MultiClassHead
# This will be a binary classification problem
head = BinaryClassHead()
# Import the dataset we're going to train with, just to get a list of the column names
# we want our estimator to reference
df = pd.read_csv('./datasets/call_restored_df_' + str(4) + '_' + 'SPY' + '.csv')
df = df.set_index(['Date'])
df['class'] = df['class'].astype('int32')
# Create a list of all the column names
feature_columns = list(df.columns)
# Remove the columns we aren't going to use during training
feature_columns.remove('Ticker')
feature_columns.remove('DailyChange')
feature_columns.remove('DailyHighChange')
feature_columns.remove('DailyLowChange')
# Adanet estimator
# Learn to ensemble linear and DNN models.
estimator = adanet.AutoEnsembleEstimator(
    head=head,
    candidate_pool=lambda config: {
        "linear":
            tf.estimator.LinearEstimator(
                head=head,
                feature_columns=feature_columns,
                config=config,
                optimizer='Adagrad'),
        "dnn":
            tf.estimator.DNNEstimator(
                head=head,
                feature_columns=feature_columns,
                config=config,
                optimizer='Adagrad',
                hidden_units=[1000, 500, 100])},
    max_iteration_steps=50)
# Input builders
# Define our train function called by the estimator during training to return
# a tf.data.Dataset (x, y) tuple
def input_fn_train():
    # Do the same thing to collect a list of usable column names
    df = pd.read_csv('./datasets/call_restored_df_' + str(4) + '_' + 'SPY' + '.csv')
    df = df.set_index(['Date'])
    df['class'] = df['class'].astype('int32')
    feature_columns_list = list(df.columns)
    feature_columns_list.remove('Ticker')
    feature_columns_list.remove('DailyChange')
    feature_columns_list.remove('DailyHighChange')
    feature_columns_list.remove('DailyLowChange')
    # Make our tf.data.Dataset from the same .csv file as before
    df = tf.data.experimental.make_csv_dataset(
        './datasets/call_restored_df_' + str(4) + '_' + 'SPY' + '.csv',
        batch_size=32,
        label_name="class",
        select_columns=feature_columns_list)
    df_batches = (
        df.cache().repeat().shuffle(500)
        .prefetch(tf.data.experimental.AUTOTUNE))
    return df_batches
# Get the estimator to train ...
estimator.train(input_fn=input_fn_train, steps=100)
So, given that error, I replaced every instance of tf. in the code above with tf.compat.v1., and got this error:
ValueError: Items of feature_columns must be a _FeatureColumn. Given (type <class 'str'>): Close_Resistance.
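After more searching, I found that, for some reason, every column has to be declared as a numeric column type, so I implemented these two loops to convert my two lists of column names to numeric columns (after reverting to tf. instead of tf.compat.v1.):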
...
feature_columns.remove('DailyLowChange')
# Make all the feature columns numeric type for TF 2.1 for some reason
new_feature_list = []
for i in feature_columns:
    new_feature_list.append(tf.feature_column.numeric_column(i))
# Adanet estimator
# Learn to ensemble linear and DNN models.
estimator = adanet.AutoEnsembleEstimator(
...
and
...
    feature_columns_list.remove('DailyLowChange')
    # Make all the feature columns numeric type for TF 2.1 for some reason
    new_feature_columns_list = []
    for i in feature_columns_list:
        new_feature_columns_list.append(tf.feature_column.numeric_column(i))
    # Make our tf.data.Dataset from the same .csv file as before
    df = tf.data.experimental.make_csv_dataset(
...
...and now I get this error:
TypeError: not all arguments converted during string formatting
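So I don't know what to do. I want to get this working with TF 2.1, but I'm frustrated by the failures. I saw in this post that there is a solution, but my .csv file has too many columns to go through one at a time and define each as a numeric type, so I need this to be dynamic no matter how many columns are being loaded. Can anyone help? Thanks.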
Solution
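A minimal sketch of one way to keep this dynamic, under a few assumptions: every remaining column is numeric, the label column is class, and the excluded names are the ones listed in the question. The last TypeError appears to come from passing NumericColumn objects to select_columns, which expects plain column names or indices. The estimators, in turn, need feature column objects rather than strings, and the label should not be among them. The helper names read_column_names and build_feature_columns are hypothetical and are not part of TensorFlow or AdaNet.

```python
import csv

# Column names taken from the question; adjust for other files.
EXCLUDED = ("Ticker", "DailyChange", "DailyHighChange", "DailyLowChange")
LABEL = "class"


def read_column_names(csv_path, index_col="Date", excluded=EXCLUDED):
    """Read only the header row and return the column names to load.

    The result still contains the label column, because
    make_csv_dataset needs it in select_columns to find label_name.
    """
    with open(csv_path, newline="") as f:
        header = next(csv.reader(f))
    return [c for c in header if c != index_col and c not in excluded]


def build_feature_columns(column_names, label=LABEL):
    """Wrap every non-label name in a v2 numeric feature column.

    Assumes all of these columns hold numeric data.
    """
    # Imported lazily so read_column_names can be used without TensorFlow.
    import tensorflow as tf
    return [tf.feature_column.numeric_column(c)
            for c in column_names if c != label]
```

With these helpers, the list of strings from read_column_names would go to select_columns in make_csv_dataset, and the list from build_feature_columns would go to feature_columns in both candidate estimators, so the two arguments never share the same list.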