TensorFlow 2.1 - make_csv_dataset - ValueError: Received a feature column from TensorFlow v1, but this is a TensorFlow v2 Estimator

Problem description

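I'm having a hard time figuring out what is going on here (as, from what I can see, are most people trying to get to grips with TF 2.1). Below are my problem and the fixes I have already tried.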

I'm trying to start a TensorFlow Estimator training session with AdaNet, feeding it a tf.data.Dataset created from an imported .csv file. I am running:

Python 3.6
Windows 10
tensorflow==2.1.0
pandas==0.25.1
numpy==1.16.5

This error...:

ValueError: Received a feature column from TensorFlow v1, but this is a TensorFlow v2 Estimator. Please either use v2 feature columns (accessible via tf.feature_column.* in TF 2.x) with this Estimator, or switch to a v1 Estimator for use with v1 feature columns (accessible via tf.compat.v1.estimator.* and tf.compat.v1.feature_column.*, respectively.

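...is produced by this code (it is fairly well commented; I'm posting all of it because I really don't know which part is causing the error. Yes, rebuilding the list of column names I want to use at each step is tedious, but that's how I'm keeping it for now):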

import numpy as np
import pandas as pd
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 
import warnings
warnings.filterwarnings("once")
import adanet
import tensorflow as tf
from tensorflow.estimator import BinaryClassHead, MultiClassHead


# This will be a binary classification problem
head = BinaryClassHead()


# Import the dataset we're going to train with, just to get a list of the column names
# we want our estimator to reference
df = pd.read_csv('./datasets/call_restored_df_' + str(4) + '_' + 'SPY' + '.csv')
df = df.set_index(['Date'])
df['class'] = df['class'].astype('int32')

# Create a list of all the column names
feature_columns = list(df.columns)

# Remove the columns we aren't going to use during training
feature_columns.remove('Ticker')
feature_columns.remove('DailyChange')
feature_columns.remove('DailyHighChange')
feature_columns.remove('DailyLowChange')


# Adanet estimator
# Learn to ensemble linear and DNN models.
estimator = adanet.AutoEnsembleEstimator(
    head=head,
    candidate_pool=lambda config: {
        "linear":
            tf.estimator.LinearEstimator(
                head=head,
                feature_columns=feature_columns,
                config=config,
                optimizer='Adagrad'),
        "dnn":
            tf.estimator.DNNEstimator(
                head=head,
                feature_columns=feature_columns,
                config=config,
                optimizer='Adagrad',
                hidden_units=[1000, 500, 100])},
    max_iteration_steps=50)


# Input builders
# Define our train function called by the estimator during training to return
# a tf.data.Dataset (x, y) tuple
def input_fn_train():
    # Do the same thing to collect a list of usable column names
    df = pd.read_csv('./datasets/call_restored_df_' + str(4) + '_' + 'SPY' + '.csv')
    df = df.set_index(['Date'])
    df['class'] = df['class'].astype('int32')
    feature_columns_list = list(df.columns)
    feature_columns_list.remove('Ticker')
    feature_columns_list.remove('DailyChange')
    feature_columns_list.remove('DailyHighChange')
    feature_columns_list.remove('DailyLowChange')

    # Make our tf.data.Dataset from the same .csv file as before
    df = tf.data.experimental.make_csv_dataset(
      './datasets/call_restored_df_' + str(4) + '_' + 'SPY' + '.csv',
      batch_size=32,
      label_name="class",
      select_columns=feature_columns_list)

    df_batches = (
      df.cache().repeat().shuffle(500)
      .prefetch(tf.data.experimental.AUTOTUNE))
    return df_batches

# Get the estimator to train ...
estimator.train(input_fn=input_fn_train, steps=100)

So, given that error, I replaced every instance of tf. in the code above with tf.compat.v1., and got this error instead:

ValueError: Items of feature_columns must be a _FeatureColumn. Given (type <class 'str'>): Close_Resistance.

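After some more searching, I found that, for some reason, every column has to be declared as a numeric column type, so I added these two loops to convert both of my column-name lists to numeric columns (after reverting to tf. instead of tf.compat.v1.):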

...
feature_columns.remove('DailyLowChange')


# Make all the feature columns numeric type for TF 2.1 for some reason
new_feature_list = []
for i in feature_columns:
    new_feature_list.append(tf.feature_column.numeric_column(i))

# Adanet estimator
# Learn to ensemble linear and DNN models.
estimator = adanet.AutoEnsembleEstimator(
...

...
feature_columns_list.remove('DailyLowChange')

# Make all the feature columns numeric type for TF 2.1 for some reason
new_feature_columns_list = []
for i in feature_columns_list:
    new_feature_columns_list.append(tf.feature_column.numeric_column(i))

# Make our tf.data.Dataset from the same .csv file as before
df = tf.data.experimental.make_csv_dataset(
...

...and now I get this error:

TypeError: not all arguments converted during string formatting

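So I don't know what to do. I'd like to get this working with TF 2.1, but I'm frustrated by the failures. I saw in this post that there is a solution, but my .csv file has too many columns to go through one at a time and define each as a numeric type, so I need this to be dynamic, no matter how many columns are loaded. Someone please help! Thanks.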

Tags: python, python-3.x, tensorflow, adanet

Solution
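
One possible way forward, offered as a sketch rather than a confirmed fix: the estimators expect `feature_columns` to hold feature-column objects, while `select_columns` in `make_csv_dataset` expects plain column-name strings (or integer indices), so the two lists probably need to be kept separate. The label column (`class`) would stay in `select_columns` so that `label_name` can find it, but would be left out of `feature_columns`, since it is not a model input. The helper names `read_header`, `split_columns`, and `to_numeric_columns` below are hypothetical, and the sketch assumes every remaining column holds numeric data.

```python
import csv

LABEL = "class"
# Columns the question drops before training. 'Date' is used as the pandas
# index in the question's code, so it is excluded here as well.
EXCLUDED = {"Date", "Ticker", "DailyChange", "DailyHighChange", "DailyLowChange"}


def read_header(csv_path):
    """Return the column names from the first row of a CSV file."""
    with open(csv_path, newline="") as f:
        return next(csv.reader(f))


def split_columns(header, label=LABEL, excluded=EXCLUDED):
    """Split a header into (names for select_columns, feature names).

    select_names keeps the label, because make_csv_dataset has to read it;
    feature_names drops it, because the label is not a model input.
    """
    select_names = [c for c in header if c not in excluded]
    feature_names = [c for c in select_names if c != label]
    return select_names, feature_names


def to_numeric_columns(feature_names):
    """Wrap each name in a numeric feature column (assumes numeric data)."""
    # Imported here so the pure-Python helpers above work without TensorFlow.
    import tensorflow as tf
    return [tf.feature_column.numeric_column(name) for name in feature_names]


# Example with a made-up header:
# split_columns(["Date", "Ticker", "Close_Resistance", "class"])
# -> (["Close_Resistance", "class"], ["Close_Resistance"])
```

With these helpers, `feature_columns=to_numeric_columns(feature_names)` would be passed to both `LinearEstimator` and `DNNEstimator`, while `select_columns=select_names` (still plain strings) would be passed to `make_csv_dataset`. Because the lists are derived from the file's header, nothing has to be typed out per column.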

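As for the final `TypeError`, the question does not include the traceback, so what follows is only a guess. As far as I know, the objects returned by `tf.feature_column.numeric_column` are namedtuples, and `%`-formatting a multi-field tuple into a message with a single `%s` raises exactly this error. That could happen if the feature-column objects were passed somewhere that expects name strings, for example `select_columns`. `FakeColumn` below is a hypothetical stand-in so the demonstration runs without TensorFlow.

```python
from collections import namedtuple

# Hypothetical stand-in for a feature-column object: a tuple with several fields.
FakeColumn = namedtuple("FakeColumn", ["key", "shape", "dtype"])


def describe(column):
    """Format a message the way library code might for a column *name*."""
    return "Unknown column: %s" % column


def try_describe(column):
    """Return the formatted message, or the error text if formatting fails."""
    try:
        return describe(column)
    except TypeError as err:
        return "TypeError: %s" % err


# A plain string formats fine; a multi-field tuple does not, because the
# % operator treats the tuple as a list of arguments for a single %s.
print(try_describe("Close_Resistance"))
print(try_describe(FakeColumn("Close_Resistance", (1,), "float32")))
```

If this is what is happening, keeping the string names for `make_csv_dataset` and reserving the feature-column objects for the estimators should avoid the error.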
