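tensorflow - How to use tensorflow sequence_numeric_column with RNNClassifier?
Question
I am looking at the tensorflow contrib API, and I would like to use the RNNClassifier provided by Tensorflow 1.13. Unlike the non-sequence estimators, this one accepts only sequence feature columns. However, I cannot get it to work on a toy dataset: I keep getting an error when using sequence_numeric_column.
Here is the structure of my toy dataset: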
idSeq,kind,label,size
0,0,dwarf,117.6
0,0,dwarf,134.4
0,0,dwarf,119.0
0,1,human,168.0
0,1,human,145.25
0,2,elve,153.9
0,2,elve,218.49999999999997
0,2,elve,210.9
1,0,dwarf,166.6
1,0,dwarf,168.0
1,0,dwarf,131.6
1,1,human,150.5
1,1,human,208.25
1,1,human,210.0
1,2,elve,199.5
1,2,elve,161.5
1,2,elve,197.6
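Here idSeq tells us which rows belong to which sequence. I want to predict the "kind" column from the "size" column.
Below is the code that trains the RNN on my dataset.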
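To make the sequence structure concrete, here is a minimal sketch that groups a few of the rows above into variable-length sequences of sizes. It assumes that one sequence is the run of rows sharing the same (idSeq, kind) pair, which is only one reading of the sample and is not stated by the dataset itself; the helper name group_sequences is hypothetical.

```python
import csv
import io
from collections import OrderedDict

# A few rows copied from the toy dataset above.
SAMPLE_CSV = """idSeq,kind,label,size
0,0,dwarf,117.6
0,0,dwarf,134.4
0,0,dwarf,119.0
0,1,human,168.0
0,1,human,145.25
1,0,dwarf,166.6
"""


def group_sequences(csv_text):
    """Group rows into sequences keyed by (idSeq, kind).

    Assumption: rows that share both idSeq and kind form one sequence,
    and 'kind' is the single label for that whole sequence.
    """
    sequences = OrderedDict()
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (int(row["idSeq"]), int(row["kind"]))
        sequences.setdefault(key, []).append(float(row["size"]))
    return sequences


sequences = group_sequences(SAMPLE_CSV)
labels = [kind for (_, kind) in sequences]           # one label per sequence
lengths = [len(sizes) for sizes in sequences.values()]  # sequences differ in length
for (id_seq, kind), sizes in sequences.items():
    print(id_seq, kind, sizes)
```

Under this reading the sequences have different lengths, so a batch of them is ragged rather than a regular table of rows.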
import os

import numpy as np
import pandas as pd
import tensorflow as tf
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
tf.logging.set_verbosity(tf.logging.INFO)
dataframe = pd.read_csv("data_rnn.csv")
dataframe_test = pd.read_csv("data_rnn_test.csv")
train_x = dataframe
train_y = dataframe.loc[:,(["kind"])]
size_feature_col = tf.contrib.feature_column.sequence_numeric_column('size')
estimator = tf.contrib.estimator.RNNClassifier(
    sequence_feature_columns=[size_feature_col],
    num_units=[32, 16],
    cell_type='lstm',
    model_dir=None,
    n_classes=3,
    optimizer='Adagrad'
)
def make_dataset(
        batch_size,
        x,
        y=None,
        shuffle=False,
        shuffle_buffer_size=1000,
        shuffle_seed=1):
    """
    An input function for training, evaluation or prediction.

    Parameters
    ----------------------
    batch_size: integer
        the size of the batch to use for the training of the neural network
    x: pandas dataframe
        dataframe that contains the features of the samples to study
    y: pandas dataframe or array (Default: None)
        dataframe or array that contains the values to predict of the samples
        to study. If none, we want a dataset for evaluation or prediction.
    shuffle: boolean (Default: False)
        if True, we shuffle the elements of the dataset
    shuffle_buffer_size: integer (Default: 1000)
        if we shuffle the elements of the dataset, it is the size of the buffer
        used for it.
    shuffle_seed : integer
        the random seed for the shuffling

    Returns
    ---------------------
    dataset.make_one_shot_iterator().get_next(): Tensor
        a nested structure of tf.Tensors containing the next element of the
        dataset to study
    """
    def input_fn():
        if y is not None:
            dataset = tf.data.Dataset.from_tensor_slices((dict(x), y))
        else:
            dataset = tf.data.Dataset.from_tensor_slices(dict(x))
        if shuffle:
            dataset = dataset.shuffle(
                buffer_size=shuffle_buffer_size,
                seed=shuffle_seed).batch(batch_size).repeat()
        else:
            dataset = dataset.batch(batch_size)
        return dataset.make_one_shot_iterator().get_next()
    return input_fn
batch_size = 50
random_seed = 1
input_fn_train = make_dataset(
    batch_size=batch_size,
    x=train_x,
    y=train_y,
    shuffle=True,
    shuffle_buffer_size=len(train_x),
    shuffle_seed=random_seed)
estimator.train(input_fn=input_fn_train, steps=5000)
But all I get is the following error:
INFO:tensorflow:Calling model_fn.
Traceback (most recent call last):
  File "main.py", line 125, in <module>
    estimator.train(input_fn=input_fn_train, steps=5000)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 358, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1124, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1154, in _train_model_default
    features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1112, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/contrib/estimator/python/estimator/rnn.py", line 512, in _model_fn
    config=config)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/contrib/estimator/python/estimator/rnn.py", line 332, in _rnn_model_fn
    logits, sequence_length_mask = logit_fn(features=features, mode=mode)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/contrib/estimator/python/estimator/rnn.py", line 226, in rnn_logit_fn
    features=features, feature_columns=sequence_feature_columns)
  File "/root/.local/lib/python3.5/site-packages/tensorflow/contrib/feature_column/python/feature_column/sequence_feature_column.py", line 120, in sequence_input_layer
    trainable=trainable)
  File "/root/.local/lib/python3.5/site-packages/tensorflow/contrib/feature_column/python/feature_column/sequence_feature_column.py", line 496, in _get_sequence_dense_tensor
    sp_tensor, default_value=self.default_value)
  File "/root/.local/lib/python3.5/site-packages/tensorflow/python/ops/sparse_ops.py", line 1432, in sparse_tensor_to_dense
    sp_input = _convert_to_sparse_tensor(sp_input)
  File "/root/.local/lib/python3.5/site-packages/tensorflow/python/ops/sparse_ops.py", line 68, in _convert_to_sparse_tensor
    raise TypeError("Input must be a SparseTensor.")
TypeError: Input must be a SparseTensor.
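So I don't understand what I am doing wrong: the documentation says that we have to give the RNNEstimator sequence columns, but it says nothing about passing a sparse tensor.
Thank you in advance for your help and advice.
Solution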
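The traceback ends in sparse_tensor_to_dense, called from the sequence column's _get_sequence_dense_tensor, which suggests that sequence_numeric_column expects its feature as a SparseTensor rather than the dense per-row values produced by from_tensor_slices. As a minimal sketch, not a verified fix, the snippet below builds the three pieces that tf.SparseTensor takes (indices, values, dense_shape) from variable-length sequences in plain Python. The helper name to_sparse_components and the sample batch are hypothetical, and the [batch, max_length] layout is an assumption.

```python
def to_sparse_components(sequences):
    """Return (indices, values, dense_shape) for a ragged batch.

    The value at step t of sequence b gets the index [b, t]; the dense
    shape is [number of sequences, longest sequence length].
    """
    indices = []
    values = []
    for b, seq in enumerate(sequences):
        for t, value in enumerate(seq):
            indices.append([b, t])
            values.append(value)
    max_len = max((len(seq) for seq in sequences), default=0)
    dense_shape = [len(sequences), max_len]
    return indices, values, dense_shape


# Hypothetical batch: two sequences of sizes taken from the toy dataset.
batch = [[117.6, 134.4, 119.0], [168.0, 145.25]]
indices, values, dense_shape = to_sparse_components(batch)
print(indices)
print(values)
print(dense_shape)
```

In TensorFlow 1.x these lists could then be passed as tf.SparseTensor(indices=indices, values=values, dense_shape=dense_shape) for the "size" feature inside the input_fn, with one label per sequence. Whether that alone is enough for RNNClassifier has not been checked here.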