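python - Am I missing something? Error in a simple classifier input function in TensorFlow
Question
I have been following the freeCodeCamp tutorial on TensorFlow and am trying to adapt its basic classifier to one of my own structured datasets.
I have a training dataset and a test dataset, each containing some integer columns and some string columns. I am trying to predict the value in the Allocated column, but calling the classifier.train method keeps raising this error: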
UnimplementedError: Cast string to float is not supported
[[{{node head/losses/Cast}}]]
During handling of the above exception, another exception occurred:
UnimplementedError Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
1392 '\nsession_config.graph_options.rewrite_options.'
1393 'disable_meta_optimizer = True')
-> 1394 raise type(e)(node_def, op, message) # pylint: disable=no-value-for-parameter
1395
1396 def _extend_graph(self):
UnimplementedError: Cast string to float is not supported
[[node head/losses/Cast (defined at /usr/local/lib/python3.7/dist-packages/tensorflow_estimator/python/estimator/head/binary_class_head.py:255) ]]
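I tried converting the dataset so that all values are integers or floats, but I keep getting the same error. As far as I know, the classifier should be able to work with different data types, so unless I need to declare them somewhere, I don't see why this is a problem.
I know the data is being read correctly, because it is all formatted properly when I call .head(). I have been stuck on this error for days and cannot figure out what I am missing. Any help would be greatly appreciated. My code is below.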
%tensorflow_version 2.x
from __future__ import absolute_import, division, print_function, unicode_literals
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from IPython.display import clear_output
from six.moves import urllib
import tensorflow.compat.v2.feature_column as fc
import tensorflow as tf
CSV_COLUMN_NAMES = ['GroupNumber', 'GroupUnit', 'GroupSkill1', 'GroupSkill2', 'GroupSkill3', 'GroupSkill4', 'GroupPreference1',
                    'GroupPreference2', 'GroupPreference3', 'ProjectNumber', 'ProjectUnit', 'ProjectSkill1', 'ProjectSkill2', 'ProjectSkill3', 'ProjectSkill4', 'ProjectPreference1', 'ProjectPreference2', 'ProjectPreference3', 'Allocated']
ALLOCATED = [0, 1]
train = pd.read_csv('https://raw.githubusercontent.com/nickjackson862/machine-learning/main/trainData40_10.csv', names=CSV_COLUMN_NAMES, header=0)
test = pd.read_csv('https://raw.githubusercontent.com/nickjackson862/machine-learning/main/testData40_10.csv', names=CSV_COLUMN_NAMES, header=0)
train_y = train.pop('Allocated')
test_y = test.pop('Allocated')
train.head()
def input_fn(features, labels, training=True, batch_size=100):
    dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))
    if training:
        dataset = dataset.shuffle(10).repeat()
    return dataset.batch(batch_size)
my_feature_columns = []
for key in train.keys():
    my_feature_columns.append(tf.feature_column.numeric_column(key=key))
classifier = tf.estimator.DNNClassifier(
    feature_columns=my_feature_columns,
    hidden_units=[50, 20],
    n_classes=2)
classifier.train(
    input_fn=lambda: input_fn(train, train_y, training=True),
    steps=100)
eval_result = classifier.evaluate(
    input_fn=lambda: input_fn(test, test_y, training=False))
print('\nTest set accuracy: {accuracy:0.3f}\n'.format(**eval_result))
Solution
I found the problem in the line where you create the feature columns.
my_feature_columns.append(tf.feature_column.numeric_column(key=key))
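You are declaring every feature as a numeric feature, but looking at the dataset, several of its fields are strings (by the way, the CSV files are publicly accessible, which you may want to fix).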
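One way to confirm which fields are strings is to ask pandas for the column dtypes before building any feature columns. The sketch below uses a small made-up frame in place of the real CSV; the skill values ("Java", "Python", "SQL") are assumptions for illustration, not taken from the dataset.

```python
import pandas as pd

# Hypothetical miniature of the training frame: two integer columns and one
# string column, mirroring the mix of types described in the question.
sample = pd.DataFrame({
    "GroupNumber": [1, 2, 3],
    "GroupSkill1": ["Java", "Python", "SQL"],  # made-up values
    "ProjectNumber": [10, 11, 12],
})

# Columns that pandas parsed as numbers are safe for numeric_column;
# everything else needs dropping or encoding first.
numeric_cols = sample.select_dtypes(include="number").columns.tolist()
string_cols = [c for c in sample.columns if c not in numeric_cols]

print("numeric:", numeric_cols)
print("non-numeric:", string_cols)
```

Running the same two lines on the real train frame would list exactly the columns that cannot be passed to numeric_column as they are.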
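"I tried converting the dataset so that all values are integers or floats, but I keep getting the same error."
I believe something went wrong in that conversion. I just ran your code with all of the string-typed columns removed, and it ran successfully without errors. All I did was add the following lines after reading the CSV files: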
train.drop(columns=['GroupSkill1', 'GroupSkill2', 'GroupSkill3', 'GroupSkill4', 'ProjectSkill1', 'ProjectSkill2', 'ProjectSkill3', 'ProjectSkill4'], inplace=True)
test.drop(columns=['GroupSkill1', 'GroupSkill2', 'GroupSkill3', 'GroupSkill4', 'ProjectSkill1', 'ProjectSkill2', 'ProjectSkill3', 'ProjectSkill4'], inplace=True)
See this article for advice on creating feature columns for your non-numeric data: https://www.tensorflow.org/tutorials/structured_data/feature_columns
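If you would rather keep the skill columns than drop them, one option is to encode them on the pandas side before the data reaches the estimator. This is a swapped-in alternative, not the feature-column approach from the linked tutorial: a minimal sketch of one-hot encoding with pd.get_dummies. The frames, the helper name one_hot, and the skill values are all hypothetical.

```python
import pandas as pd

# Hypothetical stand-ins for the train/test frames; the skill values are made up.
train_df = pd.DataFrame({"GroupNumber": [1, 2, 3],
                         "GroupSkill1": ["Java", "Python", "SQL"]})
test_df = pd.DataFrame({"GroupNumber": [4, 5],
                        "GroupSkill1": ["Python", "Go"]})


def one_hot(df, string_cols):
    """Replace each listed string column with 0/1 indicator columns."""
    return pd.get_dummies(df, columns=string_cols, dtype="int64")


train_enc = one_hot(train_df, ["GroupSkill1"])

# Reuse the training columns so both frames expose identical features:
# categories seen only in the test set ("Go" here) are dropped, and
# training categories absent from the test set are filled with 0.
test_enc = one_hot(test_df, ["GroupSkill1"]).reindex(
    columns=train_enc.columns, fill_value=0)

print(train_enc.columns.tolist())
print(test_enc)
```

After this step every column is numeric, so a loop that builds a numeric_column per key should no longer meet string values. The trade-off is that high-cardinality string columns produce many indicator columns.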