tensorflow - Why do I get a shape error when passing a batch from the TensorFlow Dataset API to my session ops?
Problem description
I am working on converting to the Dataset API, and I think I simply don't have enough experience with it to know how to handle the situation below. We currently perform image augmentation using queues and batching. I was tasked with evaluating the new Dataset API and converting our existing implementation to use it instead of queues.
What we want to do is obtain a reference to all of the paths and handle all further processing from that reference alone. As you can see in the dataset initialization, I have mapped parse_fn onto the dataset itself, which then reads the file and extracts the initial values from the filename. However, when I call the iterator's next_batch method and pass those values to get_summary, I now get an error about shapes. I have tried a number of things that keep changing the error, so I figured I should check whether anyone on SO can see that I'm approaching this entirely wrong and should take a different route. Is there something about the way I'm using the Dataset API that is fundamentally incorrect?
Should I no longer be calling the ops this way? I've noticed that most examples I've seen fetch the batch, pass the variables into the ops, capture the result in a variable, and pass that to sess.run, but I haven't found a straightforward way to do that with our setup without errors, so this is the approach I took (though it still errors). I'll keep trying to track the problem down and will post here if I find anything, but if anyone spots something, please advise. Thanks!
Current error:
    ... in get_summary
        summary, acc = sess.run([self._summary_op, self._accuracy], feed_dict=feed_dict)
    ValueError: Cannot feed value of shape (32,) for Tensor 'ph_input_labels:0', which has shape '(?, 1)'
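The two shapes in that message can be reproduced with plain numpy (illustrative values only): after dataset.batch(32), each scalar label becomes one entry in a rank-1 array of shape (32,), while a placeholder declared with shape=(None, 1) only accepts a rank-2 column.

```python
import numpy as np

# What iterator.get_next() yields for scalar labels after dataset.batch(32):
# 32 scalars stacked into a rank-1 array
labels = np.zeros(32, dtype=np.int64)
print(labels.shape)         # (32,)  <- what feed_dict receives

# What a placeholder declared as shape=(None, 1) expects: a rank-2 column
labels_column = labels.reshape(-1, 1)
print(labels_column.shape)  # (32, 1)
```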
Below is the block that calls the get_summary method and triggers the error:
def perform_train():
    if __name__ == '__main__':
        # Get all our image paths
        filenames = data_layer_train.get_image_paths()
        next_batch, iterator = preproc_image_fn(filenames=filenames)
        with tf.Session(config=tf.ConfigProto(gpu_options=gpu_options)) as sess:
            with sess.graph.as_default():
                # Set the random seed for tensorflow
                tf.set_random_seed(cfg.RNG_SEED)
                classifier_network = c_common.create_model(len(products_to_class_dict), is_training=True)
                optimizer, global_step_var = c_common.create_optimizer(classifier_network)
                sess.run(tf.local_variables_initializer())
                sess.run(tf.global_variables_initializer())
                # Init tables and dataset iterator
                sess.run(tf.tables_initializer())
                sess.run(iterator.initializer)
                cur_epoch = 0
                blobs = None
                try:
                    epoch_size = data_layer_train.get_steps_per_epoch()
                    num_steps = num_epochs * epoch_size
                    for step in range(num_steps):
                        timer_summary.tic()
                        if blobs is None:
                            # Now populate from our training dataset
                            blobs = sess.run(next_batch)
                        # *************** Below is where it is erroring *****************
                        summary_train, acc = classifier_network.get_summary(sess, blobs["images"], blobs["labels"], blobs["weights"])
                ...
I believe the error is in preproc_image_fn:
def preproc_image_fn(filenames, images=None, labels=None, image_paths=None, cells=None, weights=None):
    def _parse_fn(filename, label, weight):
        augment_instance = False
        paths = []
        selected_cells = []
        if vals.FIRST_ITER:
            # Perform our check of the path to see if _data_augmentation is within it
            # If so set augment_instance to true and replace the substring with an empty string
            new_filename = tf.regex_replace(filename, "_data_augmentation", "")
            contains = tf.equal(tf.size(tf.string_split([filename], "")), tf.size(tf.string_split([new_filename])))
            filename = new_filename
            if contains is True:
                augment_instance = True
        core_file = tf.string_split([filename], '\\').values[-1]
        product_id = tf.string_split([core_file], ".").values[0]
        label = search_tf_table_for_entry(product_id)
        weight = data_layer_train.get_weights(product_id)
        image_string = tf.read_file(filename)
        img = tf.image.decode_image(image_string, channels=data_layer_train._channels)
        img.set_shape([None, None, None])
        img = tf.image.resize_images(img, [data_layer_train._target_height, data_layer_train._target_width])
        # Previously I was returning the below, but I was getting an error from the op when assigning feed_dict stating that it didn't like the dictionary
        # retval = dict(zip([filename], [img])), label, weight
        retval = img, label, weight
        return retval

    num_files = len(filenames)
    filenames = tf.constant(filenames)
    # *********** Setup dataset below ************
    dataset = tf.data.Dataset.from_tensor_slices((filenames, labels, weights))
    dataset = dataset.map(_parse_fn)
    dataset = dataset.repeat()
    dataset = dataset.batch(32)
    iterator = dataset.make_initializable_iterator()
    batch_features, batch_labels, batch_weights = iterator.get_next()
    return {'images': batch_features, 'labels': batch_labels, 'weights': batch_weights}, iterator
def search_tf_table_for_entry(self, product_id):
    '''Looks up keys in the table and outputs the values. Will return -1 if not found '''
    if product_id is not None:
        return self._products_to_class_table.lookup(product_id)
    else:
        if not self._real_eval:
            logger().info("class not found in training {} ".format(product_id))
        return -1
Where I create the model and use the placeholders, as before:
...
def create_model(self):
    weights_regularizer = tf.contrib.layers.l2_regularizer(cfg.TRAIN.WEIGHT_DECAY)
    biases_regularizer = weights_regularizer

    # Input data.
    self._input_images = tf.placeholder(
        tf.float32, shape=(None, self._image_height, self._image_width, self._num_channels), name="ph_input_images")
    self._input_labels = tf.placeholder(tf.int64, shape=(None, 1), name="ph_input_labels")
    self._input_weights = tf.placeholder(tf.float32, shape=(None, 1), name="ph_input_weights")
    self._is_training = tf.placeholder(tf.bool, name='ph_is_training')
    self._keep_prob = tf.placeholder(tf.float32, name="ph_keep_prob")
    self._accuracy = tf.reduce_mean(tf.cast(self._correct_prediction, tf.float32))
    ...
    self.create_summaries()
def create_summaries(self):
    val_summaries = []
    with tf.device("/cpu:0"):
        for var in self._act_summaries:
            self._add_act_summary(var)
        for var in self._train_summaries:
            self._add_train_summary(var)
    self._summary_op = tf.summary.merge_all()
    self._summary_op_val = tf.summary.merge(val_summaries)

def get_summary(self, sess, images, labels, weights):
    feed_dict = {self._input_images: images, self._input_labels: labels,
                 self._input_weights: weights, self._is_training: False}
    summary, acc = sess.run([self._summary_op, self._accuracy], feed_dict=feed_dict)
    return summary, acc
Solution
Since the error says:
    Cannot feed value of shape (32,) for Tensor 'ph_input_labels:0', which has shape '(?, 1)'
I'm guessing the labels you pass into get_summary have shape [32]. Can you reshape them to (32, 1)? Or perhaps reshape the labels earlier, in _parse_fn?
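A minimal sketch of both options, using numpy purely for illustration. In the actual pipeline, the TensorFlow counterpart would be tf.expand_dims(label, -1) applied per example at the end of _parse_fn, or a reshape of the batched labels array just before building feed_dict.

```python
import numpy as np

labels = np.arange(32, dtype=np.int64)      # shape (32,), as dataset.batch(32) produces

# Option 1: reshape just before building feed_dict in get_summary
labels_2d = labels.reshape(-1, 1)           # shape (32, 1)

# Option 2: add a trailing axis; the TF analogue, tf.expand_dims(label, -1),
# could instead be applied per example inside _parse_fn
labels_2d_alt = np.expand_dims(labels, -1)  # shape (32, 1)

print(labels_2d.shape, labels_2d_alt.shape)
```

Either form matches the (?, 1) shape declared for the ph_input_labels placeholder.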