tf optimizer compute_gradients: wrong dimension size

Problem description

I am trying to use DPGradientDescentGaussianOptimizer and call its compute_gradients method.

Following the documentation, I use sparse_softmax_cross_entropy_with_logits and then tf.reduce_mean() to compute the loss.

    crx_entropy_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(
                        labels=context[:, 1:], logits=output['logits'][:, :-1])
    print('crx_entropy_loss', crx_entropy_loss.shape, crx_entropy_loss)
    loss = tf.reduce_mean(crx_entropy_loss)
    print('reduce_mean_loss', loss.shape, loss)

which prints:

crx_entropy_loss -> 
shape: (1, ?) from Tensor("SparseSoftmaxCrossEntropyWithLogits/Reshape_2:0", shape=(1, ?), dtype=float32)

reduce_mean_loss -> 
shape: () from Tensor("Mean:0", shape=(), dtype=float32)

However, when I try to compute the gradients with the optimizer's compute_gradients, I get a dimension size error.

opt = dp_optimizer.DPGradientDescentGaussianOptimizer(
    l2_norm_clip=l2_norm_clip,
    noise_multiplier=noise_multiplier,
    num_microbatches=batch_size,
    unroll_microbatches=True,
    ledger=ledger,
    learning_rate=learning_rate)

opt_grads = opt.compute_gradients(loss, train_vars)

Error:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/tensorflow_core/python/framework/ops.py", line 1607, in _create_c_op
    c_op = c_api.TF_FinishOperation(op_desc)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Dimension size must be evenly divisible by 128 but is 1 for 'Reshape' (op: 'Reshape') with input shapes: [], [2] and with input tensors computed as partial shapes: input[1] = [128,?].

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "GenerateTextMy.py", line 80, in <module>
    main()
  File "GenerateTextMy.py", line 50, in main
    gpt2.finetune(sess, folder, batch_size = 128, multi_gpu = True, sample_every = 10000,
  File "/home/gpt_2_simple_dp/gpt_2.py", line 253, in finetune
    opt_grads = opt.compute_gradients(loss, train_vars)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow_privacy/privacy/optimizers/dp_optimizer.py", line 134, in compute_gradients
    microbatches_losses = tf.reshape(loss, [self._num_microbatches, -1])
  File "/usr/local/lib/python3.8/dist-packages/tensorflow_core/python/ops/array_ops.py", line 131, in reshape
    result = gen_array_ops.reshape(tensor, shape, name)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow_core/python/ops/gen_array_ops.py", line 8114, in reshape
    _, _, _op = _op_def_lib._apply_op_helper(
  File "/usr/local/lib/python3.8/dist-packages/tensorflow_core/python/framework/op_def_library.py", line 792, in _apply_op_helper
    op = g.create_op(op_type_name, inputs, dtypes=None, name=scope,
  File "/usr/local/lib/python3.8/dist-packages/tensorflow_core/python/util/deprecation.py", line 513, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow_core/python/framework/ops.py", line 3356, in create_op
    return self._create_op_internal(op_type, inputs, dtypes, input_types, name,
  File "/usr/local/lib/python3.8/dist-packages/tensorflow_core/python/framework/ops.py", line 3418, in _create_op_internal
    ret = Operation(
  File "/usr/local/lib/python3.8/dist-packages/tensorflow_core/python/framework/ops.py", line 1769, in __init__
    self._c_op = _create_c_op(self._graph, node_def, grouped_inputs,
  File "/usr/local/lib/python3.8/dist-packages/tensorflow_core/python/framework/ops.py", line 1610, in _create_c_op
    raise ValueError(str(e))
ValueError: Dimension size must be evenly divisible by 128 but is 1 for 'Reshape' (op: 'Reshape') with input shapes: [], [2] and with input tensors computed as partial shapes: input[1] = [128,?].

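I have searched Google and read related StackOverflow threads, but I still do not really understand the problem. My code only runs when the batch size is 1. Could I ask for some advice or help?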

Also, why does a loss with shape () (or None) lead to this error?

Versions: TensorFlow 1.15, Python 3.8.10

Thanks!

Tags: tensorflow, reshape

Solution


The TensorFlow Privacy documentation for DPGradientDescentGaussianOptimizer says:

# Compute loss as a tensor. Do not call tf.reduce_mean as you
# would with a standard optimizer.
loss = tf.nn.sparse_softmax_cross_entropy_with_logits(
    labels=labels, logits=logits)

train_op = opt.minimize(loss, global_step=global_step)

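The problem is that you are calling tf.reduce_mean. It collapses the loss to a scalar of shape (), and the traceback shows the optimizer then running tf.reshape(loss, [self._num_microbatches, -1]), which cannot split a single value into 128 microbatches. That is also why the code only runs with a batch size of 1. Pass the unreduced loss tensor to compute_gradients instead.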
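The shape arithmetic can be sketched in plain Python, without TensorFlow. This is only an illustration under assumptions: `resolve_reshape` is a hypothetical helper written for this answer, not a TensorFlow or TensorFlow Privacy function, and it mimics nothing more than the element-count check behind a reshape with one inferred (-1) dimension. The sequence length 1023 is likewise an assumed value.

```python
from math import prod


def resolve_reshape(input_shape, target_shape):
    """Return the concrete shape a reshape would produce, or raise ValueError.

    Hypothetical helper for illustration only: it mirrors the element-count
    check of a reshape with one inferred (-1) dimension, nothing more.
    """
    target = list(target_shape)
    if target.count(-1) > 1:
        raise ValueError("only one dimension may be -1")
    num_elements = prod(input_shape)  # prod(()) == 1: a scalar holds one element
    known = prod(d for d in target if d != -1)
    if -1 in target:
        if known == 0 or num_elements % known != 0:
            raise ValueError(
                f"{num_elements} element(s) cannot be split evenly into {known}")
        return tuple(num_elements // known if d == -1 else d for d in target)
    if known != num_elements:
        raise ValueError(f"{num_elements} element(s) do not fit shape {target}")
    return tuple(target)


num_microbatches = 128  # batch_size from the question

# Per-example loss vector, one entry per example: the reshape succeeds.
print(resolve_reshape((128,), [num_microbatches, -1]))       # (128, 1)

# Per-token loss matrix (1023 is an assumed sequence length): also fine.
print(resolve_reshape((128, 1023), [num_microbatches, -1]))  # (128, 1023)

# Scalar from tf.reduce_mean, shape (): one element, not divisible by 128.
try:
    resolve_reshape((), [num_microbatches, -1])
except ValueError as err:
    print("ValueError:", err)
```

Read this way, the unreduced crx_entropy_loss should reshape cleanly as long as its batch dimension is divisible by num_microbatches, while any scalar loss will fail for num_microbatches greater than 1.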

