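tensorflow - tf optimizer compute_gradients: wrong dimension size
Problem description
I am trying to use DPGradientDescentGaussianOptimizer
to call compute_gradients.
Following the documentation, I compute sparse_softmax_cross_entropy_with_logits
and then pass it to tf.reduce_mean()
to get the loss.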
crx_entropy_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(
    labels=context[:, 1:], logits=output['logits'][:, :-1])
print('crx_entropy_loss', crx_entropy_loss.shape, crx_entropy_loss)
loss = tf.reduce_mean(crx_entropy_loss)
print('reduce_mean_loss', loss.shape, loss)
which prints:
crx_entropy_loss ->
shape: (1, ?) from Tensor("SparseSoftmaxCrossEntropyWithLogits/Reshape_2:0", shape=(1, ?), dtype=float32)
reduce_mean_loss ->
shape: () from Tensor("Mean:0", shape=(), dtype=float32)
However, when I call the optimizer's compute_gradients,
it fails with a dimension error.
opt = dp_optimizer.DPGradientDescentGaussianOptimizer(
    l2_norm_clip=l2_norm_clip,
    noise_multiplier=noise_multiplier,
    num_microbatches=batch_size,
    unroll_microbatches=True,
    ledger=ledger,
    learning_rate=learning_rate)
opt_grads = opt.compute_gradients(loss, train_vars)
Error:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/tensorflow_core/python/framework/ops.py", line 1607, in _create_c_op
c_op = c_api.TF_FinishOperation(op_desc)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Dimension size must be evenly divisible by 128 but is 1 for 'Reshape' (op: 'Reshape') with input shapes: [], [2] and with input tensors computed as partial shapes: input[1] = [128,?].
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "GenerateTextMy.py", line 80, in <module>
main()
File "GenerateTextMy.py", line 50, in main
gpt2.finetune(sess, folder, batch_size = 128, multi_gpu = True, sample_every = 10000,
File "/home/gpt_2_simple_dp/gpt_2.py", line 253, in finetune
opt_grads = opt.compute_gradients(loss, train_vars)
File "/usr/local/lib/python3.8/dist-packages/tensorflow_privacy/privacy/optimizers/dp_optimizer.py", line 134, in compute_gradients
microbatches_losses = tf.reshape(loss, [self._num_microbatches, -1])
File "/usr/local/lib/python3.8/dist-packages/tensorflow_core/python/ops/array_ops.py", line 131, in reshape
result = gen_array_ops.reshape(tensor, shape, name)
File "/usr/local/lib/python3.8/dist-packages/tensorflow_core/python/ops/gen_array_ops.py", line 8114, in reshape
_, _, _op = _op_def_lib._apply_op_helper(
File "/usr/local/lib/python3.8/dist-packages/tensorflow_core/python/framework/op_def_library.py", line 792, in _apply_op_helper
op = g.create_op(op_type_name, inputs, dtypes=None, name=scope,
File "/usr/local/lib/python3.8/dist-packages/tensorflow_core/python/util/deprecation.py", line 513, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/tensorflow_core/python/framework/ops.py", line 3356, in create_op
return self._create_op_internal(op_type, inputs, dtypes, input_types, name,
File "/usr/local/lib/python3.8/dist-packages/tensorflow_core/python/framework/ops.py", line 3418, in _create_op_internal
ret = Operation(
File "/usr/local/lib/python3.8/dist-packages/tensorflow_core/python/framework/ops.py", line 1769, in __init__
self._c_op = _create_c_op(self._graph, node_def, grouped_inputs,
File "/usr/local/lib/python3.8/dist-packages/tensorflow_core/python/framework/ops.py", line 1610, in _create_c_op
raise ValueError(str(e))
ValueError: Dimension size must be evenly divisible by 128 but is 1 for 'Reshape' (op: 'Reshape') with input shapes: [], [2] and with input tensors computed as partial shapes: input[1] = [128,?].
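I searched Google and read related StackOverflow threads, but I still don't really understand the problem. My code only runs when the batch size is 1. Could I get some advice or help?
Also, why does my loss have shape () or None, and why does that cause the error?
Versions: TensorFlow 1.15, Python 3.8.10
Thanks!
Solution
The documentation for DPGradientDescentGaussianOptimizer says: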
# Compute loss as a tensor. Do not call tf.reduce_mean as you
# would with a standard optimizer.
loss = tf.nn.sparse_softmax_cross_entropy_with_logits(
labels=labels, logits=logits)
train_op = opt.minimize(loss, global_step=global_step)
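The problem is that you are calling tf.reduce_mean. It collapses the per-example losses into a scalar of shape (), but compute_gradients reshapes the loss to [num_microbatches, -1] (dp_optimizer.py, line 134 in the traceback), and a single value cannot be split into 128 microbatches. That is also why the code only runs with a batch size of 1, where num_microbatches is 1. Passing the unreduced crx_entropy_loss to compute_gradients instead of the mean should avoid this reshape error.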
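
To see why a scalar loss trips the reshape shown in the traceback, here is a minimal pure-Python sketch of the microbatching step. It is not the library's code: `microbatch_losses` is a hypothetical helper that only imitates `tf.reshape(loss, [num_microbatches, -1])` followed by a per-row mean, using plain lists in place of tensors.

```python
def microbatch_losses(per_example_losses, num_microbatches):
    """Imitate reshape(loss, [num_microbatches, -1]) plus a mean per row.

    Hypothetical helper for illustration only; plain lists stand in for tensors.
    """
    n = len(per_example_losses)
    if n % num_microbatches != 0:
        # Same condition the Reshape op complains about in the traceback.
        raise ValueError(
            f"{n} loss value(s) cannot be split into {num_microbatches} microbatches")
    size = n // num_microbatches
    return [
        sum(per_example_losses[i * size:(i + 1) * size]) / size
        for i in range(num_microbatches)
    ]


# One loss per example (batch_size = 4): the split works.
vector_loss = [0.5, 1.5, 2.0, 4.0]
print(microbatch_losses(vector_loss, 4))  # each example is its own microbatch
print(microbatch_losses(vector_loss, 2))  # two examples averaged per microbatch

# After a full mean only one value is left, so it cannot be split 4 ways.
scalar_loss = [sum(vector_loss) / len(vector_loss)]
try:
    microbatch_losses(scalar_loss, 4)
except ValueError as err:
    print("fails like the traceback:", err)

# With a single microbatch the scalar passes, matching the batch-size-1 case.
print(microbatch_losses(scalar_loss, 1))
```

The sketch only models the shape check. In the real optimizer each microbatch loss is then differentiated, clipped and noised, which is why it needs the losses kept separate in the first place.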