How to get gradients at two points using Keras/TensorFlow

Problem description

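I am trying to obtain the gradient of a function at two different values of its independent variable. I am doing this while modifying the get_updates method of a Keras optimizer class.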

The relevant part of the code I wrote is:

def get_updates(self, loss, params):
    grads = self.get_gradients(loss, params)
    self.updates = [K.update_add(self.iterations, 1)]

    # Copy parameters
    params2 = []
    for p in params:
        params2.append(K.variable(K.get_value(p), name=p.name[:-2] + "_cpy1/"))

    self.weights = [self.iterations]
    for p, p2, g in zip(params, params2, grads):
        v = - self.lr * g  
        new_p = p + 0.5*v # Intermediate/Partial step
        new_p2 = p2 + v   # Reference point for 2nd Gradient

    grads2 = self.get_gradients(loss, params2)
    ....

The error I get when running this code is:

  File "/home/me/Projects/RungeKutta/rk_optimizers2.py", line 86, in get_updates
    grads2 = self.get_gradients(loss, params2)
  File "/home/me/anaconda2/lib/python2.7/site-packages/keras/optimizers.py", line 91, in get_gradients
    raise ValueError('An operation has `None` for gradient. '
ValueError: An operation has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.

The part of Keras's optimizers.py that raises the error is:

def get_gradients(self, loss, params):
    grads = K.gradients(loss, params)
    if None in grads:
        raise ValueError('An operation has `None` for gradient. '
                         'Please make sure that all of your ops have a '
                         'gradient defined (i.e. are differentiable). '
                         'Common ops without gradient: '
                         'K.argmax, K.round, K.eval.')

In my case, K corresponds to TensorFlow, so the corresponding code for gradients is:

def gradients(loss, variables):
    """Returns the gradients of `loss` w.r.t. `variables`.

    # Arguments
        loss: Scalar tensor to minimize.
        variables: List of variables.

    # Returns
        A gradients tensor.
    """
    return tf.gradients(loss, variables, colocate_gradients_with_ops=True)

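Why am I getting this error? How can I obtain the gradient at the first projected set of updated parameters, so that I can compute the final update from the average of the gradient at the starting point and the gradient at the (initially) projected end point?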
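
For reference, the update described above (average the gradient at the starting point with the gradient at the point reached by one plain gradient step, which is the scheme usually called Heun's method) can be sketched in plain Python, outside of Keras. This is only an illustration of the arithmetic under stated assumptions: heun_step, grad_fn and quadratic_grad are hypothetical names, and the quadratic loss is an assumed toy example, not anything from the code above.

```python
def heun_step(params, grad_fn, lr):
    """One update using the average of two gradients.

    params:  list of floats (the current point)
    grad_fn: hypothetical callable returning the loss gradient at a point
    lr:      step size
    """
    # Gradient at the starting point.
    g1 = grad_fn(params)
    # Projected end point after one full plain gradient-descent step.
    projected = [p - lr * g for p, g in zip(params, g1)]
    # Gradient evaluated at the projected end point.
    g2 = grad_fn(projected)
    # The final update uses the average of the two gradients.
    return [p - lr * 0.5 * (a + b) for p, a, b in zip(params, g1, g2)]


# Assumed toy loss f(x, y) = x**2 + 2 * y**2, whose gradient is (2x, 4y).
def quadratic_grad(point):
    x, y = point
    return [2.0 * x, 4.0 * y]


start = [1.0, 1.0]
updated = heun_step(start, quadratic_grad, lr=0.1)
print(updated)
```

Note that grad_fn is re-evaluated at the projected point here. In a TensorFlow graph, tf.gradients(loss, variables) returns None for any variable that loss does not depend on, so freshly created copies such as params2 would only receive a gradient if the loss were built from them.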

Tags: python, tensorflow, keras

Solution

