zip object passed to opt.apply_gradients



There is a program containing an optimizer function with the following code segment that computes the gradients:

if hypes['clip_norm'] > 0:
    grads, tvars = zip(*grads_and_vars)
    clip_norm = hypes["clip_norm"]
    clipped_grads, norm = tf.clip_by_global_norm(grads, clip_norm)
    grads_and_vars = zip(clipped_grads, tvars)
    print('grads_and_vars ', grads_and_vars)
train_op = opt.apply_gradients(grads_and_vars, global_step=global_step)
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_op = opt.apply_gradients(grads_and_vars,
                                   global_step=global_step)

However, running the program raises the following error:

File "/home/FCN/kittiseg/hypes/../optimizer/generic_optimizer.py", line 92, in training
    train_op = opt.apply_gradients(grads_and_vars, global_step=global_step)
  File "tensorflow/tf_0.12/lib/python3.4/site-packages/tensorflow/python/training/optimizer.py", line 370, in apply_gradients
    raise ValueError("No variables provided.")
ValueError: No variables provided.

I dug into the code and believe the problem is caused by the variable grads_and_vars. When I print it, I only get grads_and_vars <zip object at 0x2b0d6c27e348>, and I don't know how to inspect it or why it makes train_op = opt.apply_gradients(grads_and_vars, global_step=global_step) fail.
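As background (not stated in the original post, but standard Python 3 behaviour): zip() returns a lazy, one-shot iterator, not a list. Printing it only shows <zip object at ...>, and once something has iterated over it, a second pass sees nothing, which is exactly what "No variables provided" hints at. A minimal sketch, with no TensorFlow involved:

# Stand-ins for gradient tensors and trainable variables.
grads = ['g0', 'g1']
tvars = ['v0', 'v1']

grads_and_vars = zip(grads, tvars)
print(grads_and_vars)          # <zip object at 0x...>, not the pairs themselves

print(list(grads_and_vars))    # [('g0', 'v0'), ('g1', 'v1')] -- this consumes the iterator
print(list(grads_and_vars))    # [] -- a second pass yields nothing

# If apply_gradients() (or any other code) iterates over the zip object first,
# a later apply_gradients() call receives an empty sequence and raises
# ValueError: No variables provided.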

Here is the original training function:

def training(hypes, loss, global_step, learning_rate, opt=None):
    """Sets up the training Ops.
    Creates a summarizer to track the loss over time in TensorBoard.
    Creates an optimizer and applies the gradients to all trainable variables.
    The Op returned by this function is what must be passed to the
    `sess.run()` call to cause the model to train.
    Args:
      loss: Loss tensor, from loss().
      global_step: Integer Variable counting the number of training steps
        processed.
      learning_rate: The learning rate to use for gradient descent.
    Returns:
      train_op: The Op for training.
    """
    # Add a scalar summary for the snapshot loss.
    sol = hypes["solver"]
    hypes['tensors'] = {}
    hypes['tensors']['global_step'] = global_step
    total_loss = loss['total_loss']
    with tf.name_scope('training'):
        if opt is None:
            if sol['opt'] == 'RMS':
                opt = tf.train.RMSPropOptimizer(learning_rate=learning_rate,
                                                decay=0.9,
                                                epsilon=sol['epsilon'])
            elif sol['opt'] == 'Adam':
                opt = tf.train.AdamOptimizer(learning_rate=learning_rate,
                                             epsilon=sol['adam_eps'])
            elif sol['opt'] == 'SGD':
                lr = learning_rate
                opt = tf.train.GradientDescentOptimizer(learning_rate=lr)
            else:
                raise ValueError('Unrecognized opt type')
        hypes['opt'] = opt
        grads_and_vars = opt.compute_gradients(total_loss)
        if hypes['clip_norm'] > 0:
            grads, tvars = zip(*grads_and_vars)
            clip_norm = hypes["clip_norm"]
            clipped_grads, norm = tf.clip_by_global_norm(grads, clip_norm)
            grads_and_vars = zip(clipped_grads, tvars)
        train_op = opt.apply_gradients(grads_and_vars, global_step=global_step)
        update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
        with tf.control_dependencies(update_ops):
            train_op = opt.apply_gradients(grads_and_vars,
                                           global_step=global_step)
    return train_op

The error seems to be in the gradient clipping part. I ran into the same problem, did some research on how to do this correctly (see the source below), and it now seems to work.

Replace the section

grads, tvars = zip(*grads_and_vars)
clip_norm = hypes["clip_norm"]
clipped_grads, norm = tf.clip_by_global_norm(grads, clip_norm)
grads_and_vars = zip(clipped_grads, tvars)

with

clip_norm = hypes["clip_norm"]
grads_and_vars = [(tf.clip_by_value(grad, -clip_norm, clip_norm), var)
                  for grad, var in grads_and_vars]

and it should work.

Source: How to effectively apply gradient clipping in TensorFlow?

Note, however, that tf.clip_by_value affects the gradient values differently than tf.clip_by_global_norm.

tf.clip_by_value clips each gradient value independently into the clip range, whereas tf.clip_by_global_norm computes the global norm over all the gradient values and, if it exceeds the limit, rescales every value so that the global norm fits within the clip range, preserving the ratios between the individual gradient values.

To illustrate the difference between the two functions, suppose we have

original gradients = [2.0, 1.0, 2.0]

tf.clip_by_value(gradients, -1.0, 1.0) clips each value independently, giving [1.0, 1.0, 1.0].

tf.clip_by_global_norm(gradients, 1.0) first computes the global norm sqrt(4 + 1 + 4) = 3.0, then rescales every value by clip_norm / global_norm = 1/3, giving roughly [0.67, 0.33, 0.67], so the ratios between the values are preserved.
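A minimal sketch, assuming TensorFlow 1.x graph mode (as in the question), that evaluates both functions on the example above:

import tensorflow as tf

gradients = [tf.constant(2.0), tf.constant(1.0), tf.constant(2.0)]

# Element-wise clipping: each value is clipped into [-1.0, 1.0] on its own.
by_value = [tf.clip_by_value(g, -1.0, 1.0) for g in gradients]

# Global-norm clipping: every value is rescaled by clip_norm / global_norm.
by_global_norm, global_norm = tf.clip_by_global_norm(gradients, clip_norm=1.0)

with tf.Session() as sess:
    print(sess.run(by_value))        # [1.0, 1.0, 1.0]
    print(sess.run(global_norm))     # 3.0
    print(sess.run(by_global_norm))  # [0.666..., 0.333..., 0.666...]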

To answer the original question: what worked for me was converting the zip object to a list, like this:

grads, tvars = zip(*grads_and_vars)
(clipped_grads, _) = tf.clip_by_global_norm(grads, clip_norm=1.0)
grads_and_vars = list(zip(clipped_grads, tvars))
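
For completeness, a sketch (my own, not from the original answer) of how the clipping block inside training() could look with this fix, keeping only the apply_gradients call under control_dependencies; the duplicate call in the posted function is what exhausts the zip iterator in the first place:

grads_and_vars = opt.compute_gradients(total_loss)
if hypes['clip_norm'] > 0:
    grads, tvars = zip(*grads_and_vars)
    clip_norm = hypes["clip_norm"]
    clipped_grads, norm = tf.clip_by_global_norm(grads, clip_norm)
    # list(...) makes grads_and_vars safe to iterate more than once.
    grads_and_vars = list(zip(clipped_grads, tvars))
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_op = opt.apply_gradients(grads_and_vars,
                                   global_step=global_step)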
