TensorFlow: pinning variables to the CPU in multi-GPU training does not work

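I am training my first multi-GPU model with TensorFlow. As the tutorial describes, the variables are pinned to the CPU, and the ops on each GPU are grouped under a name_scope.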

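When I run a small test and log device placement, I can see that the ops are placed on their respective GPUs with the TOWER_1/TOWER_0 prefix, but the variables are not placed on the CPU.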

Am I missing something, or am I misreading the device placement log?

The test code is attached, and here is the device placement log.

Thanks.

Test code
with tf.device('cpu:0'):
    imgPath=tf.placeholder(tf.string)
    imageString=tf.read_file(imgPath)
    imageJpeg=tf.image.decode_jpeg(imageString, channels=3)
    inputImage=tf.image.resize_images(imageJpeg, [299,299])
    inputs = tf.expand_dims(inputImage, 0)
    for i in range(2):
        with tf.device('/gpu:%d' % i):
            with tf.name_scope('%s_%d' % ('TOWER', i)) as scope:
                with slim.arg_scope([tf.contrib.framework.python.ops.variables.variable], device='/cpu:0'):
                    with slim.arg_scope(inception_v3.inception_v3_arg_scope()):
                        logits,endpoints = inception_v3.inception_v3(inputs, num_classes=1001, is_training=False)
                tf.get_variable_scope().reuse_variables()

with tf.Session(config=tf.ConfigProto(allow_soft_placement=True,log_device_placement=True)) as sess:
    tf.initialize_all_variables().run()
exit(0)
Edit: Basically, the line with slim.arg_scope([tf.contrib.framework.python.ops.variables.variable], device='/cpu:0'): should force all variables onto the CPU, but they are created on 'gpu:0'.

Try with:

with slim.arg_scope([slim.model_variable, slim.variable], device='/cpu:0'):

Taken from: model_deploy
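One possible explanation for why scoping only variable has no effect, while adding slim.model_variable does, is sketched below in plain Python. This is a toy model of an arg_scope-style mechanism, not slim's actual code. It rests on the assumption that model_variable forwards its own device argument explicitly to variable, so an unset device (None) overrides any default scoped on variable alone. All names here (add_arg_scope, arg_scope, variable, model_variable, ENCLOSING_DEVICE, placement) are hypothetical stand-ins written for illustration.

```python
import contextlib
import functools

# Toy stand-in for an arg_scope mechanism: defaults are stored per function.
_SCOPE_STACK = [{}]

# Stands in for the surrounding tf.device('/gpu:%d' % i) block.
ENCLOSING_DEVICE = '/gpu:0'


def add_arg_scope(func):
    """Let `func` pick up keyword defaults from the active scope."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        merged = dict(_SCOPE_STACK[-1].get(wrapper, {}))
        merged.update(kwargs)  # explicit keyword arguments win over scoped defaults
        return func(*args, **merged)
    return wrapper


@contextlib.contextmanager
def arg_scope(funcs, **defaults):
    """Register `defaults` for each function in `funcs` inside the block."""
    new_scope = {f: dict(d) for f, d in _SCOPE_STACK[-1].items()}
    for f in funcs:
        new_scope.setdefault(f, {}).update(defaults)
    _SCOPE_STACK.append(new_scope)
    try:
        yield
    finally:
        _SCOPE_STACK.pop()


@add_arg_scope
def variable(name, device=None):
    # Returns the device this toy variable would end up on.
    return device if device is not None else ENCLOSING_DEVICE


@add_arg_scope
def model_variable(name, device=None):
    # Assumption: device is forwarded explicitly, even when it is None,
    # which overrides a default scoped only on `variable`.
    return variable(name, device=device)


def placement(scoped_funcs):
    with arg_scope(scoped_funcs, device='/cpu:0'):
        return model_variable('weights')


only_variable = placement([variable])
both = placement([model_variable, variable])
print(only_variable)  # /gpu:0
print(both)           # /cpu:0
```

If this assumption holds for the installed slim version, the layers create their weights through model_variable, which is why the scope in the answer lists both functions.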
