Where do TensorFlow ops run? On the CPU or the GPU?



Consider the following code:

import tensorflow as tf

with tf.device('/gpu:0'):
    with tf.device('/cpu:0'):
        x = tf.constant(0, name='x')
        x = x * 2
    y = x + 2

config = tf.ConfigProto(log_device_placement=True)
with tf.Session(config=config) as sess:
    sess.run(y)

When you run it, you get the following output:

mul: (Mul): /job:localhost/replica:0/task:0/cpu:0
2017-08-11 21:38:23.953846: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\common_runtime\simple_placer.cc:847] mul: (Mul)/job:localhost/replica:0/task:0/cpu:0
add: (Add): /job:localhost/replica:0/task:0/gpu:0
2017-08-11 21:38:23.954846: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\common_runtime\simple_placer.cc:847] add: (Add)/job:localhost/replica:0/task:0/gpu:0
add/y: (Const): /job:localhost/replica:0/task:0/gpu:0
2017-08-11 21:38:23.954846: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\common_runtime\simple_placer.cc:847] add/y: (Const)/job:localhost/replica:0/task:0/gpu:0
mul/y: (Const): /job:localhost/replica:0/task:0/cpu:0
2017-08-11 21:38:23.954846: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\common_runtime\simple_placer.cc:847] mul/y: (Const)/job:localhost/replica:0/task:0/cpu:0
x: (Const): /job:localhost/replica:0/task:0/cpu:0
2017-08-11 21:38:23.954846: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\common_runtime\simple_placer.cc:847] x: (Const)/job:localhost/replica:0/task:0/cpu:0

This means mul runs on the CPU and add runs on the GPU. So I concluded that the device scope in which an op or tensor is defined determines where it runs.

But when I looked at the Inception model code, I got confused:

with slim.arg_scope([slim.variables.variable], device='/cpu:0'):
    # Calculate the loss for one tower of the ImageNet model. This
    # function constructs the entire ImageNet model but shares the
    # variables across all towers.
    loss = _tower_loss(images_splits[i], labels_splits[i], num_classes,
                       scope, reuse_variables)

Here _tower_loss is called inside a /cpu:0 scope, which, by my conclusion above, means every GPU tower's loss would actually run on the CPU. But I thought each GPU was supposed to run its own replica of the model on the GPU. Am I misunderstanding something?

A parent device assignment is overridden by a child device assignment.
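A minimal sketch of that rule, written against the TF 1.x-style graph API (via tf.compat.v1 so it runs on a current install); no GPU is needed just to build the graph and inspect the placement each op records:

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

with tf.device('/cpu:0'):            # parent assignment
    with tf.device('/gpu:0'):        # child assignment overrides the parent
        a = tf.constant(1, name='a')
    b = tf.add(a, 1, name='b')       # back under the parent scope only

print(a.device)  # GPU placement, from the inner (child) scope
print(b.device)  # CPU placement, from the outer (parent) scope
```

The innermost tf.device wins for any op created inside it; once the child scope exits, ops fall back to the parent's device.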

In the code below, the function _tower_loss contains another device assignment, /gpu:i, inside it (you can see this if you look at its implementation). The loss is computed on the GPU; only gathering and averaging the losses happens on the CPU.

loss = _tower_loss(images_splits[i], labels_splits[i], num_classes,
                   scope, reuse_variables)
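To make the override concrete, here is a hypothetical sketch of the tower pattern (the name tower_loss and the toy loss are illustrative stand-ins, not the actual Inception code): the outer /cpu:0 scope covers everything the caller does, but each tower pins its own computation to its GPU from inside the function.

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

def tower_loss(i):
    # Hypothetical stand-in for _tower_loss: inside the function, a
    # child device assignment pins this tower's compute to GPU i,
    # overriding the caller's /cpu:0 scope.
    with tf.device('/gpu:%d' % i):
        x = tf.constant(float(i + 1), name='tower_%d_x' % i)
        return tf.square(x, name='tower_%d_loss' % i)

with tf.device('/cpu:0'):
    losses = [tower_loss(i) for i in range(2)]
    # Gathering and averaging the per-tower losses stays on the CPU.
    avg_loss = tf.reduce_mean(tf.stack(losses), name='avg_loss')

print([l.device for l in losses])  # per-tower GPU placements
print(avg_loss.device)             # CPU placement from the outer scope
```

So the Inception snippet in the question is consistent with the earlier conclusion: the per-tower compute is defined (and runs) under a GPU scope, even though the call site sits in a CPU scope.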
