Model inference runtime increases after repeated inference



I'm writing a TensorFlow project in which I manually edit each weight and bias, so I set up the weights and biases as dictionaries, old-TensorFlow style, rather than using tf.layers.dense and letting TensorFlow take care of updating the weights. (This is the cleanest approach I came up with, though it may not be ideal.)

I feed the same data to a fixed model on every iteration, yet the runtime keeps increasing over the course of the program's execution.

I stripped almost everything out of the code so I could see the problem in isolation, but I can't figure out what is causing the runtime to grow.

---Games took   2.6591222286224365 seconds ---
---Games took   3.290001153945923 seconds ---
---Games took   4.250034332275391 seconds ---
---Games took   5.190149307250977 seconds ---

Edit: I've managed to reduce the runtime by using a placeholder, which doesn't add extra nodes to the graph, but the runtime still increases, just at a slower rate. I'd like to eliminate this runtime growth. (It goes from 0.1 seconds to over 1 second after a while.)

Here is my entire code:

import numpy as np
import tensorflow as tf
import time

n_inputs = 9
n_class = 9
n_hidden_1 = 20
population_size = 10
weights = []
biases = []
game_steps = 20  # so we can see performance loss faster
# 2 games per individual
games_in_generation = population_size / 2

def generate_initial_population(my_population_size):
    my_weights = []
    my_biases = []
    for key in range(my_population_size):
        layer_weights = {
            'h1': tf.Variable(tf.truncated_normal([n_inputs, n_hidden_1], seed=key)),
            'out': tf.Variable(tf.truncated_normal([n_hidden_1, n_class], seed=key))
        }
        layer_biases = {
            'b1': tf.Variable(tf.truncated_normal([n_hidden_1], seed=key)),
            'out': tf.Variable(tf.truncated_normal([n_class], seed=key))
        }
        my_weights.append(layer_weights)
        my_biases.append(layer_biases)
    return my_weights, my_biases

weights, biases = generate_initial_population(population_size)
data = tf.placeholder(dtype=tf.float32)  # will add shape later

def model(x):
    out_layer = tf.add(tf.matmul([biases[1]['b1']], weights[1]['out']), biases[1]['out'])
    return out_layer

def play_game():
    model_input = [0] * 9
    model_out = model(data)
    for game_step in range(game_steps):
        move = sess.run(model_out, feed_dict={data: model_input})[0]

sess = tf.Session()
sess.run(tf.global_variables_initializer())
while True:
    start_time = time.time()
    for _ in range(int(games_in_generation)):
        play_game()
    print("---Games took   %s seconds ---" % (time.time() - start_time))

I'm adding another answer because the latest edit to the question changed things substantially. You are still seeing runtime growth because you are still calling model multiple times within one sess; you've only reduced how often nodes are added to the graph. What you need to do is create a new session for each model you build, and close each session when you're done with it. I've modified your code to do this, here:

import numpy as np
import tensorflow as tf
import time

n_inputs = 9
n_class = 9
n_hidden_1 = 20
population_size = 10
weights = []
biases = []
game_steps = 20  # so we can see performance loss faster
# 2 games per individual
games_in_generation = population_size / 2

def generate_initial_population(my_population_size):
    my_weights = []
    my_biases = []
    for key in range(my_population_size):
        layer_weights = {
            'h1': tf.Variable(tf.truncated_normal([n_inputs, n_hidden_1], seed=key)),
            'out': tf.Variable(tf.truncated_normal([n_hidden_1, n_class], seed=key))
        }
        layer_biases = {
            'b1': tf.Variable(tf.truncated_normal([n_hidden_1], seed=key)),
            'out': tf.Variable(tf.truncated_normal([n_class], seed=key))
        }
        my_weights.append(layer_weights)
        my_biases.append(layer_biases)
    return my_weights, my_biases

def model(x):
    out_layer = tf.add(tf.matmul([biases[1]['b1']], weights[1]['out']), biases[1]['out'])
    return out_layer

def play_game(sess):
    model_input = [0] * 9
    model_out = model(data)
    for game_step in range(game_steps):
        move = sess.run(model_out, feed_dict={data: model_input})[0]

while True:
    for _ in range(int(games_in_generation)):
        # Reset the graph.
        tf.reset_default_graph()
        weights, biases = generate_initial_population(population_size)
        data = tf.placeholder(dtype=tf.float32)  # will add shape later
        # Create session.
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            start_time = time.time()
            play_game(sess)
            print("---Games took   %s seconds ---" % (time.time() - start_time))
            sess.close()

What I've done here is wrap the call to play_game in a session defined in a with scope, and close that session with sess.close once play_game returns. I also reset the default graph. I've run this for a few hundred iterations and have seen no increase in runtime.

There are some strange things going on in this code, so giving you an answer that truly fixes the underlying problem would be tricky. I can, however, address the runtime growth you're observing. Below, I've modified your code to pull the input-pattern generation and the call to model out of the game loop.

import numpy as np
import tensorflow as tf
import time

n_inputs = 9
n_class = 9
n_hidden_1 = 20
population_size = 10
weights = []
biases = []
game_steps = 20  # so we can see performance loss faster
# 2 games per individual
games_in_generation = population_size / 2

def generate_initial_population(my_population_size):
    my_weights = []
    my_biases = []
    for key in range(my_population_size):
        layer_weights = {
            'h1': tf.Variable(tf.truncated_normal([n_inputs, n_hidden_1], seed=key)),
            'out': tf.Variable(tf.truncated_normal([n_hidden_1, n_class], seed=key))
        }
        layer_biases = {
            'b1': tf.Variable(tf.truncated_normal([n_hidden_1], seed=key)),
            'out': tf.Variable(tf.truncated_normal([n_class], seed=key))
        }
        my_weights.append(layer_weights)
        my_biases.append(layer_biases)
    return my_weights, my_biases

weights, biases = generate_initial_population(population_size)

def model(x):
    out_layer = tf.add(tf.matmul([biases[1]['b1']], weights[1]['out']), biases[1]['out'])
    return out_layer

# Extract input pattern generation and model construction from the game loop.
model_input = np.float32([[0] * 9])
model_out = model(model_input)

def play_game():
    for game_step in range(game_steps):
        start_time = time.time()
        move = sess.run(model_out)[0]
        # print("---Step took   %s seconds ---" % (time.time() - start_time))

sess = tf.Session()
sess.run(tf.global_variables_initializer())
for _ in range(5):
    start_time = time.time()
    for _ in range(int(games_in_generation)):
        play_game()
    print("---Games took   %s seconds ---" % (time.time() - start_time))

When run, this code should produce output like:

---Games took   0.42223644256591797 seconds ---
---Games took   0.13168787956237793 seconds ---
---Games took   0.2452383041381836 seconds ---
---Games took   0.20023465156555176 seconds ---
---Games took   0.19905781745910645 seconds ---

Obviously, this eliminates the runtime growth you observed. It also cuts the maximum observed runtime by an order of magnitude! This was happening because every time you called model, you were actually creating a pile of tf.Tensor objects and adding them to the graph. This misunderstanding is common, and it comes from trying to use tensors in imperative Python code as if they were Python variables. I recommend reviewing the graphs guide in full before moving on.
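The growth can be seen without TensorFlow at all. Here is a minimal pure-Python analogy (a hypothetical Graph class standing in for TensorFlow's default graph, not the real TF API): calling the model-building function inside a loop appends new nodes to the same global structure on every call, exactly the way each call to model adds fresh ops to the graph.

```python
# Analogy for TF 1.x graph growth (hypothetical Graph class, NOT the real
# TensorFlow API): each call to build_model() appends new nodes to the same
# global graph, just as each call to model() creates new tf.Tensor ops.

class Graph:
    def __init__(self):
        self.nodes = []

    def add_node(self, op):
        self.nodes.append(op)
        return len(self.nodes) - 1  # handle to the newly created node

default_graph = Graph()

def build_model():
    # Mirrors model(x): every call creates fresh ops in the global graph.
    default_graph.add_node("matmul")
    return default_graph.add_node("add")

sizes = []
for step in range(3):
    build_model()  # called inside the loop, as in the question's play_game
    sizes.append(len(default_graph.nodes))

print(sizes)  # the graph keeps growing: [2, 4, 6]
```

As the node list grows, every operation that touches the whole graph (serialization, session setup, node lookups) gets slower, which is the growth pattern in the timings above.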

It's also important to note that this is not the correct way to pass values to the graph in TensorFlow. I can see that you want to pass a different value to your model during each iteration of the game, but you can't accomplish that by passing a value to a Python function. You must create a tf.placeholder in your model graph and load the values you want the model to process onto that placeholder. There are many ways to do this, but you can find one example here. Hope that helps!
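The build-once/run-many pattern behind tf.placeholder can also be sketched in plain Python (an analogy, not TensorFlow code): construct the "graph" exactly once, then only let values flow through it afterwards, the way sess.run with a feed_dict does.

```python
# Build-once / run-many analogy (plain Python, standing in for
# tf.placeholder + sess.run; NOT the real TensorFlow API).

def build_model():
    # Construction phase: happens exactly once, like defining ops
    # against a placeholder before the game loop starts.
    weights = [[1, 0], [0, 1]]  # fixed 2x2 identity "weights" for the sketch

    def run(model_input):
        # Execution phase: like sess.run(model_out, feed_dict={data: x}) -
        # only values change, no new structure is created.
        return [sum(w * x for w, x in zip(row, model_input)) for row in weights]

    return run

model_out = build_model()  # built exactly once, outside the loop
moves = [model_out([step, 1]) for step in range(3)]
print(moves)  # [[0, 1], [1, 1], [2, 1]]
```

The key property is that the loop body calls only the execution phase; nothing is ever added to the model after construction, so per-iteration cost stays flat.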
