Why is the Google Colab TPU as slow as my computer?

Since I have a large dataset and my computer is not very powerful, I thought it would be a good idea to use a TPU on Google Colab.

So, this is my TPU configuration:

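import tensorflow as tf

# Detect the Colab TPU; TPUClusterResolver raises ValueError when none is attached.
try:
    tpu = tf.distribute.cluster_resolver.TPUClusterResolver()
    print('Running on TPU ', tpu.master())
except ValueError:
    tpu = None

if tpu:
    # Connect to the TPU workers and initialize them before creating the strategy.
    tf.config.experimental_connect_to_cluster(tpu)
    tf.tpu.experimental.initialize_tpu_system(tpu)
    strategy = tf.distribute.experimental.TPUStrategy(tpu)
else:
    # No TPU found: fall back to the default strategy.
    strategy = tf.distribute.get_strategy()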

print("REPLICAS: ", strategy.num_replicas_in_sync)

And this is my training:

hist = model.fit(train_dataset, epochs=10, verbose=1, steps_per_epoch=count_data_items(filenames)//64)

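Just creating a strategy is not enough. You have to use the strategy correctly.

You will probably need to tune your input pipeline, increase the batch size, and so on.

Take a look here: https://cloud.google.com/tpu/docs/performance-guide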
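As a minimal sketch of what "using the strategy" usually means with Keras: the model is created and compiled inside strategy.scope(), so that its variables are created under the strategy. The question does not show where model is built, so whether this applies there is an assumption. Here make_model is a hypothetical zero-argument function that returns a compiled model, and strategy is the object from the configuration code in the question.

```python
def build_under_strategy(strategy, make_model):
    """Build the model inside strategy.scope().

    strategy: any object exposing a scope() context manager, for example
        the TPUStrategy created in the question.
    make_model: hypothetical zero-argument function (not from the question)
        that creates and compiles the Keras model.
    """
    # Variables created inside this block are created under the strategy.
    with strategy.scope():
        return make_model()
```

It would be called once, before training, for example model = build_under_strategy(strategy, make_model), followed by the usual model.fit call.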

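Another important point is that a TPU has a warm-up period: it spends a lot of time building the computation graph on the first call (and on every call with a new input shape).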
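One way to keep the warm-up from distorting a speed comparison is to time several calls and leave the first one out of the average. The sketch below is plain Python and makes no assumptions about TensorFlow; time_calls and steady_state_mean are hypothetical helper names, and fn stands for whatever step you want to measure.

```python
import time

def time_calls(fn, n_calls):
    """Call fn() n_calls times; return each call's duration in seconds."""
    durations = []
    for _ in range(n_calls):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations

def steady_state_mean(durations, warmup=1):
    """Mean duration after discarding the first `warmup` measurements."""
    rest = durations[warmup:]
    if not rest:
        raise ValueError("need more measurements than warm-up calls")
    return sum(rest) / len(rest)
```

For instance, fn could wrap one short training run; comparing durations[0] with steady_state_mean(durations) shows how much of the total time is warm-up.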

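The number of TPU cores currently available in a Colab notebook is 8. Key takeaway: comparing training times shows that with a small batch size the TPU takes longer to train than a GPU, but as the batch size increases, TPU performance becomes comparable to the GPU.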
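To make the batch-size point concrete, here is a small arithmetic sketch. With a tf.distribute strategy, Keras treats the dataset's batch size as the global batch and splits it across the replicas. The 64 below is taken from the steps_per_epoch expression in the question, on the assumption that it is also the dataset's batch size; the 128 per replica and the item count are only illustrative values, not recommendations from the source.

```python
def per_replica_batch(global_batch, num_replicas):
    """Examples each replica (TPU core) processes per training step."""
    return global_batch // num_replicas

def global_batch_size(per_replica, num_replicas):
    """Global batch needed to give every replica `per_replica` examples."""
    return per_replica * num_replicas

def steps_per_epoch(num_items, global_batch):
    """Number of full global batches in one pass over the data."""
    return num_items // global_batch

# A global batch of 64 on 8 cores leaves each core only 8 examples per step.
small = per_replica_batch(64, 8)

# Illustrative alternative: 128 examples per core on 8 cores.
large = global_batch_size(128, 8)

# Steps for a hypothetical dataset of 100000 items at that global batch.
steps = steps_per_epoch(100000, large)
```

In the question's code, the number of replicas would come from strategy.num_replicas_in_sync, and the same global batch value would be used both when batching the dataset and when computing steps_per_epoch.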
