How to implement dynamic augmentation in TensorFlow?

I want to implement augmentation for a 3D dataset used to train a TensorFlow model.

The augmentation function looks like this:

def augmentation(img, label):
    p = .5
    print('augmentation')

    if random.random() > p:
        img = tf.numpy_function(augment_noise, [img], tf.double)

    if random.random() > p:
        img = tf.numpy_function(flip_x, [img], tf.double)

    if random.random() > p:
        img = tf.numpy_function(augment_scale, [img], tf.double)

    if random.random() > p:
        img = tf.numpy_function(distort_elastic_cv2, [img], tf.double)

    img = tf.image.convert_image_dtype(img, tf.float32)

    return img, label

The augmentation functions themselves are not implemented as TensorFlow ops; they are plain Python/NumPy functions, which is why each one is wrapped in tf.numpy_function.
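(For reference, a representative sketch of what one of these helpers might look like; this is hypothetical, since the question does not show the real implementations:)

import numpy as np

def flip_x(img):
    # Hypothetical stand-in for the real helper: flip the 3D volume along
    # its first axis. tf.numpy_function passes in and expects back NumPy
    # arrays, and the returned dtype must match the tf.double declared at
    # the call site.
    return np.flip(img, axis=0).astype(np.float64)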

The TensorFlow code that uses this function is as follows:

ds_train = tf.data.Dataset.from_tensor_slices((image_train, label_train))
ds_valid = tf.data.Dataset.from_tensor_slices((image_val, label_val))

batch_size = 16
repeat_count = int((1000 * batch_size) / len(image_train))
# AUTOTUNE = tf.data.experimental.AUTOTUNE  # tf.data.AUTOTUNE
AUTOTUNE = 16

# Augment on the fly during training.
ds_train = (
    ds_train.shuffle(len(ds_train)).repeat(repeat_count)
    .map(augmentation, num_parallel_calls=AUTOTUNE)
    .batch(batch_size)
    .prefetch(buffer_size=AUTOTUNE)
)

ds_valid = (
    ds_valid.batch(batch_size)
    .prefetch(buffer_size=AUTOTUNE)
)

initial_epoch = 0
epochs = 1000
H = model.fit(ds_train, validation_data=ds_valid, initial_epoch=initial_epoch,
              epochs=epochs,
              callbacks=chkpts, use_multiprocessing=False, workers=1, verbose=2)

In each epoch I want to draw roughly 1000 random batches from the dataset, so I compute repeat_count so that the repeated dataset yields 1000 batches of size batch_size.
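For example, with a hypothetical training set of 4,000 images:

# Hypothetical numbers: len(image_train) = 4000, batch_size = 16
repeat_count = int((1000 * 16) / 4000)  # = 4
# The repeated dataset holds 4 * 4000 = 16000 images,
# i.e. 16000 / 16 = 1000 batches per epoch.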

The problem is that I cannot tell whether the model calls the augmentation function in every epoch and applies it to every image of every batch (that is, 16 * 1000 = 16,000 images per epoch). I therefore added the print to the augmentation function, but it prints only once, not once per epoch or once per image. Is the augmentation function called 16 * 1000 times in every epoch?

Also, CPU and GPU utilization differ every time I run the code. Sometimes CPU utilization is around 25% with the GPU at 30%, but in almost every run it is 100% CPU and 5% GPU.

How can I fix these two problems?

Your string is printed once because the function is called once to build the TensorFlow graph. If you print with tf.print instead, the print op becomes part of the graph, so it prints every time.

Copy/paste:

import tensorflow as tf
import matplotlib.pyplot as plt
from sklearn.datasets import load_sample_image
import numpy as np
import random

imgs = np.stack([load_sample_image('flower.jpg') for i in range(4 * 4)], axis=0)

def augmentation(img):
    p = .5
    tf.print('augmentation successful!')
    img = tf.image.convert_image_dtype(img, tf.float32)
    return img

ds_train = tf.data.Dataset.from_tensor_slices(imgs)

batch_size = 16
repeat_count = 10
AUTOTUNE = 16
ds_train = (
    ds_train.shuffle(len(ds_train)).repeat(repeat_count)
    .map(augmentation, num_parallel_calls=AUTOTUNE)
    .batch(batch_size)
    .prefetch(buffer_size=AUTOTUNE)
)

for i in ds_train:
    pass
Output:

augmentation successful!
augmentation successful!
augmentation successful!
augmentation successful!
augmentation successful!
augmentation successful!
augmentation successful!
augmentation successful!
augmentation successful!
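The same trace-once behavior applies to random.random() in the question's augmentation function: it is evaluated a single time while the graph is traced, so each if branch is permanently compiled in or out of the pipeline. For a fresh coin flip per image, the randomness has to be a graph op, e.g. tf.random.uniform combined with tf.cond. A minimal sketch, assuming the question's NumPy helpers exist (only flip_x shown):

def augmentation(img, label):
    p = .5
    # Cast up front so both tf.cond branches return the same dtype.
    img = tf.cast(img, tf.double)
    img = tf.cond(
        tf.random.uniform([]) > p,                            # drawn at run time, per element
        lambda: tf.numpy_function(flip_x, [img], tf.double),  # apply the NumPy augmentation
        lambda: img,                                          # or pass the image through unchanged
    )
    # Note: tf.numpy_function discards static shape information; if later
    # ops need a known shape, restore it here with img.set_shape(...).
    img = tf.image.convert_image_dtype(img, tf.float32)
    return img, label

Repeating this pattern for the other three helpers gives a per-image random decision in every epoch, which you can verify with tf.print exactly as above.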
