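My question: during preprocessing, I want to use the tf.data.Dataset and tf.function APIs to randomly pick one function from a set and apply it to a dataset example.
Specifically, my data are 3D volumes, and I want to apply a rotation chosen from a set of 24 predefined rotation functions. I want to write this code inside a tf.function, which rules out packages such as numpy and plain list indexing.
For example, I would like to do something like this: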
import tensorflow as tf

@tf.function
def func1(tensor):
    # Apply some rotation here
    ...

@tf.function
def func2(tensor):
    ...

...

@tf.function
def func24(tensor):
    ...

@tf.function
def apply(tensor):
    list_of_funcs = [func1, func2, ..., func24]
    # Randomly sample an index from 0-23 (maxval is exclusive for integers)
    a = tf.random.uniform([1], minval=0, maxval=24, dtype=tf.int32)
    # Fails: a Python list cannot be indexed with a Tensor
    return list_of_funcs[a](tensor)
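However, I cannot index list_of_funcs this way, because it raises TypeError: list indices must be integers or slices, not Tensor. Also, as far as I know, I cannot collect these functions into a tf.Tensor and use tf.gather.
So my question is: how can I sample from these functions cleanly and sensibly inside a tf.function?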
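You can use a chain of nested tf.cond calls. tf.cond calls true_fn if the condition holds and false_fn otherwise. Since you have more than two functions, you can nest the calls for as many functions as you need. For example, the code below multiplies the input by 2, 3, 4, or 5, depending on the value of a random variable.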
import tensorflow as tf

x = 10

@tf.function
def mult_2():
    tf.print(f'i was 2, returning {x} multiplied by 2')
    return tf.multiply(x, 2)

@tf.function
def mult_3():
    tf.print(f'i was 3, returning {x} multiplied by 3')
    return tf.multiply(x, 3)

@tf.function
def mult_4():
    tf.print(f'i was 4, returning {x} multiplied by 4')
    return tf.multiply(x, 4)

@tf.function
def mult_5():
    tf.print(f'i was 5, returning {x} multiplied by 5')
    return tf.multiply(x, 5)

# Sample i from {2, 3, 4, 5}; maxval is exclusive for integer dtypes.
i = tf.random.uniform((), minval=2, maxval=6, dtype=tf.int32)

# Each false branch wraps the next tf.cond; mult_5 is the final fallback.
tf.cond(i == 2, mult_2,
        lambda: tf.cond(i == 3, mult_3,
                        lambda: tf.cond(i == 4, mult_4, mult_5)))
i was 3, returning 10 multiplied by 3
<tf.Tensor: shape=(), dtype=int32, numpy=30>
Note that mult_5 runs if none of the conditions is met.
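With 24 functions, as in the question, writing the nesting by hand gets unwieldy. As a minimal sketch, the same chain could be built by a small recursive helper. The names chain_conds and make_branches are hypothetical and made up for this illustration (they are not TensorFlow APIs), and the multiply branches are toy stand-ins for the rotation functions.

```python
import tensorflow as tf


def chain_conds(index, branch_fns):
    """Nest tf.cond calls so that branch_fns[k] runs when index == k.

    The last entry acts as the fallback when no earlier comparison matches.
    """
    if len(branch_fns) == 1:
        return branch_fns[0]()
    return tf.cond(
        tf.equal(index, 0),
        branch_fns[0],
        # Shift the index down by one and recurse on the remaining branches.
        lambda: chain_conds(index - 1, branch_fns[1:]),
    )


def make_branches(tensor):
    # Toy stand-ins for the rotation functions: multiply by 2, 3, 4, or 5.
    return [
        lambda: tensor * 2,
        lambda: tensor * 3,
        lambda: tensor * 4,
        lambda: tensor * 5,
    ]


@tf.function
def apply(tensor):
    branches = make_branches(tensor)
    # maxval is exclusive, so index is drawn from 0 .. len(branches) - 1.
    index = tf.random.uniform((), minval=0, maxval=len(branches), dtype=tf.int32)
    return chain_conds(index, branches)


result = apply(tf.constant(10))
tf.print(result)  # one of 20, 30, 40, 50
```

The recursion runs in Python while the graph is traced, so the resulting graph contains the same nested tf.cond structure as the hand-written version.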
You can use tf.switch_case, like this:
import tensorflow as tf

def func1(tensor):
    return tensor * 1

def func2(tensor):
    return tensor * 2

def func24(tensor):
    return tensor * 24

class Lambda:
    # Zero-argument callable that binds func to arg at construction time.
    def __init__(self, func, arg):
        self._func = func
        self._arg = arg

    def __call__(self):
        return self._func(self._arg)

@tf.function
def apply(tensor):
    list_of_funcs = [func1, func2, func24]
    # maxval is exclusive, so branch_index is a valid index into list_of_funcs.
    branch_index = tf.random.uniform(shape=[], minval=0, maxval=len(list_of_funcs), dtype=tf.int32)
    output = tf.switch_case(
        branch_index=branch_index,
        branch_fns=[Lambda(func, tensor) for func in list_of_funcs],
    )
    return output
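The @tf.function decorator is only needed on the outermost function you want to optimize, which here is apply. If you use apply inside tf.data.Dataset.map, you do not need the decorator at all.
See this discussion to understand why we have to define the Lambda class here.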
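The linked discussion is not reproduced here, but one well-known reason for such a wrapper is Python's late-binding closures: a bare `lambda: func(tensor)` created in a loop or comprehension looks up `func` only when it is called, so every branch would end up calling the last function. As a hedged sketch (an alternative, not the answer's original code), functools.partial from the standard library binds each function immediately and could stand in for the Lambda class. The sketch also shows the Dataset.map usage mentioned above, with placeholder data.

```python
import functools

import tensorflow as tf


def func1(tensor):
    return tensor * 1


def func2(tensor):
    return tensor * 2


def func24(tensor):
    return tensor * 24


def apply(tensor):
    list_of_funcs = [func1, func2, func24]
    branch_index = tf.random.uniform(
        shape=[], minval=0, maxval=len(list_of_funcs), dtype=tf.int32
    )
    # functools.partial binds each func right away, so every branch keeps
    # its own function instead of sharing the loop variable.
    branch_fns = [functools.partial(func, tensor) for func in list_of_funcs]
    return tf.switch_case(branch_index=branch_index, branch_fns=branch_fns)


# Dataset.map traces `apply` as a graph, so no @tf.function decorator here.
volumes = tf.ones((8, 4, 4, 4))  # placeholder data: 8 toy 4x4x4 volumes
dataset = tf.data.Dataset.from_tensor_slices(volumes).map(apply)

for volume in dataset.take(3):
    print(volume.shape, float(volume[0, 0, 0]))
```

Because tf.random.uniform runs once per element inside the map, each volume can receive a different branch.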
You could also try tf.py_function, which:
Wraps a Python function into a TensorFlow op that executes it eagerly.
For example (tested on Google Colab):
import tensorflow as tf
import random

@tf.function
def func1(tensor):
    print('func1')
    return tensor

@tf.function
def func2(tensor):
    print('func2')
    return tensor

@tf.function
def func3(tensor):
    print('func3')
    return tensor

@tf.function
def func4(tensor):
    print('func4')
    return tensor

@tf.function
def apply(tensor):
    dispatcher = {
        'func1': func1,
        'func2': func2,
        'func3': func3,
        'func4': func4
    }
    keys = list(dispatcher)

    def get_random_function_and_apply(t):
        return dispatcher[random.choice(keys)](t)

    y = tf.py_function(func=get_random_function_and_apply, inp=[tensor], Tout=tf.float32)
    return y

mirrored_strategy = tf.distribute.MirroredStrategy(["GPU:0", "GPU:1"])
with mirrored_strategy.scope():
    output = apply(tf.random.normal((5, 5, 5)))
    print(output)
'''
INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0', '/job:localhost/replica:0/task:0/device:GPU:1')
func4
tf.Tensor(
[[[ 0.6041213 -2.054427 1.1755397 -0.62914884 -0.00978021]
[ 0.06134182 -1.5529596 -0.3429052 -0.03199977 -1.1796658 ]
[-0.65084136 -1.5009187 -0.43266404 -0.18494445 1.2958355 ]
[-1.6614605 -0.7398612 1.5384725 -0.24926051 -0.5075399 ]
[ 0.7781286 -0.4102168 1.2152135 0.4508075 -1.7295381 ]]
[[-1.0509509 -1.271087 1.9061071 0.61855525 0.58581835]
[ 2.080663 0.43406835 0.32372198 -0.71427256 0.04448809]
[-0.6438594 -1.1245041 -0.4723388 -0.8302859 -2.0056007 ]
[ 1.1778332 0.2977344 0.7516829 1.1387901 -0.71768486]
[-0.44642782 -0.6523012 -0.48157197 -0.8197472 0.3635474 ]]
[[-0.43357274 1.166849 -0.04528571 0.44322303 0.74193203]
[ 1.2332342 0.07857647 1.3399298 0.62153 1.835202 ]
[ 0.48021084 0.36239776 0.16630112 0.59010863 1.8134127 ]
[-1.1444335 1.2445287 -1.2320557 0.08095992 -0.1379302 ]
[-1.101756 -1.8099649 0.18504284 0.15212883 0.33380997]]
[[-0.68228734 -0.82357454 -0.744171 -0.04959428 -1.3200126 ]
[ 0.813062 1.0669035 -0.7924809 -0.0548021 0.8043163 ]
[ 1.6480085 -0.17134379 0.25517386 0.02731211 1.2226027 ]
[-1.9785942 -0.22399756 -0.6814836 1.2065881 -1.7922156 ]
[-0.34833568 -1.0567352 1.5795225 0.14899854 0.5924402 ]]
[[-1.057639 -1.1659449 -0.22045298 0.39324322 -1.3500952 ]
[-0.32044935 0.9534627 0.40809664 -1.0296333 -0.8129102 ]
[-0.13515176 -0.32676768 -0.9333701 0.35130095 -1.5411847 ]
[ 2.090785 0.3497966 0.27694222 0.78199005 -0.08591356]
[ 0.9621986 -2.3930101 -1.1035724 0.27208164 -1.1846163 ]]], shape=(5, 5, 5), dtype=float32)
'''
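Since the question is about tf.data preprocessing, here is a rough sketch of how the tf.py_function approach might look inside Dataset.map. The transform functions flip_depth, flip_height, and identity are hypothetical stand-ins for the 24 rotations, and the data is random placeholder input. One detail worth noting is that tf.py_function does not carry static shape information through, so the sketch restores it with set_shape, under the assumption that every transform preserves the input shape.

```python
import random

import tensorflow as tf


def flip_depth(volume):
    return tf.reverse(volume, axis=[0])


def flip_height(volume):
    return tf.reverse(volume, axis=[1])


def identity(volume):
    return volume


# Hypothetical stand-ins for the 24 rotation functions in the question.
TRANSFORMS = [flip_depth, flip_height, identity]


def pick_and_apply(volume):
    # Runs as ordinary Python on every call, so random.choice is re-evaluated.
    return random.choice(TRANSFORMS)(volume)


def apply(volume):
    out = tf.py_function(func=pick_and_apply, inp=[volume], Tout=volume.dtype)
    # Restore the static shape that tf.py_function drops.
    # Assumption: every transform preserves the input shape.
    out.set_shape(volume.shape)
    return out


volumes = tf.random.normal((6, 4, 4, 4))  # placeholder data
dataset = tf.data.Dataset.from_tensor_slices(volumes).map(apply)

for volume in dataset.take(2):
    print(volume.shape)
```

The trade-off is that the selection runs in the Python interpreter rather than in the graph, so tf.switch_case or tf.cond may be preferable when throughput matters.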