How to define a custom Theano type that allows differentiation



I am trying to create a double type in Theano following the documentation, and to implement operations on that type as described there. My current state can be found below:

import theano

class Double(theano.gof.Type):

    def filter(self, value, strict=False, allow_downcast=None):
        if strict:
            # we need to return a value of the right type,
            # but if the value is incompatible raise an exception
            if isinstance(value, float):
                return value
            else:
                raise TypeError('Expected a float!')
        elif allow_downcast:
            return float(value)
        else:
            value_float = float(value)
            if value_float == value:
                return value_float
            else:
                raise TypeError('The double type cannot accurately represent '
                                '%s of type %s' % (value, type(value)))

    def values_eq_approx(self, value_a, value_b, tolerance=1e-6):
        return abs(value_a - value_b) / (abs(value_a) + abs(value_b)) < tolerance

double = Double()

class DoubleAddOp(theano.Op):

    __props__ = ()

    def make_node(self, x, y):
        # check input types
        if isinstance(x, (int, float)):
            x = theano.gof.Constant(double, x)
        if isinstance(y, (int, float)):
            y = theano.gof.Constant(double, y)

        if x.type != double or y.type != double:
            raise TypeError('DoubleAddOp only works on doubles.')

        return theano.gof.Apply(self, [x, y], [double()])

    def perform(self, node, inputs, output_storage):
        x = inputs[0]
        y = inputs[1]
        z = output_storage[0]
        z[0] = x + y

    def infer_shape(self, node, input_shapes):
        return [input_shapes[0]]

    def grad(self, inputs, output_grads):
        return [output_grads[0]*1, output_grads[0]*1]

    def __str__(self):
        return 'DoubleAddOp'

dadd = DoubleAddOp()

To test the code, I wrote a few unit tests:

import theano
import random
import unittest

from double import double, dadd

class TestDoubleOps(unittest.TestCase):

    # the forward pass runs fine ...
    def test_DoubleAddOpPerform(self):
        x = double('x')
        y = double('y')
        z = dadd(x, y)
        f = theano.function([x, y], z)

        for i in range(100):
            x_value = random.random()
            y_value = random.random()
            self.assertAlmostEqual(f(x_value, y_value), x_value + y_value)

    # I am trying to get the gradient computation working here,
    # this is what I have so far:
    def test_DoubleAddOpGrad(self):
        x = double('x')
        y = double('y')
        z = dadd(x, y)
        gx = theano.tensor.grad(z, x) # <---
        gy = theano.tensor.grad(z, y)
        f = theano.function([x, y], [gx, gy])

        for i in range(100):
            x_value = random.random()
            y_value = random.random()
            print(f(x_value, y_value))

if __name__ == '__main__':
    unittest.main()

However, when testing the gradient computation, I get the following error at the marked line:

Traceback (most recent call last):
  File "~/theano/double-type-python/double_test.py", line 32, in test_DoubleAddOpGrad
    gx = theano.tensor.grad(z, x)
  File "~/.local/lib/python3.5/site-packages/theano/gradient.py", line 436, in grad
    if cost is not None and cost.ndim != 0:
AttributeError: 'Variable' object has no attribute 'ndim'

This seems to be a problem with the double type defined above. However, the type itself is scalar, so I should be able to compute gradients using theano.tensor.grad. Unfortunately, I could not find an example demonstrating gradient computation on custom types, nor could I find out more about the ndim attribute ...

Any help is appreciated; thanks!

Update: When trying to trick theano.tensor.grad, e.g. by explicitly setting z.ndim = 0, the problem persists, e.g.

Traceback (most recent call last):
  File "~/theano/double-type-python/double_test.py", line 33, in test_DoubleAddOpGrad
    gx = theano.tensor.grad(z, x)
  File "/usr/local/lib/python3.4/dist-packages/theano/gradient.py", line 477, in grad
    g_cost = _float_ones_like(cost)
  File "/usr/local/lib/python3.4/dist-packages/theano/gradient.py", line 1340, in _float_ones_like
    dtype = x.type.dtype
AttributeError: 'Double' object has no attribute 'dtype'

So I seem to be missing something fundamental: the Double type as defined above lacks several pieces of type-specific information that are not mentioned in the documentation.

Update: After re-reading the documentation and looking at Theano's source code, the right question to ask is: Is it possible to define a custom (non-tensor) type in Theano that allows differentiation?

Update: Following nouiz's answer, I am running into the next problems, which give me the impression that gradient computation is not meant to work with non-TensorType types:

Traceback (most recent call last):
  File "~/theano/double-type-python/double_test.py", line 32, in test_DoubleAddOpGrad
    gx = theano.tensor.grad(z, x)
  File "~/.local/lib/python3.5/site-packages/theano/gradient.py", line 477, in grad
    g_cost = _float_ones_like(cost)
  File "~/.local/lib/python3.5/site-packages/theano/gradient.py", line 1344, in _float_ones_like
    return tensor.ones_like(x, dtype=dtype)
  File "~/.local/lib/python3.5/site-packages/theano/tensor/basic.py", line 2377, in ones_like
    return fill(model, ret)
  File "~/.local/lib/python3.5/site-packages/theano/gof/op.py", line 604, in __call__
    node = self.make_node(*inputs, **kwargs)
  File "~/.local/lib/python3.5/site-packages/theano/tensor/elemwise.py", line 577, in make_node
    inputs = list(map(as_tensor_variable, inputs))
  File "~/.local/lib/python3.5/site-packages/theano/tensor/basic.py", line 171, in as_tensor_variable
    "Variable type field must be a TensorType.", x, x.type)
theano.tensor.var.AsTensorError: ('Variable type field must be a TensorType.', DoubleAddOp.0, <double.Double object at 0x7fb623a5b9b0>)

The answer is yes, you can. We do it ourselves for sparse variables and GPU variables.

But you are hitting some corner cases that theano.grad() was not made to support. Mostly, it expects an ndim parameter and a dtype parameter. Adding a dtype="float64" parameter should fix that problem.

The ndim one is easy to fix in Theano with this diff:

diff --git a/theano/gradient.py b/theano/gradient.py
index 6d6fbaf..3b4d706 100644
--- a/theano/gradient.py
+++ b/theano/gradient.py
@@ -433,7 +433,7 @@ def grad(cost, wrt, consider_constant=None,
                          "cost is NaN because " +
                          cost.type.why_null)
 
-    if cost is not None and cost.ndim != 0:
+    if cost is not None and getattr(cost, 'ndim', 0) != 0:
         raise TypeError("cost must be a scalar.")
 
     if isinstance(wrt, set):
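The effect of that getattr fallback can be illustrated without Theano at all; FakeVariable below is just a stand-in for a variable of a custom type, not anything from the Theano API:

```python
class FakeVariable:
    """Stand-in for a custom-type variable that lacks an ndim attribute."""
    pass

cost = FakeVariable()

# The unpatched check, cost.ndim != 0, raises AttributeError:
try:
    cost.ndim != 0
    raised = False
except AttributeError:
    raised = True
assert raised

# The patched check treats a missing ndim as 0, i.e. as a scalar cost:
assert getattr(cost, 'ndim', 0) == 0
```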

For the dtype it is more complicated, because we use it in many places for validation (e.g. you cannot take the gradient of integers) and also to initialize the gradient chain (or you can pass that via the known_grads parameter).
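To make the two expectations concrete, here is a plain-Python sketch of the attributes that gradient.py reads from the cost variable and its type; DoubleType and DoubleVariable are stand-in classes for illustration, not real Theano classes:

```python
class DoubleType:
    """Stand-in type carrying the dtype that _float_ones_like reads."""
    dtype = 'float64'

class DoubleVariable:
    """Stand-in variable carrying the ndim that grad() checks."""
    type = DoubleType()
    ndim = 0  # scalar, so the "cost must be a scalar" check passes

cost = DoubleVariable()
assert cost.ndim == 0                # grad(): cost.ndim != 0 check
assert cost.type.dtype == 'float64'  # _float_ones_like(): x.type.dtype
```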

Update: this bigger diff fixes the new error:

diff --git a/theano/gradient.py b/theano/gradient.py
index 6d6fbaf..6a9ec03 100644
--- a/theano/gradient.py
+++ b/theano/gradient.py
@@ -433,7 +433,7 @@ def grad(cost, wrt, consider_constant=None,
                          "cost is NaN because " +
                          cost.type.why_null)
 
-    if cost is not None and cost.ndim != 0:
+    if cost is not None and getattr(cost, 'ndim', 0) != 0:
         raise TypeError("cost must be a scalar.")
 
     if isinstance(wrt, set):
@@ -1341,7 +1341,7 @@ def _float_ones_like(x):
     if dtype not in tensor.float_dtypes:
         dtype = theano.config.floatX
 
-    return tensor.ones_like(x, dtype=dtype)
+    return x.ones_like(dtype=dtype)
 
 
 class numeric_grad(object):
diff --git a/theano/tensor/var.py b/theano/tensor/var.py
index 2ecb9f0..6b08a45 100644
--- a/theano/tensor/var.py
+++ b/theano/tensor/var.py
@@ -727,6 +727,9 @@ class _tensor_py_operators(object):
     def zeros_like(model, dtype=None):
         return theano.tensor.basic.zeros_like(model, dtype=dtype)
 
+    def ones_like(model, dtype=None):
+        return theano.tensor.basic.ones_like(model, dtype=dtype)
+
     def cumsum(self, axis=None):
         return theano.tensor.extra_ops.cumsum(self, axis)

You need to add the ones_like method to your variable, like this:

def my_ones_like(model, dtype=None):
    return ...

double.ones_like = my_ones_like
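The attach-after-the-fact pattern used above is ordinary Python monkey-patching. A self-contained sketch (MyVariable and the constant return value are illustrative, not part of the Theano API):

```python
class MyVariable:
    """Stand-in for a custom variable class."""
    def __init__(self, value):
        self.value = value

# Define the method separately and attach it to the class afterwards,
# mirroring "double.ones_like = my_ones_like" above:
def my_ones_like(model, dtype=None):
    # Illustrative body: a scalar "one" of the same Python-float kind.
    return 1.0

MyVariable.ones_like = my_ones_like

v = MyVariable(3.5)
assert v.ones_like() == 1.0  # my_ones_like is now a bound method on v
```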
