After learning how to use einsum, I am now trying to understand how np.tensordot works. However, I am a little bit lost, especially regarding the various possibilities for the parameter axes.
To understand it, as I have never practiced tensor calculus, I use the following example:
A = np.random.randint(2, size=(2, 3, 5))
B = np.random.randint(2, size=(3, 2, 4))
In this case, what are the different possible np.tensordot computations, and how would you compute them manually?
The idea with tensordot is pretty simple - we input the arrays and the respective axes along which the sum-reductions are intended. The axes that take part in sum-reduction are removed in the output, and all of the remaining axes from the input arrays are spread out as different axes in the output, keeping the order in which the input arrays are fed in.
Let's look at a few sample cases with one and two axes of sum-reduction, and also swap the input places to see how the order is kept in the output.
I. One axis of sum-reduction
Inputs:
In [7]: A = np.random.randint(2, size=(2, 6, 5))
...: B = np.random.randint(2, size=(3, 2, 4))
...:
Case #1:
In [9]: np.tensordot(A, B, axes=((0),(1))).shape
Out[9]: (6, 5, 3, 4)
A : (2, 6, 5) -> reduction of axis=0
B : (3, 2, 4) -> reduction of axis=1
Output : `(2, 6, 5)`, `(3, 2, 4)` ===(2 gone)==> `(6,5)` + `(3,4)` => `(6,5,3,4)`
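As a quick sanity check (my addition), einsum gives the same result; the repeated letter i marks the contracted pair of axes:

np.allclose(np.tensordot(A, B, axes=((0),(1))), np.einsum('ijk,lim', A, B))  # True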
Case #2 (same as Case #1, but with the inputs swapped):
In [8]: np.tensordot(B, A, axes=((1),(0))).shape
Out[8]: (3, 4, 6, 5)
B : (3, 2, 4) -> reduction of axis=1
A : (2, 6, 5) -> reduction of axis=0
Output : `(3, 2, 4)`, `(2, 6, 5)` ===(2 gone)==> `(3,4)` + `(6,5)` => `(3,4,6,5)`.
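The same check for the swapped case (here j is the contracted letter):

np.allclose(np.tensordot(B, A, axes=((1),(0))), np.einsum('ijk,jlm', B, A))  # True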
II. Two axes of sum-reduction
Inputs:
In [11]: A = np.random.randint(2, size=(2, 3, 5))
...: B = np.random.randint(2, size=(3, 2, 4))
...:
Case #1:
In [12]: np.tensordot(A, B, axes=((0,1),(1,0))).shape
Out[12]: (5, 4)
A : (2, 3, 5) -> reduction of axis=(0,1)
B : (3, 2, 4) -> reduction of axis=(1,0)
Output : `(2, 3, 5)`, `(3, 2, 4)` ===(2,3 gone)==> `(5)` + `(4)` => `(5,4)`
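Again, einsum agrees (i and j are the two contracted letters):

np.allclose(np.tensordot(A, B, axes=((0,1),(1,0))), np.einsum('ijk,jim', A, B))  # True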
Case #2:
In [14]: np.tensordot(B, A, axes=((1,0),(0,1))).shape
Out[14]: (4, 5)
B : (3, 2, 4) -> reduction of axis=(1,0)
A : (2, 3, 5) -> reduction of axis=(0,1)
Output : `(3, 2, 4)`, `(2, 3, 5)` ===(2,3 gone)==> `(4)` + `(5)` => `(4,5)`
We can extend this to as many axes as we want.
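For instance, contracting all three axes at once collapses everything down to a scalar. A small sketch (B2 here is a hypothetical array of my own, shaped so that every axis of A finds a partner):

B2 = np.random.randint(2, size=(5, 3, 2))
np.tensordot(A, B2, axes=((0,1,2),(2,1,0))).shape              # ()
np.allclose(np.tensordot(A, B2, axes=((0,1,2),(2,1,0))),
            np.einsum('ijk,kji', A, B2))                       # True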
tensordot swaps axes and reshapes the inputs so it can apply np.dot to two 2D arrays. It then swaps and reshapes back to the target shape. It may be easier to experiment than to explain. There's no special tensor math going on, just extending dot to work in higher dimensions. tensor just means arrays with more than 2 dimensions. If you are already comfortable with einsum, then it will be easiest to compare the results to that.
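To make the swap-and-reshape idea concrete, here is a rough manual re-implementation of one two-axis case (a sketch, assuming the question's A and B; the particular transpose orders are my choice):

A2 = A.transpose(2, 0, 1).reshape(5, 2 * 3)   # free axis first, contracted axes flattened
B2 = B.transpose(1, 0, 2).reshape(2 * 3, 4)   # contracted axes flattened, free axis last
np.allclose(A2.dot(B2), np.tensordot(A, B, [(0, 1), (1, 0)]))  # True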
A sample test, summing on one pair of axes:
In [823]: np.tensordot(A,B,[0,1]).shape
Out[823]: (3, 5, 3, 4)
In [824]: np.einsum('ijk,lim',A,B).shape
Out[824]: (3, 5, 3, 4)
In [825]: np.allclose(np.einsum('ijk,lim',A,B),np.tensordot(A,B,[0,1]))
Out[825]: True
Another, summing on two:
In [826]: np.tensordot(A,B,[(0,1),(1,0)]).shape
Out[826]: (5, 4)
In [827]: np.einsum('ijk,jim',A,B).shape
Out[827]: (5, 4)
In [828]: np.allclose(np.einsum('ijk,jim',A,B),np.tensordot(A,B,[(0,1),(1,0)]))
Out[828]: True
We could do the same thing with the (1,0) pair. Given the mix of dimensions, I don't think there is another combination.
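For reference, that (1,0) pairing would look like this (same A and B):

In [829]: np.tensordot(A,B,[1,0]).shape
Out[829]: (2, 5, 2, 4)
In [830]: np.allclose(np.einsum('ijk,jlm',A,B), np.tensordot(A,B,[1,0]))
Out[830]: True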
The answers above are great and helped me a lot in understanding tensordot. But they don't show the actual math behind the operations. That's why I did the equivalent operations in TF 2 for myself and decided to share them here:
a = tf.constant([1,2.])
b = tf.constant([2,3.])
print(f"{tf.tensordot(a, b, 0)}t tf.einsum('i,j', a, b)tt- ((the last 0 axes of a), (the first 0 axes of b))")
print(f"{tf.tensordot(a, b, ((),()))}t tf.einsum('i,j', a, b)tt- ((() axis of a), (() axis of b))")
print(f"{tf.tensordot(b, a, 0)}t tf.einsum('i,j->ji', a, b)t- ((the last 0 axes of b), (the first 0 axes of a))")
print(f"{tf.tensordot(a, b, 1)}tt tf.einsum('i,i', a, b)tt- ((the last 1 axes of a), (the first 1 axes of b))")
print(f"{tf.tensordot(a, b, ((0,), (0,)))}tt tf.einsum('i,i', a, b)tt- ((0th axis of a), (0th axis of b))")
print(f"{tf.tensordot(a, b, (0,0))}tt tf.einsum('i,i', a, b)tt- ((0th axis of a), (0th axis of b))")
[[2. 3.]
 [4. 6.]] tf.einsum('i,j', a, b) - ((the last 0 axes of a), (the first 0 axes of b))
[[2. 3.]
 [4. 6.]] tf.einsum('i,j', a, b) - ((() axis of a), (() axis of b))
[[2. 4.]
 [3. 6.]] tf.einsum('i,j->ji', a, b) - ((the last 0 axes of b), (the first 0 axes of a))
8.0 tf.einsum('i,i', a, b) - ((the last 1 axes of a), (the first 1 axes of b))
8.0 tf.einsum('i,i', a, b) - ((0th axis of a), (0th axis of b))
8.0 tf.einsum('i,i', a, b) - ((0th axis of a), (0th axis of b))
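For what it's worth, these 1-D cases carry over to NumPy unchanged (a small sketch; a_np and b_np are hypothetical NumPy stand-ins for the same values):

import numpy as np
a_np = np.array([1, 2.])  # stand-in for a
b_np = np.array([2, 3.])  # stand-in for b
print(np.tensordot(a_np, b_np, axes=0))  # outer product, matches the axes=0 cases above
print(np.tensordot(a_np, b_np, axes=1))  # 8.0, the ordinary dot product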
For the (2,2) shapes:
a = tf.constant([[1,2],
                 [-2,3.]])
b = tf.constant([[-2,3],
                 [0,4.]])
print(f"{tf.tensordot(a, b, 0)}t tf.einsum('ij,kl', a, b)t- ((the last 0 axes of a), (the first 0 axes of b))")
print(f"{tf.tensordot(a, b, (0,0))}t tf.einsum('ij,ik', a, b)t- ((0th axis of a), (0th axis of b))")
print(f"{tf.tensordot(a, b, (0,1))}t tf.einsum('ij,ki', a, b)t- ((0th axis of a), (1st axis of b))")
print(f"{tf.tensordot(a, b, 1)}t tf.matmul(a, b)tt- ((the last 1 axes of a), (the first 1 axes of b))")
print(f"{tf.tensordot(a, b, ((1,), (0,)))}t tf.einsum('ij,jk', a, b)t- ((1st axis of a), (0th axis of b))")
print(f"{tf.tensordot(a, b, (1, 0))}t tf.matmul(a, b)tt- ((1st axis of a), (0th axis of b))")
print(f"{tf.tensordot(a, b, 2)}t tf.reduce_sum(tf.multiply(a, b))t- ((the last 2 axes of a), (the first 2 axes of b))")
print(f"{tf.tensordot(a, b, ((0,1), (0,1)))}t tf.einsum('ij,ij->', a, b)tt- ((0th axis of a, 1st axis of a), (0th axis of b, 1st axis of b))")
[[[[-2.  3.]
   [ 0.  4.]]
  [[-4.  6.]
   [ 0.  8.]]]
 [[[ 4. -6.]
   [-0. -8.]]
  [[-6.  9.]
   [ 0. 12.]]]] tf.einsum('ij,kl', a, b) - ((the last 0 axes of a), (the first 0 axes of b))
[[-2. -5.]
 [-4. 18.]] tf.einsum('ij,ik', a, b) - ((0th axis of a), (0th axis of b))
[[-8. -8.]
 [ 5. 12.]] tf.einsum('ij,ki', a, b) - ((0th axis of a), (1st axis of b))
[[-2. 11.]
 [ 4.  6.]] tf.matmul(a, b) - ((the last 1 axes of a), (the first 1 axes of b))
[[-2. 11.]
 [ 4.  6.]] tf.einsum('ij,jk', a, b) - ((1st axis of a), (0th axis of b))
[[-2. 11.]
 [ 4.  6.]] tf.matmul(a, b) - ((1st axis of a), (0th axis of b))
16.0 tf.reduce_sum(tf.multiply(a, b)) - ((the last 2 axes of a), (the first 2 axes of b))
16.0 tf.einsum('ij,ij->', a, b) - ((0th axis of a, 1st axis of a), (0th axis of b, 1st axis of b))
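The 2-D cases translate to NumPy the same way (a sketch, using hypothetical stand-ins for the same values):

a_np = np.array([[1, 2], [-2, 3.]])  # stand-in for a
b_np = np.array([[-2, 3], [0, 4.]])  # stand-in for b
print(np.tensordot(a_np, b_np, axes=1))  # matrix product, same as a_np @ b_np
print(np.tensordot(a_np, b_np, axes=2))  # 16.0, same as (a_np * b_np).sum()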
In addition to the answers above, np.tensordot is easier to understand if you break it down into nested loops:
Assume:
import numpy as np
a = np.arange(24).reshape(2,3,4)
b = np.arange(30).reshape(3,5,2)
Then a is
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])
and b is
array([[[ 0,  1],
        [ 2,  3],
        [ 4,  5],
        [ 6,  7],
        [ 8,  9]],

       [[10, 11],
        [12, 13],
        [14, 15],
        [16, 17],
        [18, 19]],

       [[20, 21],
        [22, 23],
        [24, 25],
        [26, 27],
        [28, 29]]])
Let
c = np.tensordot(a, b, axes=([0,1],[2,0]))
This is equivalent to
c = np.zeros((4,5))
for i in range(4):
    for j in range(5):
        for p in range(2):
            for q in range(3):
                c[i,j] += a[p,q,i] * b[q,j,p]
Axes that have the same dimension in both tensors (here 2 and 3) can be sum-reduced. The argument axes=([0,1],[2,0]) is the same as axes=([1,0],[0,2]).
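A quick check of that equivalence (my addition):

np.allclose(np.tensordot(a, b, axes=([0,1],[2,0])),
            np.tensordot(a, b, axes=([1,0],[0,2])))  # True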
The final c is
array([[ 808,  928, 1048, 1168, 1288],
       [ 871, 1003, 1135, 1267, 1399],
       [ 934, 1078, 1222, 1366, 1510],
       [ 997, 1153, 1309, 1465, 1621]])