Spatial-domain convolution does not match frequency-domain multiplication in PyTorch



I wanted to verify that a 2D convolution in the spatial domain really is a multiplication in the frequency domain, so I implemented an image convolution with a 3×3 kernel in PyTorch (everything real-valued). I then transformed both the image and the kernel to the frequency domain, multiplied them, and transformed the result back to the spatial domain. Here are the results:
When the kernel is even or odd (i.e. purely real or purely imaginary in the frequency domain), the two results seem to match well. I compare the min and max of the two results because I'm not sure whether some margin-alignment issue might affect a direct difference. Below are three runs each for even and odd kernels:

# Even Kernel
min max with s-domain conv: -0.03659552335739136 4.378755569458008
min max with f-domain mul: -0.0365956649184227 4.378755569458008
min max with s-domain conv: -1.2673343420028687 2.397951126098633
min max with f-domain mul: -1.2673344612121582 2.397951126098633
min max with s-domain conv: -8.185677528381348 0.22980886697769165
min max with f-domain mul: -8.185677528381348 0.22980868816375732
# Odd Kernel
min max with s-domain conv: -1.6630988121032715 1.6592578887939453
min max with f-domain mul: -1.663098692893982 1.6592577695846558
min max with s-domain conv: -3.483165979385376 3.4751217365264893
min max with f-domain mul: -3.483165979385376 3.475121259689331
min max with s-domain conv: -1.7972984313964844 1.7931475639343262
min max with f-domain mul: -1.7972984313964844 1.7931475639343262
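
The even/odd observation can be sanity-checked on its own: a real kernel that is point-symmetric about its centre has a purely real DFT, and a point-antisymmetric one a purely imaginary DFT, once it is padded and rolled so its centre sits at index (0, 0). A minimal sketch of that check (the `centered_fft` helper and the size 8 are my own choices, not from the question):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
k = torch.randn(3, 3)

# Symmetrize / antisymmetrize about the kernel centre.
k_even = (k + torch.flip(k, [0, 1])) / 2   # k[p, q] ==  k[2-p, 2-q]
k_odd  = (k - torch.flip(k, [0, 1])) / 2   # k[p, q] == -k[2-p, 2-q]

def centered_fft(kernel, n):
    # Zero-pad to n x n and roll the kernel centre to (0, 0),
    # matching the pad-and-roll used in the question's code.
    padded = F.pad(kernel, [0, n - kernel.size(-1), 0, n - kernel.size(-2)])
    return torch.fft.fft2(torch.roll(padded, (-1, -1), (0, 1)))

print(centered_fft(k_even, 8).imag.abs().max())  # essentially zero: purely real
print(centered_fft(k_odd, 8).real.abs().max())   # essentially zero: purely imaginary
```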

But if I use a kernel that is neither even nor odd, the difference is on a whole different level:

min max with s-domain conv: -2.3028392791748047 1.675748348236084
min max with f-domain mul: -2.5289478302001953 1.4919483661651611
min max with s-domain conv: -1.1227827072143555 3.0336122512817383
min max with f-domain mul: -1.1954418420791626 2.9853036403656006
min max with s-domain conv: -1.6867876052856445 5.575590133666992
min max with f-domain mul: -1.6832940578460693 5.688591957092285

I wondered whether this comes from floating-point precision, but I tried torch's complex dtypes too and it was no better. Is there something wrong with my implementation, or is this unavoidable when computing with complex numbers?

Below is a simplified version of my code that reproduces this result.

import torch.nn.functional as F
import torch.fft as fft
import torch, cv2
img = cv2.imread('test.png', 0)
x = torch.as_tensor(img).unsqueeze(0)/255
k = torch.randn(1, 1, 3, 3)
for i in range(k.size(0)):
    for j in range(k.size(1)):
        # For even k
        # for p in range(k.size(2)):
        #     for q in range(k.size(3)):
        #         k[i, j, p, q] = k[i, j, 2-p, 2-q]
        # For odd k
        # for p in range(k.size(2)):
        #     k[i, j, p, 0] = -k[i, j, p, 2]
        #     k[i, j, p, 1] = 0
        # for q in range(k.size(3)):
        #     k[i, j, 0, q] = -k[i, j, 2, q]
        #     k[i, j, 1, q] = 0
        pass
### Spatial domain convolution
padx = F.pad(x, [1,1,1,1])
sdc = F.conv2d(padx.unsqueeze(0), k)
### Frequency domain convolution
# Transform input
fdx = fft.rfft2(x)
sdfdx = fft.irfft2(fdx)
# Transform kernel
size_diff = x.size(-1)-k.size(-1)
padk = torch.roll(F.pad(k, [0,size_diff,0,size_diff]), (-1,-1), (-1, -2))
fdk = fft.rfft2(padk)
# Frequency domain multiplication
fdc = fdk * fdx
fdc = fdc.squeeze(0)
# Back to spatial domain
sdfdc = fft.irfft2(fdc)
### Compare
print("min max with s-domain conv:", sdc.min().item(), sdc.max().item())
print("min max with f-domain mul:", sdfdc.min().item(), sdfdc.max().item())

A wild guess, but better than nothing:

Try flipping your filter before the convolution.

As you know from the math, convolution without flipping the filter is actually a correlation, so flipping the filter turns that correlation into an actual convolution.
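
Concretely, `F.conv2d` computes a cross-correlation, while FFT multiplication computes a true (circular) convolution; flipping the kernel before `conv2d` makes the interiors of the two results agree, while the border pixels still differ because the FFT implies circular rather than zero padding. A minimal sketch, with a random square tensor standing in for the image (the sizes are my own choices):

```python
import torch
import torch.nn.functional as F
import torch.fft as fft

torch.manual_seed(0)
x = torch.rand(1, 1, 16, 16)    # stand-in for the image
k = torch.randn(1, 1, 3, 3)     # neither even nor odd

# Spatial domain: flip the kernel so conv2d (a correlation) becomes a convolution.
sdc = F.conv2d(F.pad(x, [1, 1, 1, 1]), torch.flip(k, [-2, -1]))

# Frequency domain: pad, roll the kernel centre to the origin, multiply.
n = x.size(-1)
padk = torch.roll(F.pad(k, [0, n - 3, 0, n - 3]), (-1, -1), (-2, -1))
sdfdc = fft.irfft2(fft.rfft2(x) * fft.rfft2(padk))

# Compare away from the border (circular vs. zero padding differ there).
print((sdc - sdfdc)[..., 1:-1, 1:-1].abs().max())  # tiny: float32 round-off
```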
