"derivative for aten::linear_backward is not implemented" when calling backward() on MPS in PyTorch

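I'm working on a GAN to generate sound. I copied most of the code from the wavegan-pytorch GitHub repository. I'm using a MacBook with an M2 chip, so I wanted to move the computation from the CPU to the GPU. But when I call torch.Tensor.backward() on my loss, I get an error saying that linear_backward is not implemented. I'm still quite new to programming. Is there a simple mistake I've overlooked, or is it just not possible to run this code on the GPU? Here is my code: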

real_signal = next(self.train_loader)
# need to add mixed signal and flag
noise = sample_noise(batch_size * generator_batch_size_factor)
generated = self.generator(noise)
#############################
# Calculating discriminator loss and updating discriminator
#############################
self.apply_zero_grad()
disc_cost, disc_wd = self.calculate_discriminator_loss(
    real_signal.data, generated.data
)
assert not (torch.isnan(disc_cost))
disc_cost.backward()
self.optimizer_d.step()

I'd be glad to get some help. Let me know if you need more information, and apologies in advance if there's a simple solution I've missed; I'm a beginner.

Here is the code of the calculate_discriminator_loss() function:

def calculate_discriminator_loss(self, real, generated):
    disc_out_gen = self.discriminator(generated)
    disc_out_real = self.discriminator(real)
    alpha = torch.FloatTensor(batch_size * 2, 1, 1).uniform_(0, 1).to(device)
    alpha = alpha.expand(batch_size * 2, real.size(1), real.size(2))
    interpolated = (1 - alpha) * real.data + (alpha) * generated.data[:batch_size * 2]
    interpolated = Variable(interpolated, requires_grad=True)
    # calculate probability of interpolated examples
    prob_interpolated = self.discriminator(interpolated)
    grad_inputs = interpolated
    ones = torch.ones(prob_interpolated.size()).to(device)
    gradients = grad(
        outputs=prob_interpolated,
        inputs=grad_inputs,
        grad_outputs=ones,
        create_graph=True,
        retain_graph=True,
        only_inputs=True,
    )[0]
    # calculate gradient penalty
    grad_penalty = (
        p_coeff
        * ((gradients.view(gradients.size(0), -1).norm(2, dim=1) - 1) ** 2).mean()
    )
    assert not (torch.isnan(grad_penalty))
    assert not (torch.isnan(disc_out_gen.mean()))
    assert not (torch.isnan(disc_out_real.mean()))
    cost_wd = disc_out_gen.mean() - disc_out_real.mean()
    cost = cost_wd + grad_penalty
    return cost, cost_wd

Seeing that you are implementing the discriminator loss for WGAN-GP, I thought I'd track down the problem and improve your code.
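For reference, this is the quantity the function computes, written in the code's own terms. The symbols are my notation, not the question's: x is real, x-tilde is generated, x-hat is interpolated, alpha is the uniform mixing coefficient, and lambda stands for p_coeff.

```latex
\hat{x} = (1 - \alpha)\,x + \alpha\,\tilde{x}, \qquad \alpha \sim U(0, 1)

L_D = \underbrace{\mathbb{E}\big[D(\tilde{x})\big] - \mathbb{E}\big[D(x)\big]}_{\texttt{cost\_wd}}
      + \underbrace{\lambda\,\mathbb{E}\Big[\big(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\big)^2\Big]}_{\texttt{grad\_penalty}}
```

Because the penalty term contains a gradient of D, minimising it requires differentiating through a backward pass, which is why create_graph=True is needed.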

First of all, you did a great job; there are just a few small flaws here and there. The issue is indeed in the calculate_discriminator_loss function. Things to improve:

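Variable is deprecated in recent versions of PyTorch. I don't recommend using it, since it is no longer supported.
You can index the generated and real tensors without accessing the data attribute, like so: generated[:batch_size * 2]
I'm not sure what you are trying to do with batch_size * 2. Is the generated batch larger than the real batch? I'd recommend keeping them the same size.
You can use expand_as instead of expand.
You don't need to pass retain_graph=True when computing the gradients; retain_graph defaults to the value of create_graph, which is already True here.
You don't need only_inputs=True when computing the gradients either. The argument is deprecated and defaults to True.
p_coeff and device are not defined inside the function. Make sure to define them in the class and access them as self.p_coeff and self.device.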
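The points about Variable, retain_graph and only_inputs can be checked on a toy function. This is a minimal sketch, assuming CPU tensors and a hypothetical function y = x ** 3 standing in for the discriminator:

```python
import torch
from torch.autograd import grad

# Hypothetical stand-in for the interpolated batch. detach() + requires_grad_()
# replaces the deprecated Variable(..., requires_grad=True) wrapper.
x = torch.tensor([1.0, 2.0]).detach().requires_grad_(True)
y = x ** 3

# create_graph=True is enough: retain_graph defaults to the value of
# create_graph, and only_inputs does not need to be passed at all.
(g,) = grad(outputs=y, inputs=x, grad_outputs=torch.ones_like(y), create_graph=True)

# g = 3 * x**2 is itself part of the graph, so a penalty built from it
# can be backpropagated (this is the "double backward" step).
penalty = (g ** 2).sum()
penalty.backward()

print(g.detach())  # first derivative, 3 * x**2
print(x.grad)      # derivative of the penalty, 36 * x**3
```

The same pattern, a grad() call with create_graph=True followed by backward() on a loss built from the result, is what the gradient penalty does with the discriminator.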

The following works when I run it:

def calculate_discriminator_loss(self, real, generated):
    # Assumes module-level imports: import torch; from torch.autograd import grad
    assert real.shape == generated.shape
    disc_out_gen = self.discriminator(generated)
    disc_out_real = self.discriminator(real)
    # One mixing coefficient per sample; assumes 3-D inputs shaped
    # (batch, channels, samples), as in the question
    alpha = torch.rand(real.size(0), 1, 1).to(self.device)
    alpha = alpha.expand_as(real)
    # Detach so the interpolated batch is a leaf tensor that requires grad
    interpolated = ((1 - alpha) * real + alpha * generated).detach().requires_grad_(True)
    # calculate probability of interpolated examples
    prob_interpolated = self.discriminator(interpolated)
    ones = torch.ones(prob_interpolated.size()).to(self.device)
    gradients = grad(
        outputs=prob_interpolated,
        inputs=interpolated,
        grad_outputs=ones,
        create_graph=True)[0]
    # calculate gradient penalty, weighted by self.p_coeff
    grad_penalty = self.p_coeff * torch.mean(
        (gradients.view(gradients.size(0), -1).norm(2, dim=1) - 1) ** 2
    )
    cost_wd = disc_out_gen.mean() - disc_out_real.mean()
    cost = cost_wd + grad_penalty
    return cost, cost_wd

I cleaned up your code to make it easier to read and removed the NaN assertions.
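To try the gradient-penalty loss in isolation, here is a self-contained sketch. The toy discriminator, the input shape (batch, 1, 16), the penalty weight of 10.0 and the use of the CPU are all my assumptions for illustration; none of them come from wavegan-pytorch.

```python
import torch
import torch.nn as nn
from torch.autograd import grad

torch.manual_seed(0)
device = torch.device("cpu")  # assumption: CPU, to keep the sketch portable
p_coeff = 10.0                # assumed gradient-penalty weight

# Hypothetical toy critic for signals shaped (batch, channels, samples).
discriminator = nn.Sequential(
    nn.Flatten(),
    nn.Linear(1 * 16, 8),
    nn.LeakyReLU(0.2),
    nn.Linear(8, 1),
).to(device)


def discriminator_loss(real, generated):
    assert real.shape == generated.shape
    disc_out_gen = discriminator(generated)
    disc_out_real = discriminator(real)
    # One mixing coefficient per sample, broadcast over channels and samples.
    alpha = torch.rand(real.size(0), 1, 1, device=real.device).expand_as(real)
    # Leaf tensor that requires grad, so grad() can differentiate against it.
    interpolated = ((1 - alpha) * real + alpha * generated).detach().requires_grad_(True)
    prob_interpolated = discriminator(interpolated)
    gradients = grad(
        outputs=prob_interpolated,
        inputs=interpolated,
        grad_outputs=torch.ones_like(prob_interpolated),
        create_graph=True,
    )[0]
    grad_norm = gradients.reshape(gradients.size(0), -1).norm(2, dim=1)
    grad_penalty = p_coeff * ((grad_norm - 1) ** 2).mean()
    cost_wd = disc_out_gen.mean() - disc_out_real.mean()
    return cost_wd + grad_penalty, cost_wd


real = torch.randn(4, 1, 16, device=device)
generated = torch.randn(4, 1, 16, device=device)
cost, cost_wd = discriminator_loss(real, generated)
cost.backward()  # differentiates through the backward pass of nn.Linear
```

The final backward() call is a second-order pass through the linear layers, and the error in the title names aten::linear_backward, the op being differentiated there. Changing device to "mps" in this sketch is therefore a quick way to check whether your PyTorch build supports that step on the GPU.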

Hope this helps.
