The problem can be seen in the images below.
Screenshots of the generated images at each epoch: within every batch the images are nearly identical but not exactly the same — for example, only about 5% of the pixels differ slightly.
At the end of each epoch, the random noise sent to the generator for saving images has the latent shape torch.randn(BATCH_SIZE, LATENT_SIZE, 1, 1).
But it seems the generator has forced itself to ignore the fact that the random-noise input is different for every image; instead, it behaves as if the random noise were constant across epochs, while the generated images keep changing.
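One way to check whether the generator really ignores its noise input is to feed it two independent noise batches and compare the outputs directly (a minimal diagnostic sketch, not part of the original script; z1 and z2 are illustrative names, while G, LATENT_SIZE, and device come from the code below):

import torch

# Hypothetical diagnostic: generate from two independent noise batches and
# measure how different the outputs are. If the generator truly ignored z,
# the mean absolute difference would be near zero.
z1 = torch.randn(24, LATENT_SIZE, 1, 1, device=device)
z2 = torch.randn(24, LATENT_SIZE, 1, 1, device=device)
with torch.no_grad():
    out1, out2 = G(z1), G(z2)
print("mean |out1 - out2|:", (out1 - out2).abs().mean().item())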
Code:
#Data Preparation
import torch
import torch.nn as nn
import torchvision.transforms as tt
from torchvision.datasets import ImageFolder
from torch.utils.data import DataLoader

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Channel-wise statistics of the training set, used by Normalize below
# mean = (0.6528, 0.4833, 0.4052)
# deviation = (0.2362, 0.2079, 0.1988)
ROOT = "./data/Gender/train"
BATCH_SIZE = 120
LATENT_SIZE = 100
train_ds = ImageFolder(ROOT, tt.Compose([
    tt.RandomHorizontalFlip(),
    tt.Resize(64),  # scales the shorter edge to 64 px
    tt.ToTensor(),
    tt.Normalize((0.6528, 0.4833, 0.4052), (0.2362, 0.2079, 0.1988)),
]))
train_dl = DataLoader(train_ds, BATCH_SIZE, shuffle=True, num_workers=2, pin_memory=True)
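For reference, one batch drawn from this loader should have shape [120, 3, 64, 64] — assuming the source images are square, since Resize(64) with an int only fixes the shorter edge:

# Quick sanity check of one training batch (illustrative, not in the original post)
images, _ = next(iter(train_dl))
print(images.shape)  # expected: torch.Size([120, 3, 64, 64]) for square inputs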
#Generator
G = nn.Sequential(
    # Input: LATENT_SIZE x 1 x 1
    nn.ConvTranspose2d(LATENT_SIZE, 800, kernel_size=4, stride=2, padding=1, bias=False),  # 800 x 2 x 2
    nn.BatchNorm2d(800),
    nn.ReLU(),
    nn.ConvTranspose2d(800, 400, kernel_size=4, stride=2, padding=1, bias=False),  # 400 x 4 x 4
    nn.BatchNorm2d(400),
    nn.ReLU(),
    nn.ConvTranspose2d(400, 200, kernel_size=4, stride=2, padding=1, bias=False),  # 200 x 8 x 8
    nn.BatchNorm2d(200),
    nn.ReLU(),
    nn.ConvTranspose2d(200, 100, kernel_size=4, stride=2, padding=1, bias=False),  # 100 x 16 x 16
    nn.BatchNorm2d(100),
    nn.ReLU(),
    nn.ConvTranspose2d(100, 50, kernel_size=4, stride=2, padding=1, bias=False),  # 50 x 32 x 32
    nn.BatchNorm2d(50),
    nn.ReLU(),
    nn.ConvTranspose2d(50, 3, kernel_size=4, stride=2, padding=1, bias=False),  # 3 x 64 x 64
    nn.Tanh()
)
G = G.to(device)
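A one-line shape check (my addition, not in the original post) confirms the per-layer comments above — each stride-2, kernel-4, padding-1 transposed convolution doubles the spatial size, so 1 -> 2 -> 4 -> 8 -> 16 -> 32 -> 64:

with torch.no_grad():
    print(G(torch.randn(1, LATENT_SIZE, 1, 1, device=device)).shape)  # torch.Size([1, 3, 64, 64])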
#Generator Training Function
# D, loss_fn, real_labels, and g_opt are defined elsewhere in the full script.
def g_fit():
    g_opt.zero_grad()
    # A fresh noise batch is sampled on every call
    rand_g = torch.randn(BATCH_SIZE, LATENT_SIZE, 1, 1, dtype=torch.float32).to(device)
    fake_images = G(rand_g)
    # The generator is trained to make D classify its outputs as real
    g_loss = loss_fn(D(fake_images), real_labels)
    g_loss.backward()
    g_opt.step()
    return g_loss
#Saving Images
from torchvision.utils import save_image

# NOTE: this noise batch is created once and reused at every epoch, so the
# saved grids intentionally track the same 24 latent vectors over time.
gen = torch.randn(24, LATENT_SIZE, 1, 1).to(device)

def save_fake_images(epoch):
    with torch.no_grad():
        fake_images = G(gen)
    name = "./drive/MyDrive/gans_data/fake_n_images_" + str(epoch) + ".png"
    # Denormalizing images back toward [0, 1] using the dataset statistics
    for i in range(len(fake_images)):
        fake_images[i][0] = (fake_images[i][0] * 0.2362) + 0.6528
        fake_images[i][1] = (fake_images[i][1] * 0.2079) + 0.4833
        fake_images[i][2] = (fake_images[i][2] * 0.1988) + 0.4052
    save_image(fake_images, name, nrow=6)
# this function is called at the end of every training epoch to save a small batch of images
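For context, a typical epoch loop around these helpers might look like the following sketch (NUM_EPOCHS and d_fit are hypothetical names for pieces not shown in the post; d_fit stands in for the discriminator training step):

for epoch in range(NUM_EPOCHS):
    for real_images, _ in train_dl:
        real_images = real_images.to(device)
        d_loss = d_fit(real_images)  # hypothetical discriminator step
        g_loss = g_fit()
    save_fake_images(epoch)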
This is a common problem in GAN training called mode collapse. Take a look at these resources on how to address it:
https://developers.google.com/machine-learning/gan/problems#mode-collapse
https://machinelearningmastery.com/practical-guide-to-gan-failure-modes/
In my personal experience, training GANs is very fickle: sometimes something works on one training run and then fails on the next, or a tiny change to some parameter has a big effect on the generator's performance. Keep trying new things and see how the results turn out.
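As one concrete example of the tricks those articles discuss, one-sided label smoothing on the discriminator's real labels is a common, low-effort mitigation. A sketch, assuming the discriminator is trained with a BCE-style loss_fn as in the question's setup (the label tensor shape must match your D's output, which isn't shown in the post):

# One-sided label smoothing: train D against 0.9 instead of 1.0 for real
# samples, which keeps D from becoming overconfident and can reduce collapse.
real_labels_smooth = torch.full((BATCH_SIZE, 1), 0.9, device=device)  # shape is an assumption
# ...then inside the discriminator step (hypothetical, mirroring g_fit):
# d_loss_real = loss_fn(D(real_images), real_labels_smooth)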