Creating an MLP model with PyTorch to predict a user's ratings for unwatched movies



In my project, I am trying to predict a user's rating for a movie they have not seen, based on their ratings of other movies. I am using the MovieLens dataset; the main folder, ml-100k, contains information about 100,000 movie ratings.

Before processing, the main data (the ratings file) contains a user ID, a movie ID, the user's rating from 0 to 5, and a timestamp (not considered in this project). I then split the data into a training set (80%) and a test set (20%) using the sklearn library.
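
For context, here is a minimal sketch of that loading and splitting step, assuming the standard ml-100k layout (u.data is tab-separated: user id, movie id, rating, timestamp); the helper to_matrix and the variable names are illustrative and simply build the users x movies matrices that the autoencoder code further down consumes:

import numpy as np
import pandas as pd
import torch
from sklearn.model_selection import train_test_split

# load the ratings file (tab-separated: user id, movie id, rating, timestamp)
ratings = pd.read_csv('ml-100k/u.data', sep='\t',
                      names=['user_id', 'movie_id', 'rating', 'timestamp'])

# 80% training / 20% test; the timestamp column is ignored in this project
train_df, test_df = train_test_split(ratings, test_size=0.2, random_state=42)

nb_users = ratings['user_id'].max()
nb_movies = ratings['movie_id'].max()

# build a users x movies rating matrix (0 = not rated) for each split,
# which is the per-user input format the autoencoder expects
def to_matrix(df):
    matrix = np.zeros((nb_users, nb_movies), dtype='float32')
    matrix[df['user_id'].values - 1, df['movie_id'].values - 1] = df['rating'].values
    return torch.from_numpy(matrix)

training_set = to_matrix(train_df)
test_set = to_matrix(test_df)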

To build the recommender system I am using a Stacked Autoencoder model, written in PyTorch and run on Google Colab. The project is based on this article: https://towardsdatascience.com/stacked-auto-encoder-as-a-recommendation-system-for-movie-rating-prediction-33842386338

I am new to deep learning and, for research purposes, I would like to compare this model (the Stacked Autoencoder) with another deep learning model, for example a Multilayer Perceptron (MLP). The following code creates the Stacked Autoencoder model and trains it.

### Part 1 : Architecture of the AutoEncoder
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim

# nn.Module is the parent class
# SAE is a child class of nn.Module
class SAE(nn.Module):
    # Architecture
    def __init__(self):
        # self can use all the methods of the class nn.Module
        super(SAE, self).__init__()
        # Fully connected layer n°1: nb_movies inputs and 20 neurons in the first layer
        # (one neuron can loosely correspond to something like the genre of the movie)

        # Encode step
        self.fc1 = nn.Linear(nb_movies, 20)
        # Fully connected layer n°2
        self.fc2 = nn.Linear(20, 10)

        # Decode step
        # Fully connected layer n°3
        self.fc3 = nn.Linear(10, 20)
        # Fully connected layer n°4
        self.fc4 = nn.Linear(20, nb_movies)
        # Sigmoid activation function
        self.activation = nn.Sigmoid()

    # Forward pass: activation of the neurons
    def forward(self, x):
        x = self.activation(self.fc1(x))
        x = self.activation(self.fc2(x))
        x = self.activation(self.fc3(x))
        # no activation on the last layer, keep it linear
        x = self.fc4(x)
        # x is the vector of predicted ratings
        return x

# Create the AutoEncoder object
sae = SAE()
# MSE loss, imported from torch.nn
criterion = nn.MSELoss()
# RMSprop optimizer (updates the weights), imported from torch.optim
# sae.parameters() are the weights and biases adjusted during training
optimizer = optim.RMSprop(sae.parameters(), lr=0.01, weight_decay=0.5)

### Part 2 : Training of the SAE
# number of epochs
nb_epoch = 200
# epoch loop
for epoch in range(1, nb_epoch + 1):
    # the running loss starts at zero at the beginning of each epoch
    s = 0.
    train_loss = 0
    # users loop
    for id_user in range(nb_users):
        # add one dimension with unsqueeze(0) so the input is a 2D tensor (batch of size 1)
        input = training_set[id_user].unsqueeze(0)

        # clone the input to obtain the target
        target = input.clone()

        # only train on users who rated at least one movie (ratings > 0)
        if torch.sum(target.data > 0) > 0:
            output = sae(input)
            # don't compute gradients with respect to the target
            target.requires_grad = False
            # only deal with true ratings: zero out predictions for unrated movies
            output[target == 0] = 0

            # loss criterion
            loss = criterion(output, target)

            # average the error over the movies that do have ratings
            mean_corrector = nb_movies / float(torch.sum(target.data > 0) + 1e-10)

            # direction of the backpropagation
            loss.backward()
            train_loss += np.sqrt(loss.item() * mean_corrector)
            s += 1.

            # intensity of the backpropagation
            optimizer.step()
            # reset the gradients before the next user
            optimizer.zero_grad()

    print('epoch: ' + str(epoch) + ' loss: ' + str(train_loss / s))
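
Since the question mentions an 80/20 split, here is a minimal sketch of how the held-out 20% could be evaluated with the same masking logic as the training loop; it assumes test_set is the users x movies rating matrix built from the test split (see the loading sketch above):

# Evaluation on the held-out test split (no gradient updates)
test_loss = 0
s = 0.
with torch.no_grad():
    for id_user in range(nb_users):
        # predict from the user's training ratings, score against the test ratings
        input = training_set[id_user].unsqueeze(0)
        target = test_set[id_user].unsqueeze(0)
        if torch.sum(target > 0) > 0:
            output = sae(input)
            output[target == 0] = 0
            loss = criterion(output, target)
            mean_corrector = nb_movies / float(torch.sum(target > 0) + 1e-10)
            test_loss += np.sqrt(loss.item() * mean_corrector)
            s += 1.
print('test loss: ' + str(test_loss / s))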

If I want to train an MLP model instead, how should I implement that model class? And what other deep learning models (besides an MLP) could I compare with the Stacked Autoencoder?

Thank you.

An MLP on its own is not suitable for recommendation. If you want to go down that route, you need to create an embedding for your userid and itemid, then add linear layers on top of the embeddings. Your target will be to predict the rating for a userid-itemid pair, as sketched below.
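
A minimal sketch of that idea, assuming nb_users and nb_movies are known and that each training example is a (user index, movie index, rating) triple; the class name, layer sizes and optimizer choice are illustrative, not prescriptive:

import torch
import torch.nn as nn

class MLP(nn.Module):
    def __init__(self, nb_users, nb_movies, emb_dim=32):
        super(MLP, self).__init__()
        # one learned embedding vector per user and per movie
        self.user_emb = nn.Embedding(nb_users, emb_dim)
        self.movie_emb = nn.Embedding(nb_movies, emb_dim)
        # linear layers on top of the concatenated embeddings
        self.layers = nn.Sequential(
            nn.Linear(2 * emb_dim, 64),
            nn.ReLU(),
            nn.Linear(64, 32),
            nn.ReLU(),
            nn.Linear(32, 1)  # predicted rating for the (user, movie) pair
        )

    def forward(self, user_ids, movie_ids):
        # user_ids and movie_ids are 0-based LongTensors (e.g. user_id - 1)
        x = torch.cat([self.user_emb(user_ids), self.movie_emb(movie_ids)], dim=1)
        return self.layers(x).squeeze(1)

model = MLP(nb_users, nb_movies)
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

Training would then iterate over batches of (user, movie, rating) triples from the 80% split and minimize the MSE between predicted and true ratings, which keeps the comparison with the SAE's masked reconstruction error straightforward.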

I would suggest you look at variational autoencoders (VAE). They give state-of-the-art results for recommender systems, and they would also be a fair comparison with your stacked autoencoder. Here is the research paper applying VAEs to collaborative filtering: https://arxiv.org/pdf/1802.05814.pdf
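
As a rough illustration of that direction, here is a much-simplified VAE sketch over a user's rating vector. The paper above uses a multinomial likelihood, input dropout and KL annealing, which are omitted here; a masked MSE is substituted so it stays comparable with the SAE code above, and all layer sizes are illustrative:

import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, nb_movies, hidden_dim=200, latent_dim=50):
        super(VAE, self).__init__()
        # encoder maps a user's rating vector to a latent Gaussian
        self.encoder = nn.Linear(nb_movies, hidden_dim)
        self.mu = nn.Linear(hidden_dim, latent_dim)
        self.logvar = nn.Linear(hidden_dim, latent_dim)
        # decoder maps a latent sample back to predicted ratings
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, nb_movies)
        )

    def forward(self, x):
        h = torch.tanh(self.encoder(x))
        mu, logvar = self.mu(h), self.logvar(h)
        # reparameterization trick: z = mu + sigma * eps
        std = torch.exp(0.5 * logvar)
        z = mu + std * torch.randn_like(std)
        return self.decoder(z), mu, logvar

def vae_loss(recon, x, mu, logvar):
    # masked reconstruction term plus KL divergence to the standard normal prior
    recon_loss = F.mse_loss(recon[x > 0], x[x > 0])
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return recon_loss + kl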
