I am trying to build an LSTM on application log data from different users. I have one large DataFrame consisting of the users' stacked application records, e.g. the first 1500 rows belong to user 1, the next 500 rows to user 2, and so on. I am now wondering whether it is possible to train the LSTM in such a way that the weights are updated after each user, which means changing the batch size after every update. To make this clearer: I want the LSTM to first take all 1500 rows of user 1 and update the weights after processing them, then take the 500 rows of user 2 and update the weights after processing those, and so on.
I am building the LSTM with Keras.
Is it possible to do this?
Thanks!
I don't know your exact application scenario, but I assume it is time-series prediction.
Build the LSTM model:
import torch
import torch.nn as nn

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

class LSTM(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, output_size, batch_size):
        super().__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.output_size = output_size
        self.num_directions = 1
        self.batch_size = batch_size
        self.lstm = nn.LSTM(self.input_size, self.hidden_size, self.num_layers, batch_first=True)
        self.linear = nn.Linear(self.hidden_size, self.output_size)

    def forward(self, input_seq):
        # initial hidden and cell states
        h_0 = torch.randn(self.num_directions * self.num_layers, self.batch_size, self.hidden_size).to(device)
        c_0 = torch.randn(self.num_directions * self.num_layers, self.batch_size, self.hidden_size).to(device)
        seq_len = input_seq.shape[1]
        # input(batch_size, seq_len, input_size)
        input_seq = input_seq.view(self.batch_size, seq_len, self.input_size)
        # output(batch_size, seq_len, num_directions * hidden_size)
        output, _ = self.lstm(input_seq, (h_0, c_0))
        output = output.contiguous().view(self.batch_size * seq_len, self.hidden_size)
        pred = self.linear(output)              # (batch_size * seq_len, output_size)
        pred = pred.view(self.batch_size, seq_len, -1)
        pred = pred[:, -1, :]                   # keep only the prediction at the last time step
        return pred
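As a quick sanity check of the expected shapes (the sizes here are made up, not tied to your data), a dummy forward pass confirms the model consumes (batch_size, seq_len, input_size) and returns (batch_size, output_size):

# hypothetical sizes, purely for a shape check
model = LSTM(input_size=3, hidden_size=64, num_layers=1, output_size=1, batch_size=5).to(device)
dummy = torch.randn(5, 30, 3).to(device)   # (batch_size, seq_len, input_size)
print(model(dummy).shape)                  # torch.Size([5, 1])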
You can use a DataLoader to handle the data coming from the different users with different batch sizes, so that you end up with one dataset per user.
Like this:
from torch.utils.data import Dataset, DataLoader

class MyDataset(Dataset):
    def __init__(self, data):
        self.data = data

    def __getitem__(self, item):
        return self.data[item]

    def __len__(self):
        return len(self.data)

# train / test are MyDataset instances built from your own data
Dtr = DataLoader(dataset=train, batch_size=B, shuffle=False, num_workers=0)
Dte = DataLoader(dataset=test, batch_size=B, shuffle=False, num_workers=0)
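The training loop below indexes into Dtrs and batchsizes, one entry per user. One possible way to build them, assuming your stacked DataFrame df has a 'user' column and some make_sequences() helper of your own that turns one user's rows into (seq, label) pairs (both names are placeholders here), is to group by user and give each user a DataLoader whose batch size covers all of that user's samples:

Dtrs, batchsizes = [], []
for user_id, user_df in df.groupby('user', sort=False):
    samples = make_sequences(user_df)   # your preprocessing: list of (seq, label) tensors
    b = len(samples)                    # whole user in one batch -> one weight update per pass
    Dtrs.append(DataLoader(MyDataset(samples), batch_size=b, shuffle=False, num_workers=0))
    batchsizes.append(b)
users = list(range(len(Dtrs)))          # used by the loop below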
Then we start training:
loss_function = nn.MSELoss().to(device)   # assumed loss for a regression-style prediction

for t in range(len(users)):
    # change batch size for this user
    b = batchsizes[t]  # batchsizes stores each user's batch size
    model = LSTM(input_size, hidden_size, num_layers, output_size, batch_size=b).to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # recreated for the new model instance (Adam assumed)
    if t != 0:
        # continue from the weights learned on the previous users
        checkpoint = torch.load(LSTM_PATH)
        model.load_state_dict(checkpoint['model'])
        optimizer.load_state_dict(checkpoint['optimizer'])
    model.train()
    Dtr = Dtrs[t]  # Dtrs stores each user's training DataLoader
    for i in range(epochs):
        cnt = 0
        for (seq, label) in Dtr:
            cnt += 1
            seq = seq.to(device)
            label = label.to(device)
            y_pred = model(seq)
            loss = loss_function(y_pred, label)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            if cnt % 100 == 0:
                print('epoch', i, ':', cnt - 100, '~', cnt, loss.item())
    # save the current user's model (and optimizer) after training
    state = {'model': model.state_dict(), 'optimizer': optimizer.state_dict()}
    torch.save(state, LSTM_PATH)
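Note that forward() reshapes its input with the fixed self.batch_size, so every batch a user's DataLoader yields must have exactly that size. If you want exactly one weight update per user per pass, which is what you describe, the simplest option is to set each user's batch_size to the total number of samples that user contributes, as in the Dtrs sketch above.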
Sorry, the code above will not work as-is, because I don't know what your data looks like, so I am only giving you a general framework.