Most efficient way to run predictions on large data?

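I'm trying to write a prediction function that runs over a large text dataset, so it has to work in batches. The function is a bit slow, though, so I'd like to know what I can do to speed it up.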

Here is what I have so far:

    import torch
    from tqdm import tqdm

    def get_embeddings(model, data_loader, device):

        model.eval()  # eval mode

        with torch.no_grad():
            embeddings = torch.tensor([], dtype=torch.float64, device=device)  # initialize empty tensor
            for _, d in enumerate(tqdm(data_loader), 0):
                # model inputs
                input_ids = d["input_ids"].to(device)
                attention_mask = d["attention_mask"].to(device)
                # model outputs
                embeddings = torch.cat((embeddings, model.predict(
                    input_ids,
                    attention_mask=attention_mask
                )))  # concat predictions
        return embeddings.cpu().numpy()  # convert to numpy array

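I figured that moving data from the GPU to the CPU takes time, so I decided to initialize an empty tensor, concatenate every batch of predictions onto it, and convert the result to a numpy array only at the very end. But I'm not sure whether the tensor concatenation actually makes things slower.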

So I'd like to know whether there is a better approach, or an established best practice, for running predictions.

A few things in your code caught my attention:

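You are using float64. GPUs are much slower with float64 than with float32 or float16.
You may be forcing a conversion from float32 to float64, which can also be slow.
You are copying the attention mask to the GPU for every input. Instead, create it on the GPU and/or keep it there.
Make sure all operations are done in batches, so that you make full use of the GPU's parallelism.
The concatenation is probably useless: GPU-CPU transfers are mostly bandwidth limited, so it doesn't matter much whether you transfer one big tensor or many small ones (up to a point). Secondly, the allocations (caused by cat) can be slow.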
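As a rough sketch of how these points could fit together (not a benchmarked or definitive version): the function below keeps the result in float32, moves each batch's output to the CPU as soon as it is computed, and calls torch.cat only once at the end. It assumes, as in the question, that the model exposes predict(input_ids, attention_mask=...) and that each batch is a dict with "input_ids" and "attention_mask". ToyModel is a hypothetical stand-in added only so the example runs on its own.

```python
import torch


def get_embeddings(model, data_loader, device):
    """Batched inference without float64 and without a torch.cat per batch."""
    model.eval()
    chunks = []  # per-batch outputs, already moved to the CPU
    with torch.no_grad():
        for batch in data_loader:
            input_ids = batch["input_ids"].to(device)
            attention_mask = batch["attention_mask"].to(device)
            out = model.predict(input_ids, attention_mask=attention_mask)
            # Cast to float32 (a no-op if already float32) and move this batch off the GPU.
            chunks.append(out.float().cpu())
    if not chunks:  # empty loader: torch.cat would raise on an empty list
        return torch.empty(0).numpy()
    return torch.cat(chunks).numpy()  # a single concatenation at the end


class ToyModel(torch.nn.Module):
    """Hypothetical stand-in with the same predict(...) call as in the question."""

    def __init__(self, vocab_size=100, dim=8):
        super().__init__()
        self.emb = torch.nn.Embedding(vocab_size, dim)

    def predict(self, input_ids, attention_mask=None):
        vecs = self.emb(input_ids)  # (batch, seq, dim)
        mask = attention_mask.unsqueeze(-1).to(vecs.dtype)  # (batch, seq, 1)
        # Mean over the non-masked token positions.
        return (vecs * mask).sum(1) / mask.sum(1).clamp(min=1)


if __name__ == "__main__":
    # Any iterable of dict batches works here; a DataLoader is the usual choice.
    batches = [
        {
            "input_ids": torch.randint(0, 100, (4, 5)),
            "attention_mask": torch.ones(4, 5, dtype=torch.long),
        }
        for _ in range(3)
    ]
    result = get_embeddings(ToyModel(), batches, torch.device("cpu"))
    print(result.shape, result.dtype)
```

Whether the per-batch transfer or a single transfer at the end is faster will depend on your batch sizes and hardware, so it is worth timing both on your own data.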
