Preprocessing video for PyTorch on Android



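What is the best way to preprocess video data in Android Kotlin to prepare it for a PyTorch Android model? Specifically, I have a ready-made model in PyTorch that I have already converted to PyTorch Mobile.

During training, the model takes raw footage from a phone and preprocesses it by (1) converting it to grayscale, (2) downscaling it to a specific smaller resolution that I specify, and (3) converting it to a tensor to feed into the neural network (or possibly sending the compressed video to a remote server). I do this with OpenCV, but what is the simplest way to do the same thing in Android Kotlin?

Python code for reference: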


import cv2


def save_video(filename):
    """Read a video and return its frames as downscaled grayscale images."""
    frames = []
    cap = cv2.VideoCapture(filename)
    frame_count = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    # Target size as (width, height), roughly a 9:16 ratio
    width = 121
    height = 216
    dim = (width, height)
    fc = 0
    # Loop until the end of the video or the first failed read
    while fc < frame_count:
        ret, frame = cap.read()
        if not ret:
            break
        # Convert to grayscale (OpenCV decodes frames in BGR order)
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Reduce resolution
        resized = cv2.resize(gray, dim, interpolation=cv2.INTER_AREA)
        frames.append(resized)
        fc += 1
    # Release the video capture object
    cap.release()
    return frames

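You said your model has already been converted to PyTorch Mobile, so I assume you scripted it with TorchScript.

With TorchScript, you can write the preprocessing logic using Torch operations and keep it inside the scripted model, like this: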

import torch
import torch.nn.functional as F

# This method belongs inside the scripted model class, which is assumed
# to define self.device and self.mean_tensor in its __init__.
@torch.jit.script_method
def preprocess(self,
               image: torch.Tensor,  # expected layout: HxWx3
               height: int,
               width: int) -> torch.Tensor:
    img = image.to(self.device)
    # (1) Convert to grayscale by averaging the three channels
    img = ((img[:, :, 0] + img[:, :, 1] + img[:, :, 2]) / 3).unsqueeze(-1)
    # (2) Resize to the specified resolution.
    # F.interpolate expects a float tensor in NxCxHxW layout, so convert
    # to that layout first (similar to torchvision.transforms.ToTensor).
    img = img.float()
    img = img.permute(2, 0, 1).unsqueeze(0)
    img = F.interpolate(img, size=(height, width),
                        mode="bicubic", align_corners=False)
    # Then turn it back into a normal HxWxC image tensor
    img = img.squeeze(0).permute(1, 2, 0)
    # (3) Other normalization, such as mean subtraction, and conversion
    # to BxCxHxW format
    img -= self.mean_tensor  # mean subtraction
    img = img.permute(2, 0, 1).unsqueeze(0)
    return img
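
One detail worth checking when porting from OpenCV: the snippet above averages the three channels with equal weight, whereas OpenCV documents its BGR-to-gray conversion as a weighted sum, Y = 0.299 R + 0.587 G + 0.114 B. If the model was trained on frames produced by cv2.cvtColor, the two can differ noticeably on saturated colors. The following is a minimal, dependency-free sketch comparing both formulas for a single pixel; the function names are illustrative and not part of any library.

```python
def gray_mean(b: float, g: float, r: float) -> float:
    """Equal-weight average of the three channels."""
    return (b + g + r) / 3


def gray_luma(b: float, g: float, r: float) -> float:
    """Weighted luma, Y = 0.299 R + 0.587 G + 0.114 B.

    This is the formula OpenCV documents for RGB/BGR to gray; OpenCV
    additionally rounds the result to the output integer type.
    """
    return 0.299 * r + 0.587 * g + 0.114 * b


if __name__ == "__main__":
    # A pure-blue pixel, given in OpenCV's BGR channel order.
    b, g, r = 255, 0, 0
    print("mean:", gray_mean(b, g, r))
    print("luma:", gray_luma(b, g, r))
```

Unlike the plain average, the weighted version depends on channel order, so a tensor-based port would need to know whether the incoming frame is BGR or RGB.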

This way, all the preprocessing is done by libtorch rather than OpenCV.
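
To make the idea concrete, here is a minimal sketch of how such a method might be bundled into a scripted module and survive a save/load round trip. It is an illustration under assumptions, not the asker's actual model: the class name `FramePreprocessor`, the zero-valued `mean_tensor`, the identity `forward`, and the frame size are all placeholders. It uses `@torch.jit.export` on a regular `torch.nn.Module`, which is the usual way to keep a non-`forward` method when calling `torch.jit.script`.

```python
import io

import torch
import torch.nn.functional as F


class FramePreprocessor(torch.nn.Module):  # hypothetical name
    def __init__(self) -> None:
        super().__init__()
        # Placeholder mean; replace with the value used during training.
        self.register_buffer("mean_tensor", torch.zeros(1))

    @torch.jit.export  # keep this method in the scripted module
    def preprocess(self, image: torch.Tensor, height: int, width: int) -> torch.Tensor:
        # image: HxWx3. Average the channels, then add batch and channel dims.
        img = image.float()
        img = (img[:, :, 0] + img[:, :, 1] + img[:, :, 2]) / 3
        img = img.unsqueeze(0).unsqueeze(0)  # 1x1xHxW
        img = F.interpolate(img, size=[height, width],
                            mode="bicubic", align_corners=False)
        return img - self.mean_tensor

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Stand-in for the real network.
        return x


scripted = torch.jit.script(FramePreprocessor())

# Round-trip through an in-memory buffer to check that the exported
# method is still callable after saving and loading.
buffer = io.BytesIO()
torch.jit.save(scripted, buffer)
buffer.seek(0)
reloaded = torch.jit.load(buffer)

frame = torch.randint(0, 256, (432, 242, 3), dtype=torch.uint8)
out = reloaded.preprocess(frame, 216, 121)
print(out.shape)
```

On the Android side, the PyTorch Android API exposes `Module.runMethod`, which should allow calling `preprocess` by name on the loaded module. If the model was trained on frames resized with `cv2.INTER_AREA`, it may also be worth trying `mode="area"` in `F.interpolate` (without `align_corners`) and comparing outputs, since bicubic resampling does not reproduce area averaging.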
