I am using nppiBGRToYCbCr420_8u_C3P3R to convert an RGB image to YUV420. The function's parameters are as follows:
NppStatus nppiBGRToYCbCr420_8u_C3P3R(const Npp8u *pSrc, int nSrcStep, Npp8u *pDst[3], int rDstStep[3], NppiSize oSizeROI)
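I want to copy d_array[0] to host_array so that I can imshow the Y-channel image and check it, but nppiBGRToYCbCr420_8u_C3P3R returns the error NPP_STEP_ERROR (the pitch is 921600 and BGR.step is 4096; in OpenCV the image step is a power of two). I hope someone can help me.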
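There are two main problems here:
- nppiBGRToYCbCr420_8u_C3P3R converts a BGR image with interleaved BGR pixel values into one Y image, one Cb image and one Cr image. The output is written to three separate planes, hence the "P3" in "C3P3", so the function needs three destination pointers and three destination steps rather than one buffer with one step.
- Because of the 420 encoding, the colour information is subsampled. The Cb and Cr planes are therefore only half the width and half the height of the original image.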
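To make the plane sizes concrete, here is a minimal sketch in plain C++ (no NPP required) that computes the dimensions of the three planes for a tightly packed 4:2:0 image. The names PlaneSize, YCbCr420Layout and ycbcr420Layout are hypothetical helpers invented for this illustration, and the sketch assumes even image dimensions and ignores any row padding a device allocator may add.

```cpp
#include <cassert>
#include <cstddef>

// Dimensions of one 8-bit image plane (one byte per sample, no row padding).
struct PlaneSize {
    int width;
    int height;
    std::size_t bytes() const {
        return static_cast<std::size_t>(width) * static_cast<std::size_t>(height);
    }
};

// Hypothetical helper type: the three planes of a planar YCbCr 4:2:0 image.
struct YCbCr420Layout {
    PlaneSize y;
    PlaneSize cb;
    PlaneSize cr;
    std::size_t totalBytes() const { return y.bytes() + cb.bytes() + cr.bytes(); }
};

// Full-resolution luma plane, chroma planes halved in both directions.
// Assumption: width and height are positive and even.
inline YCbCr420Layout ycbcr420Layout(int width, int height) {
    assert(width > 0 && height > 0);
    assert(width % 2 == 0 && height % 2 == 0);
    PlaneSize luma{width, height};
    PlaneSize chroma{width / 2, height / 2};
    return YCbCr420Layout{luma, chroma, chroma};
}
```

For example, ycbcr420Layout(640, 480) describes a 640x480 Y plane and two 320x240 chroma planes, so the packed output needs 1.5 bytes per pixel instead of the 3 bytes per pixel of the interleaved BGR input.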
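Using nppiMalloc_8u_C1 to allocate the device output images gives something like this (error checking omitted for simplicity, and written in the browser without being tested):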
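// Assumes <nppi.h>, <cuda_runtime.h> and the OpenCV headers are included,
// and that Mat, GpuMat and imread are visible via using-declarations.
Mat temp = imread("1.jpg", 1);   // load as a 3-channel BGR image
GpuMat BGR(temp);                // upload the interleaved BGR image to the device

// Host buffer for the tightly packed Y plane (one byte per pixel).
unsigned char *host_array = (unsigned char*)malloc(temp.cols * temp.rows * sizeof(unsigned char));
memset(host_array, 0, temp.cols * temp.rows * sizeof(unsigned char));

// nppiMalloc_8u_C1 reports the row pitch through an int*, not a size_t*.
int pitchY, pitchCB, pitchCR;
Npp8u *d_arrayY  = nppiMalloc_8u_C1(temp.cols,     temp.rows,     &pitchY);
Npp8u *d_arrayCB = nppiMalloc_8u_C1(temp.cols / 2, temp.rows / 2, &pitchCB);  // chroma: half width, half height
Npp8u *d_arrayCR = nppiMalloc_8u_C1(temp.cols / 2, temp.rows / 2, &pitchCR);

int Dstep[3] = {pitchY, pitchCB, pitchCR};
Npp8u *d_ptrs[3] = {d_arrayY, d_arrayCB, d_arrayCR};

NppiSize ds;
ds.width = temp.cols;
ds.height = temp.rows;

// The source step is the GpuMat row pitch in bytes; each destination plane has its own step.
nppiBGRToYCbCr420_8u_C3P3R(BGR.ptr<Npp8u>(), (int)BGR.step, d_ptrs, Dstep, ds);

// Copy only the Y plane back to the host, dropping the device-side row padding.
cudaMemcpy2D(host_array, temp.cols, d_arrayY, pitchY, temp.cols * sizeof(Npp8u), temp.rows, cudaMemcpyDeviceToHost);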
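The final copy works because a pitched copy moves the image row by row and skips the padding at the end of each device row. As a rough host-side illustration of that behaviour (not how the CUDA runtime implements it), the sketch below copies rows between two buffers with different pitches. The function name copyPitchedRows is a hypothetical helper made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Host-only sketch of a 2D pitched copy: for each row, copy widthBytes of
// payload and ignore whatever padding follows it in the source.
// Pitches are the distance in bytes between the starts of consecutive rows.
inline void copyPitchedRows(unsigned char* dst, std::size_t dstPitch,
                            const unsigned char* src, std::size_t srcPitch,
                            std::size_t widthBytes, std::size_t height) {
    // A row's payload can never be wider than the pitch of either buffer.
    assert(widthBytes <= dstPitch && widthBytes <= srcPitch);
    for (std::size_t row = 0; row < height; ++row) {
        std::memcpy(dst + row * dstPitch, src + row * srcPitch, widthBytes);
    }
}
```

In the call above, the destination pitch is temp.cols because host_array is tightly packed, while the source pitch is whatever pitchY the allocator returned, which may be larger than the image width.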