I wrote a kernel:
__kernel void square(
    __global uchar* input,
    __global uchar* output,
    const unsigned int count)
{
    int i = get_global_id(0);
    if(i < count)
        output[i] = input[i] * input[i];
}
My program's output is listed below: first the input matrix, then the output matrix. I see that every work item evaluates to val*val % 256. Why?
Found 1 platform(s).
OutData from Write and Read:
0
[91, 2, 79, 179, 52, 205, 236, 8;
181, 239, 26, 248, 207, 218, 45, 183;
158, 101, 102, 18, 118, 68, 210, 139;
198, 207, 211, 181, 162, 197, 191, 196;
40, 7, 243, 230, 45, 6, 48, 173;
242, 125, 175, 90, 63, 90, 22, 112;
221, 167, 224, 113, 208, 123, 214, 35;
229, 6, 143, 138, 98, 81, 118, 187;
167, 140, 218, 178, 23, 43, 133, 154;
150, 76, 101, 8, 38, 238, 84, 47]
[89, 4, 97, 41, 144, 41, 144, 64;
249, 33, 164, 64, 97, 164, 233, 209;
132, 217, 164, 68, 100, 16, 68, 121;
36, 97, 233, 249, 132, 153, 129, 16;
64, 49, 169, 164, 233, 36, 0, 233;
196, 9, 161, 164, 129, 164, 228, 0;
201, 241, 0, 225, 0, 25, 228, 201;
217, 36, 225, 100, 132, 161, 100, 153;
241, 144, 164, 196, 17, 57, 25, 164;
228, 144, 217, 64, 164, 68, 144, 161]
OutData from Write and Read:
8913
Succed
Press any key to continue . . .
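The answer is that the result overflows the uchar type on the GPU. A uchar is 8 bits wide and can only hold 0 to 255, so a value of 256 wraps around to 0 and the cycle starts again. The product input[i] * input[i] can be as large as 255 * 255 = 65025, and only its low 8 bits survive the store into the uchar output buffer, which is exactly val*val % 256. For example, 91 * 91 = 8281 and 8281 % 256 = 89, the first value in the output matrix.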
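The wrap-around can be reproduced on the host with plain C. This is only a minimal sketch, not the original program: the typedefs stand in for OpenCL C's uchar and ushort, and the function names square_narrow, square_wide and check_samples are made up for illustration. It assumes a typical platform where unsigned char is 8 bits, unsigned short is 16 bits and int is 32 bits.

```c
#include <assert.h>

typedef unsigned char  uchar;   /* stand-in for OpenCL C's 8-bit uchar (assumption) */
typedef unsigned short ushort;  /* stand-in for OpenCL C's 16-bit ushort (assumption) */

/* Same arithmetic as the kernel body: both operands are promoted to int,
   multiplied, and the product is cut down to 8 bits by the uchar result. */
static uchar square_narrow(uchar v)
{
    return (uchar)(v * v);
}

/* Hypothetical fix: a 16-bit result holds every square up to 255 * 255 = 65025. */
static ushort square_wide(uchar v)
{
    return (ushort)(v * v);
}

/* Checks the first three input/output pairs listed in the question. */
static void check_samples(void)
{
    assert(square_narrow(91) == 89);    /* 8281 % 256 */
    assert(square_narrow(2)  == 4);     /* 4 fits in 8 bits, so it is unchanged */
    assert(square_narrow(79) == 97);    /* 6241 % 256 */
    assert(square_wide(91)   == 8281);  /* full value preserved in 16 bits */
}
```

If the full squares are needed, the matching change in the kernel would presumably be to declare output as __global ushort* and to allocate the output buffer on the host with two bytes per element. That is a suggestion based on the value range, not something the original post shows.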