Metal data structure of MTLTexture arrays for Core ML custom layers

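I am confused about the MTLTexture arrays used by Core ML custom layers. In my mlmodel, the custom layer's input MTLTexture has 32 channels and its output has 8 channels. The MTLTexture data type is 16-bit float, i.e. half. So the input texture_array consists of 8 slices and the output consists of 2 slices.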

func encode(commandBuffer: MTLCommandBuffer, inputs: [MTLTexture], outputs: [MTLTexture]) throws {
    print(#function, inputs.count, outputs.count)
    if let encoder = commandBuffer.makeComputeCommandEncoder() {
        for i in 0..<inputs.count {
            encoder.setTexture(inputs[i], index: 0)
            encoder.setTexture(outputs[i], index: 1)
            // dispatch(pipeline:texture:) is my own helper, not a standard Metal method
            encoder.dispatch(pipeline: psPipeline, texture: inputs[i])
        }
        // End encoding once, after all dispatches have been encoded
        encoder.endEncoding()
    }
}

In my compute kernel function:

kernel void pixelshuffle(
    texture2d_array<half, access::read> inTexture [[texture(0)]],
    texture2d_array<half, access::write> outTexture [[texture(1)]],
    ushort3 gid [[thread_position_in_grid]])
{
    // Skip threads that fall outside the texture or past the last slice
    if (gid.x >= inTexture.get_width() || gid.y >= inTexture.get_height()
        || gid.z >= inTexture.get_array_size()) {
        return;
    }
    // Read one RGBA texel (four channels) from slice gid.z
    const half4 src = half4(inTexture.read(gid.xy, gid.z));
    // do other things
}

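If the input and output texture arrays are laid out as [C][H][W], then for gid = (0,0,0), which channels does src.rgba come from, and what are the coordinates of r, g, b and a within those channels?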

Is it src.r [0][0][0], src.g [1][0][0], src.b [2][0][0], src.a [3][0][0]? Or is it src.r [0][0][0], src.g [0][0][1], src.b [0][0][2], src.a [0][0][3]?

Also, how can I get the raw data of the input texture inside the encode function and print it?

In the compute kernel, src contains the RGBA values of a single pixel in the texture, and each value is a 16-bit float.

The texture's width corresponds to W, its height to H, and its slices to C, where each slice holds 4 channels.

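So the number of slices in the texture is C/4 rounded up, that is floor((C + 3)/4), and gid.z runs from 0 up to (but not including) that value.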
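The slice count can be checked on the CPU with a small sketch. The function name sliceCount(forChannels:) is made up for illustration and is not part of Core ML or Metal; it only restates the floor((C + 3)/4) formula above.

```swift
// Number of RGBA slices needed to hold `channels` feature channels,
// with 4 channels packed per slice (rounding up).
// Hypothetical helper, not a Core ML or Metal API.
func sliceCount(forChannels channels: Int) -> Int {
    precondition(channels >= 0, "channel count must be non-negative")
    return (channels + 3) / 4   // integer division, so this is floor((C + 3) / 4)
}

print(sliceCount(forChannels: 32)) // the 32-channel input from the question
print(sliceCount(forChannels: 8))  // the 8-channel output from the question
print(sliceCount(forChannels: 5))  // not a multiple of 4: the last slice is only partly used
```

For the sizes in the question this gives 8 slices for the input and 2 for the output, which matches what you observed.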

(Although this also depends on what your encoder.dispatch(pipeline:, texture:) function does, since that does not appear to be a standard method of MTLComputeCommandEncoder.)
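I cannot see what your dispatch helper does, so the following is only a guess at one common scheme: launch enough threadgroups to cover the whole texture in x and y, and one layer of threadgroups per slice in z. The GridSize type and the threadgroupCount function are hypothetical stand-ins written in plain Swift so the arithmetic can be run without Metal; in real code the result would go into an MTLSize.

```swift
// Hypothetical stand-in for MTLSize so the sketch runs without Metal.
struct GridSize: Equatable {
    let width: Int
    let height: Int
    let depth: Int
}

// Number of threadgroups needed to cover a textureWidth x textureHeight texture
// with `slices` array slices, when each threadgroup is groupWidth x groupHeight x 1.
// This is an assumed scheme, not necessarily what the asker's helper does.
func threadgroupCount(textureWidth: Int, textureHeight: Int, slices: Int,
                      groupWidth: Int, groupHeight: Int) -> GridSize {
    precondition(groupWidth > 0 && groupHeight > 0, "threadgroup size must be positive")
    return GridSize(width: (textureWidth + groupWidth - 1) / groupWidth,    // round up in x
                    height: (textureHeight + groupHeight - 1) / groupHeight, // round up in y
                    depth: slices)                                           // one layer per slice
}

// Example: a 100 x 50 texture with 8 slices and 32 x 8 threadgroups.
let groups = threadgroupCount(textureWidth: 100, textureHeight: 50, slices: 8,
                              groupWidth: 32, groupHeight: 8)
print(groups.width, groups.height, groups.depth)
```

With rounding up, more threads are launched than there are pixels whenever the texture size is not a multiple of the threadgroup size, which is why the bounds check at the top of the kernel matters. Under this scheme gid.z is the slice index.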

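This means that src.r is the first channel in the slice, .g is the second channel in the slice, .b is the third, and .a is the fourth. The first slice holds channels 0-3, the second slice holds channels 4-7, and so on.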

So your first guess is correct:

src.r [0][0][0], src.g [1][0][0], src.b [2][0][0], src.a [3][0][0]
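To make the mapping concrete, here is a minimal CPU-side sketch of the index arithmetic. The TexelLocation type and both functions are hypothetical names introduced for illustration; they just encode the rule that channel c lives in slice c / 4 at component c % 4.

```swift
// Hypothetical helper type: where a feature channel lives in the texture array.
struct TexelLocation: Equatable {
    let slice: Int      // array slice index (gid.z in the kernel)
    let component: Int  // 0 = r, 1 = g, 2 = b, 3 = a
}

// Channel index in [C][H][W] order -> slice and RGBA component.
func location(ofChannel c: Int) -> TexelLocation {
    precondition(c >= 0, "channel index must be non-negative")
    return TexelLocation(slice: c / 4, component: c % 4)
}

// Inverse mapping: slice and RGBA component -> channel index.
func channel(slice: Int, component: Int) -> Int {
    precondition(slice >= 0 && (0..<4).contains(component), "invalid slice or component")
    return slice * 4 + component
}

let componentNames = ["r", "g", "b", "a"]
for c in [0, 1, 2, 3, 4, 31] {
    let loc = location(ofChannel: c)
    print("channel \(c) -> slice \(loc.slice), src.\(componentNames[loc.component])")
}
```

So for gid = (0, 0, 0) the four components of src are channels 0 through 3 at pixel (0, 0), and the same pixel read from slice 1 gives channels 4 through 7.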

Also note that I wrote a blog post about custom kernels in Core ML that may be useful: http://machinethink.net/blog/coreml-custom-layers/
