Specifying the thread ID in a CUDA kernel in 3D



I am working on converting 2D code to a 3D implementation.

In 2D I have the following:

int row_number = blockIdx.y * blockDim.y + threadIdx.y;
int column_number = blockIdx.x * blockDim.x + threadIdx.x;
int threadId = row_number * grid_dimension + column_number;

I want to make this work in 3D:

int row_number = 
int column_number =
int depth_number = 
int threadId = row_number * grid_dimension + column_number + depth_number * grid_dimension * grid_dimension 

My first attempt was:

int row_number = blockIdx.y * blockDim.y + threadIdx.y;
int column_number = blockIdx.x * blockDim.x + threadIdx.x;
int depth_number = blockIdx.z * blockDim.z + threadIdx.z;
int threadId = row_number * grid_dimension + column_number + depth_number * grid_dimension * grid_dimension;

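Is my expression for threadId correct in 3D? If not, how do I get the row, column, and depth numbers in 3D? I have seen expressions that compute blockId and threadId directly, but that is not what I am looking for. If this is not the problem, I probably have other issues to investigate.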

Thanks.

You haven't defined what grid_dimension means.

These formulas are certainly fine for getting the row/column/depth indices:

unsigned int row_number = blockIdx.y * blockDim.y + threadIdx.y;
unsigned int column_number = blockIdx.x * blockDim.x + threadIdx.x;
unsigned int depth_number = blockIdx.z * blockDim.z + threadIdx.z;

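To build a globally unique thread ID from the above variables, use the total number of threads along each axis (gridDim * blockDim) as the stride. Using gridDim.x alone as the row stride produces colliding IDs as soon as blockDim.x is greater than 1. Your formula gives the same result only if grid_dimension equals gridDim.x * blockDim.x and the launch has equal width and height in threads.

unsigned long long width = (unsigned long long)gridDim.x * blockDim.x;  // total threads along x
unsigned long long idx = column_number + width * (row_number + (unsigned long long)gridDim.y * blockDim.y * depth_number);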
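As a rough check of the indexing, here is a minimal sketch of a complete program, assuming the strides are the total thread counts along x and y (gridDim * blockDim). The kernel name mark_ids, the hits buffer, and the 2x3x2 grid of 4x4x4 blocks are hypothetical choices for illustration, not part of the question. Each thread increments the slot at its linear ID, and the host then counts slots that were not hit exactly once.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: every thread increments the slot at its linear ID.
__global__ void mark_ids(unsigned int *hits)
{
    unsigned int column_number = blockIdx.x * blockDim.x + threadIdx.x;
    unsigned int row_number    = blockIdx.y * blockDim.y + threadIdx.y;
    unsigned int depth_number  = blockIdx.z * blockDim.z + threadIdx.z;

    // Assumption: strides are the total thread counts along x and y.
    unsigned long long width  = (unsigned long long)gridDim.x * blockDim.x;
    unsigned long long height = (unsigned long long)gridDim.y * blockDim.y;
    unsigned long long idx = column_number
                           + row_number * width
                           + depth_number * width * height;

    atomicAdd(&hits[idx], 1u);
}

int main()
{
    dim3 block(4, 4, 4);  // 64 threads per block
    dim3 grid(2, 3, 2);   // deliberately non-cubic launch
    size_t total = (size_t)grid.x * block.x
                 * grid.y * block.y
                 * grid.z * block.z;

    unsigned int *d_hits = nullptr;
    if (cudaMalloc(&d_hits, total * sizeof(unsigned int)) != cudaSuccess) return 1;
    cudaMemset(d_hits, 0, total * sizeof(unsigned int));

    mark_ids<<<grid, block>>>(d_hits);
    if (cudaDeviceSynchronize() != cudaSuccess) return 1;

    std::vector<unsigned int> hits(total);
    cudaMemcpy(hits.data(), d_hits, total * sizeof(unsigned int),
               cudaMemcpyDeviceToHost);
    cudaFree(d_hits);

    // If the IDs are unique and dense, every slot holds exactly 1.
    size_t bad = 0;
    for (unsigned int h : hits)
        if (h != 1) ++bad;

    printf("%zu threads, %zu slots not hit exactly once\n", total, bad);
    return bad == 0 ? 0 : 1;
}
```

If the data domain is smaller than the launched grid, a kernel would typically also compare the three indices against the domain extents before using idx; that guard is left out here to keep the sketch focused on the ID arithmetic.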
