Using thrust::reduce to compute the sum of a vector of 8-bit integers without overflow

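I have a device vector of type uint8_t, and I would like to compute its sum using thrust::reduce if possible. The problem is that I get overflow, since the sum will be much larger than 255. I expected the code below to compute the sum by storing the result as a 32-bit integer, but that does not appear to be the case. Is there a good way to accomplish this?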

uint8_t * flags_d;
...
const int32_t N_CMP_BLOCKS = thrust::reduce(
    thrust::device_pointer_cast( flags_d ),
    thrust::device_pointer_cast( flags_d ) + N,
    (int32_t) 0,
    thrust::plus<int32_t>() );

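I think the only solution that will work is to use thrust::transform_reduce to explicitly convert the 8-bit input data to a 32-bit quantity before the accumulation step of the reduction. I would expect something like this to work: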

#include <iostream>
#include <thrust/transform_reduce.h>
#include <thrust/functional.h>
#include <thrust/execution_policy.h>

// Unary functor that widens each element from T1 to T2
// before it reaches the reduction operator.
template<typename T1, typename T2>
struct char2int
{
    __host__ __device__ T2 operator()(const T1 &x) const
    {
        return static_cast<T2>(x);
    }
};

int main()
{
    unsigned char data[6] = {128, 100, 200, 102, 101, 123};

    // Convert each 8-bit value to int, then sum with an int accumulator.
    int result = thrust::transform_reduce(thrust::host,
                                          data, data + 6,
                                          char2int<unsigned char,int>(),
                                          0,
                                          thrust::plus<int>());

    std::cout << "Result is " << result << std::endl;

    return 0;
}

This should be closer to what you had in mind.
