Compiling a .cu file with dynamic parallelism (CUDA)

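I switched to a new GPU, a GeForce GTX 980 with cc 5.2, so it must support dynamic parallelism. However, I can't compile even simple code (from the programming guide). I won't post it here (there's no need, it's just one global kernel calling another global kernel).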

1) I use VS2013 for coding. In property pages -> CUDA C/C++ -> Device, I changed the Code Generation property to compute_35,sm_35, and this is the output:

1>------ Build started: Project: testCublas3, Configuration: Debug Win32 ------
1>  Compiling CUDA source file kernel.cu...
1>  
1>  C:\programs\misha\cuda\Projects\test projects\testCublas3\testCublas3>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\bin\nvcc.exe" -gencode=arch=compute_35,code="sm_35,compute_35" --use-local-env --cl-version 2013 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin"  -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include"  -G   --keep-dir Debug -maxrregcount=0  --machine 32 --compile -cudart static  -g   -DWIN32 -D_DEBUG -D_CONSOLE -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /Zi /RTC1 /MDd  " -o Debug\kernel.cu.obj "C:\programs\misha\cuda\Projects\test projects\testCublas3\testCublas3\kernel.cu" 
1>C:/programs/misha/cuda/Projects/test projects/testCublas3/testCublas3/kernel.cu(13): error : kernel launch from __device__ or __global__ functions requires separate compilation mode
1>  kernel.cu
1>C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V120\BuildCustomizations\CUDA 6.5.targets(593,9): error MSB3721: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\bin\nvcc.exe" -gencode=arch=compute_35,code="sm_35,compute_35" --use-local-env --cl-version 2013 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin"  -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\include"  -G   --keep-dir Debug -maxrregcount=0  --machine 32 --compile -cudart static  -g   -DWIN32 -D_DEBUG -D_CONSOLE -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /Zi /RTC1 /MDd  " -o Debug\kernel.cu.obj "C:\programs\misha\cuda\Projects\test projects\testCublas3\testCublas3\kernel.cu"" exited with code 2.

I suppose I need another option for this compilation, -rdc=true, but I can't find where to set it in VS2013.

2) When I set the Code Generation property to compute_52,sm_52, I get the error: Unsupported gpu architecture 'compute_52'. But my cc is 5.2. So can I only compile code for cc 3.5 at most?

Thanks

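Regarding item 1, CUDA dynamic parallelism requires separate compilation and linking (-rdc=true), as well as linking against the device cudart library (-lcudadevrt). Dynamic parallelism that also uses CUBLAS additionally requires linking against the device CUBLAS library (-lcublas_device). Probably the easiest way to see where all of these are set in a Visual Studio project is to start by looking at the Visual Studio project for the device cublas sample.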
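For illustration, here is a minimal sketch of a parent kernel launching a child kernel, with the command-line build flags from above in the leading comment. The file name `dynpar.cu`, the kernel names `parent` and `child`, and the helper `run_parent` are made up for this example; they are not from the question's code.

```cuda
// Command-line build (file name is hypothetical):
//   nvcc -arch=sm_35 -rdc=true dynpar.cu -o dynpar -lcudadevrt
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Child kernel: each thread writes twice its global index.
__global__ void child(int *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = 2 * i;
}

// Parent kernel: a single thread launches the child grid from the device.
// This device-side launch is what requires -rdc=true and cudadevrt.
__global__ void parent(int *out, int n)
{
    if (blockIdx.x == 0 && threadIdx.x == 0) {
        child<<<(n + 63) / 64, 64>>>(out, n);
    }
}

// Host helper: runs the parent kernel and returns the sum of the results,
// or -1 if any CUDA call fails (for example, no device with cc >= 3.5).
long long run_parent(int n)
{
    int *d_out = NULL;
    if (cudaMalloc(&d_out, n * sizeof(int)) != cudaSuccess) return -1;

    parent<<<1, 1>>>(d_out, n);
    // The parent grid is not complete until its child grids have finished,
    // so one host-side synchronize covers both.
    if (cudaGetLastError() != cudaSuccess ||
        cudaDeviceSynchronize() != cudaSuccess) {
        cudaFree(d_out);
        return -1;
    }

    std::vector<int> h_out(n);
    cudaError_t err = cudaMemcpy(h_out.data(), d_out, n * sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    if (err != cudaSuccess) return -1;

    long long sum = 0;
    for (int i = 0; i < n; ++i) sum += h_out[i];
    return sum;
}

int main()
{
    long long sum = run_parent(256);
    if (sum < 0) {
        printf("CUDA error (is a cc >= 3.5 device present?)\n");
        return 1;
    }
    printf("sum = %lld\n", sum);
    return 0;
}
```

In Visual Studio the same settings should be reachable, as far as I know, via CUDA C/C++ -> Common -> Generate Relocatable Device Code (Yes) and by adding cudadevrt.lib to the linker's additional dependencies; treat those property names as something to verify against the sample project rather than as definitive.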

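Regarding item 2, your GTX 980 with compute capability 5.2 is not recognized because you need the latest update of the CUDA 6.5 toolkit, which is available for download from NVIDIA.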
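If it helps to confirm what the device and the installed runtime actually report, a small query program along these lines can be used. It is only a sketch: the helper name `supports_dynamic_parallelism` is made up here, and it checks the device side only. Whether nvcc accepts compute_52 depends on the toolkit build, which you can check separately with `nvcc --version`.

```cuda
// Command-line build (file name is hypothetical):
//   nvcc query.cu -o query
#include <cstdio>
#include <cuda_runtime.h>

// Dynamic parallelism needs compute capability 3.5 or higher.
// (Hypothetical helper name, introduced for this example.)
bool supports_dynamic_parallelism(int major, int minor)
{
    return major > 3 || (major == 3 && minor >= 5);
}

int main()
{
    // Runtime version is encoded as 1000 * major + 10 * minor (6050 for 6.5).
    int runtimeVersion = 0;
    if (cudaRuntimeGetVersion(&runtimeVersion) != cudaSuccess) {
        printf("could not query the CUDA runtime version\n");
        return 1;
    }
    printf("CUDA runtime version: %d.%d\n",
           runtimeVersion / 1000, (runtimeVersion % 1000) / 10);

    // Query device 0; adjust the index on multi-GPU machines.
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
        printf("could not query device 0\n");
        return 1;
    }
    printf("%s: compute capability %d.%d\n", prop.name, prop.major, prop.minor);
    printf("dynamic parallelism supported by device: %s\n",
           supports_dynamic_parallelism(prop.major, prop.minor) ? "yes" : "no");
    return 0;
}
```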

(Note that the cublas_device functionality has been removed from the latest versions of CUDA.)
