I'm trying to troubleshoot an error I get when running the simpleP2P sample program included with the CUDA samples. The error is as follows:
$ ./simpleP2P
[./simpleP2P] - Starting...
Checking for multiple GPUs...
CUDA-capable device count: 2
> GPU0 = " Tesla K20c" IS capable of Peer-to-Peer (P2P)
> GPU1 = " Tesla K20c" IS capable of Peer-to-Peer (P2P)
Checking GPU(s) for support of peer to peer memory access...
> Peer-to-Peer (P2P) access from Tesla K20c (GPU0) -> Tesla K20c (GPU1) : No
> Peer-to-Peer (P2P) access from Tesla K20c (GPU1) -> Tesla K20c (GPU0) : No
Two or more GPUs with SM 2.0 or higher capability are required for ./simpleP2P.
Peer to Peer access is not available between GPU0 <-> GPU1, waiving test.
The devices I'm using are:
$ lspci | grep NVIDIA
03:00.0 3D controller: NVIDIA Corporation GK110GL [Tesla K20c] (rev a1)
83:00.0 3D controller: NVIDIA Corporation GK110GL [Tesla K20c] (rev a1)
Additional information about the connection, from nvidia-smi:
$ nvidia-smi topo -m
        GPU0    GPU1    CPU Affinity
GPU0     X      SOC     0-5,12-17
GPU1    SOC      X      6-11,18-23
Legend:
X = Self
SOC = Path traverses a socket-level link (e.g. QPI)
PHB = Path traverses a PCIe host bridge
PXB = Path traverses multiple PCIe internal switches
PIX = Path traverses a PCIe internal switch
And finally, the verbose output of the lspci tool:
03:00.0 3D controller: NVIDIA Corporation GK110GL [Tesla K20c] (rev a1)
Subsystem: NVIDIA Corporation Device 0982
Flags: bus master, fast devsel, latency 0, IRQ 11
Memory at f9000000 (32-bit, non-prefetchable)
Memory at d0000000 (64-bit, prefetchable)
Memory at ce000000 (64-bit, prefetchable)
Capabilities: <access denied>
Kernel driver in use: nvidia
Kernel modules: nvidia_346, nouveau, nvidiafb
...
83:00.0 3D controller: NVIDIA Corporation GK110GL [Tesla K20c] (rev a1)
Subsystem: NVIDIA Corporation Device 0982
Flags: bus master, fast devsel, latency 0, IRQ 11
Memory at cc000000 (32-bit, non-prefetchable)
Memory at b0000000 (64-bit, prefetchable)
Memory at ae000000 (64-bit, prefetchable)
Capabilities: <access denied>
Kernel driver in use: nvidia
Kernel modules: nvidia_346, nouveau, nvidiafb
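Does anyone have information that could help me troubleshoot this, or at least better understand where the problem lies? As always, thanks for reading and helping. -- Omar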
When the GPUs are interconnected via a socket-level link (QPI on Intel-based systems):
GPU0     X      SOC     0-5,12-17
GPU1    SOC      X      6-11,18-23
        ^^^
then P2P transactions between those two GPUs are not possible.
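There are a number of requirements for GPUs to participate in P2P. One of them is that they must generally be on the same PCIE root complex. GPUs connected through a socket-level link (e.g. QPI) sit on two different "sockets", i.e. two different CPUs, and therefore belong to two different PCIE root complexes.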
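As a rough illustration of reading that matrix, here is a minimal Python sketch that lists the GPU pairs whose path is marked SOC. It assumes the text passed in starts at the header row of the `nvidia-smi topo -m` output and uses the legend shown in the question. The function name `socket_crossing_pairs` and the `SAMPLE` string are made up for this example and are not part of any NVIDIA tool.

```python
def socket_crossing_pairs(topo_text):
    """Return GPU pairs whose path is marked SOC in `nvidia-smi topo -m` text.

    Assumption: the first non-empty line is the header row (GPU0 GPU1 ...),
    and the legend from the question applies (SOC = socket-level link).
    """
    rows = [ln.split() for ln in topo_text.strip().splitlines() if ln.strip()]
    header = [tok for tok in rows[0] if tok.startswith("GPU")]
    pairs = []
    for row in rows[1:]:
        if not row[0].startswith("GPU"):
            continue  # skip the legend and any other non-matrix lines
        name, cells = row[0], row[1:1 + len(header)]
        for peer, link in zip(header, cells):
            # The matrix is symmetric, so report each unordered pair once.
            if link == "SOC" and name < peer:
                pairs.append((name, peer))
    return pairs


# Hypothetical sample: the matrix pasted in the question.
SAMPLE = """\
        GPU0    GPU1    CPU Affinity
GPU0     X      SOC     0-5,12-17
GPU1    SOC      X      6-11,18-23
"""

if __name__ == "__main__":
    for a, b in socket_crossing_pairs(SAMPLE):
        print(f"{a} <-> {b}: path crosses a socket-level link")
```

A pair reported here is only a hint that the GPUs sit on different root complexes. It does not replace asking the CUDA runtime directly.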
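Note that, in general, P2P support may vary by GPU or GPU family. The ability to run P2P on one GPU type or GPU family does not necessarily indicate that it will work on another GPU type or family, even in the same system/setup. The final determinant of GPU P2P support is the set of tools provided that query the runtime via cudaDeviceCanAccessPeer. P2P support may also vary by system and other factors. No statements made here are a guarantee of P2P support for any particular GPU in any particular setup.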
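For reference, a query along those lines might look like the following sketch, which calls the CUDA runtime from Python through ctypes. `cudaGetDeviceCount` and `cudaDeviceCanAccessPeer` are the real runtime functions. The helpers `load_cudart` and `peer_access_matrix` are hypothetical names, and the sketch assumes the runtime shared library is installed where `ctypes.util.find_library("cudart")` can find it. The simpleP2P sample makes the equivalent calls in CUDA C.

```python
import ctypes
import ctypes.util


def load_cudart():
    """Try to load the CUDA runtime library; return None if unavailable.

    Assumption: the library is discoverable under the name "cudart".
    """
    name = ctypes.util.find_library("cudart")
    if name is None:
        return None
    try:
        return ctypes.CDLL(name)
    except OSError:
        return None


def peer_access_matrix(cudart):
    """Map (device, peer) -> bool as reported by cudaDeviceCanAccessPeer."""
    count = ctypes.c_int(0)
    # cudaGetDeviceCount(int *count) returns 0 (cudaSuccess) on success.
    if cudart.cudaGetDeviceCount(ctypes.pointer(count)) != 0:
        raise RuntimeError("cudaGetDeviceCount failed")
    result = {}
    for dev in range(count.value):
        for peer in range(count.value):
            if dev == peer:
                continue
            flag = ctypes.c_int(0)
            # cudaDeviceCanAccessPeer(int *canAccessPeer, int device, int peerDevice)
            if cudart.cudaDeviceCanAccessPeer(ctypes.pointer(flag), dev, peer) != 0:
                raise RuntimeError("cudaDeviceCanAccessPeer failed")
            result[(dev, peer)] = bool(flag.value)
    return result


if __name__ == "__main__":
    lib = load_cudart()
    if lib is None:
        print("CUDA runtime library not found")
    else:
        try:
            for (dev, peer), ok in sorted(peer_access_matrix(lib).items()):
                print(f"GPU{dev} -> GPU{peer}: {'Yes' if ok else 'No'}")
        except RuntimeError as exc:
            print(f"query failed: {exc}")
```

On the system in the question, both directions would presumably come back as "No", matching what simpleP2P printed.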