"no kernel image is available for execution on the device", Fatal Python error: Aborted



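I want to run the yolov4 code from this repo: https://github.com/hunglc007/tensorflow-yolov4-tflite. I installed Python 3.7 and all the requirements, as well as CUDA and cuDNN. According to the log, cuDNN and CUDA are installed correctly, but there is an error: "no kernel image is available for execution on the device". What does this error mean? Is it related to a wrong CUDA or cuDNN version?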

Python: 3.7.9, CUDA: 10.1, TensorFlow: 2.3.0rc0, TensorFlow GPU: not installed, cuDNN: 7.5.0, OS: Windows 10 (x64)

py -3.7 save_model.py --weights ./data/yolov4.weights --output ./checkpoints/yolov4-416-tflite --input_size 416 --model yolov4 --framework tflite
2020-09-03 11:02:05.897607: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-09-03 11:02:09.504648: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library nvcuda.dll
2020-09-03 11:02:09.997508: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce 940MX computeCapability: 5.0
coreClock: 1.2415GHz coreCount: 3 deviceMemorySize: 2.00GiB deviceMemoryBandwidth: 13.41GiB/s
2020-09-03 11:02:10.017273: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-09-03 11:02:10.036505: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2020-09-03 11:02:10.059534: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
2020-09-03 11:02:10.074749: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
2020-09-03 11:02:10.094710: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusolver64_10.dll
2020-09-03 11:02:10.115167: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_10.dll
2020-09-03 11:02:10.140633: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2020-09-03 11:02:10.148636: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2020-09-03 11:02:10.155846: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations:  AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-09-03 11:02:10.188413: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x295adc030a0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-03 11:02:10.199421: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-09-03 11:02:10.207675: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce 940MX computeCapability: 5.0
coreClock: 1.2415GHz coreCount: 3 deviceMemorySize: 2.00GiB deviceMemoryBandwidth: 13.41GiB/s
2020-09-03 11:02:10.222939: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-09-03 11:02:10.231890: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2020-09-03 11:02:10.241896: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
2020-09-03 11:02:10.250393: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
2020-09-03 11:02:10.260177: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusolver64_10.dll
2020-09-03 11:02:10.268644: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_10.dll
2020-09-03 11:02:10.278132: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2020-09-03 11:02:10.286635: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2020-09-03 11:02:10.380510: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-03 11:02:10.388703: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263]      0
2020-09-03 11:02:10.394562: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0:   N
2020-09-03 11:02:10.402323: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1464 MB memory) -> physical GPU (device: 0, name: GeForce 940MX, pci bus id: 0000:01:00.0, compute capability: 5.0)
2020-09-03 11:02:10.429701: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x295ae120140 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-09-03 11:02:10.441631: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): GeForce 940MX, Compute Capability 5.0
2020-09-03 11:02:10.619742: F .tensorflow/core/kernels/random_op_gpu.h:232] Non-OK-status: GpuLaunchKernel(FillPhiloxRandomKernelLaunch<Distribution>, num_blocks, block_size, 0, d.stream(), gen, data, size, dist) status: Internal: no kernel image is available for execution on the device
Fatal Python error: Aborted

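The error means that the prebuilt TensorFlow binary you are using does not include kernels for the SM version (compute capability) of your actual hardware.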

You can refer to the following link for the supported combinations:

https://www.tensorflow.org/install/source_windows#gpu

Based on this, both 2.1.0 and 2.3.0 require cuDNN 7.4 and CUDA 10.1. You should try one of these supported combinations.

[Specific to the 2.3.0 release/rc2/rc0] From https://github.com/tensorflow/tensorflow/releases/tag/v2.3.0: TF 2.3 includes PTX kernels only for compute capability 7.0 to reduce the TF pip binary size. Earlier releases included PTX for a variety of older compute capabilities.
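To make the reasoning concrete, here is a minimal sketch of the compatibility check in plain Python. It relies on the general CUDA rule that precompiled (SASS) code runs only on devices with the same major compute capability and an equal or higher minor version, while PTX can be JIT-compiled for any device of equal or higher capability. The function names and the `ASSUMED_SASS_TARGETS` list are hypothetical and only for illustration; I have not read them from the TF 2.3 wheel, so substitute the targets your binary was really built for. Only the PTX entry (7.0) comes from the release note quoted above.

```python
import re

# ASSUMPTION: illustrative build targets, not read from any TensorFlow binary.
# Replace them with the targets your wheel was actually built for.
ASSUMED_SASS_TARGETS = [(3, 5), (3, 7), (5, 2), (6, 0), (6, 1), (7, 0)]
ASSUMED_PTX_TARGETS = [(7, 0)]  # per the release note: PTX only for 7.0


def parse_compute_capability(log_text):
    """Return (major, minor) from a 'computeCapability: X.Y' log entry, or None."""
    match = re.search(r"computeCapability:\s*(\d+)\.(\d+)", log_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def has_kernel_image(device, sass_targets, ptx_targets):
    """Check whether a device can run any of the embedded kernel images."""
    # Precompiled (SASS) code needs the same major version and a
    # target minor version that is not higher than the device's.
    sass_ok = any(t[0] == device[0] and t[1] <= device[1] for t in sass_targets)
    # PTX can be JIT-compiled for any device of equal or higher capability.
    ptx_ok = any(t <= device for t in ptx_targets)
    return sass_ok or ptx_ok


# Device line copied from the log in the question.
line = "pciBusID: 0000:01:00.0 name: GeForce 940MX computeCapability: 5.0"
device = parse_compute_capability(line)
print(device, has_kernel_image(device, ASSUMED_SASS_TARGETS, ASSUMED_PTX_TARGETS))
# prints: (5, 0) False
```

Under these assumed targets, the 940MX (5.0) matches neither a precompiled image nor the 7.0 PTX, which is consistent with the error in the log. If that is your situation, the options are a TensorFlow release that still ships kernels for your GPU, building from source for compute capability 5.0, or running on the CPU.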
