[1] NVIDIA official CUDA installation guide: http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html
[2] NVIDIA notes on installing the driver under XFree86: http://us.download.nvidia.com/XFree86/Linux-x86/319.12/README/installdriver.html
[3] Ubuntu official guide to building your own kernel: https://wiki.ubuntu.com/Kernel/BuildYourOwnKernel
[4] Secure Boot: https://askubuntu.com/questions/755238/why-disabling-secure-boot-is-enforced-policy-when-installing-3rd-party-modules
ERROR: The kernel module failed to load, because it was not signed by a key
that is trusted by the kernel. Please try installing the driver again,
and sign the kernel module when prompted to do so.
ERROR: Unable to load the kernel module 'nvidia.ko'. This happens most
frequently when this kernel module was built against the wrong or
improperly configured kernel sources, with a version of gcc that
differs from the one used to build the target kernel(1), or if a driver
such as rivafb, nvidiafb, or nouveau is present and prevents the
NVIDIA kernel module from obtaining ownership of the NVIDIA
graphics device(s), or no NVIDIA GPU installed in this system is
supported by this NVIDIA Linux graphics driver release.
Kernel module compilation complete.
The target kernel has CONFIG_MODULE_SIG set, which means that it supports
cryptographic signatures on kernel modules. On some systems, the kernel may refuse
to load modules without a valid signature from a trusted key. This system also has
UEFI Secure Boot enabled; many distributions enforce module signature verification
on UEFI systems when Secure Boot is enabled(2). Would you like to sign the NVIDIA kernel
module? (Answer: Install without signing)
Kernel module load error: Required key not available
The key parts of the errors above are marked (1) and (2).
Check the Ubuntu kernel version and the gcc version it was built with:
$ cat /proc/version
Linux version 4.4.0-116-generic (buildd@lgw01-amd64-021) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.9) ) #140-Ubuntu SMP Mon Feb 12 21:23:04 UTC 2018
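The output above is from Ubuntu 16.04. It shows that the kernel was built with gcc 5.4.0, which is also the gcc version a system will have if it is kept up to date. The requirements in NVIDIA's official CUDA installation guide [1] (abridged here to the relevant part) ask for gcc 5.3.1 instead. In my experience this mismatch still causes no problems.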
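The comparison above can also be done in one step. The following is only a minimal sketch: the helper name `kernel_gcc_version` is made up for illustration, and it assumes `/proc/version` contains the text `gcc version X.Y.Z` as in the output shown above (newer kernels format this string differently, in which case the helper prints nothing).

```shell
#!/bin/sh
# Hypothetical helper: extract the gcc version from a /proc/version-style string.
# Prints nothing if the string has no "gcc version X.Y.Z" part.
kernel_gcc_version() {
    printf '%s\n' "$1" | sed -n 's/.*gcc version \([0-9][0-9.]*\).*/\1/p'
}

# gcc version the running kernel was built with
built_with=$(kernel_gcc_version "$(cat /proc/version 2>/dev/null)")
# gcc currently on PATH (newer gcc releases may print only the major version)
installed=$(gcc -dumpversion 2>/dev/null)

echo "kernel built with gcc: ${built_with:-unknown}"
echo "gcc on PATH:           ${installed:-not found}"

if [ -n "$built_with" ] && [ "$built_with" != "$installed" ]; then
    echo "note: the two versions differ; see error (1) above"
fi
```

If the two versions match, error (1) is more likely caused by one of the other reasons listed in the message, such as a conflicting driver like nouveau.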
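Error (2) briefly describes the problem the NVIDIA driver runs into because the Ubuntu 16.04 kernel is built with CONFIG_MODULE_SIG enabled by default and Secure Boot is turned on; references [2][3] describe it in more detail. In short, on a UEFI machine with Secure Boot enabled, Ubuntu 16.04 is more conservative about modules loaded into the kernel: a module must carry a signature before it can be loaded, and because the graphics driver is loaded into the kernel, it needs one too. During installation the NVIDIA installer asks whether to generate a signature. If signing succeeds, there is no problem; if it fails,
enter the BIOS and disable Secure Boot.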
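Before rebooting into the BIOS, it can help to confirm that both conditions from error (2) really apply. This is a rough sketch under a few assumptions: the kernel config is stored at `/boot/config-$(uname -r)` (the usual Ubuntu location), the `mokutil` tool is installed, and the helper name `config_has_module_sig` is invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given kernel config file sets CONFIG_MODULE_SIG=y.
config_has_module_sig() {
    grep -q '^CONFIG_MODULE_SIG=y$' "$1" 2>/dev/null
}

# Assumed location of the running kernel's config on Ubuntu
config_file="/boot/config-$(uname -r)"

if config_has_module_sig "$config_file"; then
    echo "CONFIG_MODULE_SIG is set in $config_file"
else
    echo "CONFIG_MODULE_SIG is not set, or $config_file is not readable"
fi

# Report the Secure Boot state if mokutil is available
if command -v mokutil >/dev/null 2>&1; then
    mokutil --sb-state || echo "could not query the Secure Boot state"
else
    echo "mokutil is not installed; check the Secure Boot setting in the BIOS"
fi
```

If `mokutil --sb-state` reports that Secure Boot is enabled and CONFIG_MODULE_SIG is set, an unsigned nvidia module will most likely be rejected with the "Required key not available" error shown above.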
-------------------------------------------------------
The above is what I have learned in practice; discussion and criticism are welcome.