My KVM GPU Passthrough Experiments

· Why I gave up on VMware ESXi

Because it requires two computers, with one of them used to manage passthrough on the other, which is too much trouble.


Prerequisites:

Install KVM

# apt-get install qemu-kvm qemu virt-manager virt-viewer libvirt-bin python-libvirt bridge-utils
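
Before installing, it can help to confirm that the CPU exposes hardware virtualization at all. The following is a minimal sketch, not something from the original notes: has_hw_virt is a hypothetical helper name, and it assumes the usual Linux convention that the flags line of /proc/cpuinfo lists vmx (Intel) or svm (AMD).

```shell
#!/bin/sh
# has_hw_virt (hypothetical helper): succeed if the given cpuinfo file
# lists the vmx (Intel VT-x) or svm (AMD-V) CPU flag.
# The file defaults to /proc/cpuinfo; a different path can be passed
# to try the check on a saved copy.
has_hw_virt() {
    cpuinfo="${1:-/proc/cpuinfo}"
    # Look only at the "flags" lines and match vmx/svm as whole words.
    grep -E '^flags' "$cpuinfo" | grep -Eqw 'vmx|svm'
}
```

On a live system this could be used as: has_hw_virt && echo "hardware virtualization flag present". Note that the flag only covers VT-x/AMD-V; VT-d still has to be enabled separately in the BIOS.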


Identifiers of the GPU to pass through:

pci_0000_04_00_0

pci_0000_04_00_1

0000:04:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM204GL [Quadro M4000] [10de:13f1] (rev a1)

0000:04:00.1 Audio device [0403]: NVIDIA Corporation GM204 High Definition Audio Controller [10de:0fbb] (rev a1)
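
The two notations above refer to the same devices: lspci prints addresses as domain:bus:slot.function, while virsh nodedev-list uses the same digits joined by underscores with a pci_ prefix. A small sketch of that mapping follows; pci_to_nodedev is a hypothetical helper name and not part of libvirt.

```shell
#!/bin/sh
# pci_to_nodedev (hypothetical helper): turn a PCI address such as
# 0000:04:00.0 into the node device name used by virsh,
# for example pci_0000_04_00_0.
pci_to_nodedev() {
    # Replace every ':' and '.' with '_' and add the "pci_" prefix.
    printf 'pci_%s\n' "$(printf '%s' "$1" | tr ':.' '__')"
}
```

For example, pci_to_nodedev 0000:04:00.1 prints pci_0000_04_00_1, which is the name to pass to the virsh nodedev commands.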


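Approach 1: PCI pass-through with libvirt

Source: http://www.it165.net/admin/html/201506/5722.html

1. Enable Intel VT-d in the BIOS (succeeded)

2. Enable PCI pass-through in the Linux kernel (succeeded: sudo gedit /etc/default/grub)

3. Reboot so that the configuration takes effect (succeeded)

4. Use lspci -nn to find the PCI device to assign (succeeded)

5. Use virsh nodedev-list to find the device's PCI identifier (succeeded)

6. Use virsh nodedev-detach to detach the device from the host (failed: either the whole system or the terminal hung, and repeated attempts made no difference)

7. Use virt-manager to assign the device directly to a running virtual machine (I tried skipping step 6 and going straight to this step; failed, KVM hung)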
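
After the reboot in step 3 it is worth verifying that the kernel parameter actually took effect. The notes do not say which parameter was added in step 2, so the sketch below assumes it was intel_iommu=on, the one used in the later approaches. has_intel_iommu is a hypothetical helper name.

```shell
#!/bin/sh
# has_intel_iommu (hypothetical helper): succeed if the given kernel
# command line contains the intel_iommu=on parameter as a separate word.
# Assumption: intel_iommu=on is the parameter that was added to GRUB.
has_intel_iommu() {
    case " $1 " in
        *" intel_iommu=on "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

On a running host the check could be called as: has_intel_iommu "$(cat /proc/cmdline)" || echo "intel_iommu=on is missing from the kernel command line".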


NodeDevice (help keyword 'nodedev'):

nodedev-create create a device defined by an XML file on the node

nodedev-destroy destroy (stop) a device on the node

nodedev-detach detach node device from its device driver

nodedev-dumpxml node device details in XML

nodedev-list enumerate devices on this host

nodedev-reattach reattach node device to its device driver

nodedev-reset reset node device


Approach 2: PCI pass-through with QEMU

Source: http://blog.csdn.net/halcyonbaby/article/details/37776211

1. Same as steps 1-5 of Approach 1 (succeeded)

2.

modprobe pci_stub

(succeeded)

3.
echo "10de 13f1" > /sys/bus/pci/drivers/pci-stub/new_id
(succeeded)
4.
echo 0000:04:00.0 > /sys/bus/pci/devices/0000:04:00.0/driver/unbind
(failed: the system hung, and repeated attempts gave the same result)
5.
echo 0000:04:00.0 > /sys/bus/pci/drivers/pci-stub/bind
(failed: bash: echo: write error: No such device)
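
Steps 3 to 5 can be collected into one function with a few guard checks. This is only a sketch of the same sequence, not a fix for the hang: stub_bind is a hypothetical helper name, and the optional third argument (a sysfs root, defaulting to /sys) is an assumption added so the logic can be tried against a scratch directory. One plausible reading of the "No such device" error in step 5 is that the device was still bound to its original driver because the unbind in step 4 never completed; the kernel refuses to bind a device that already has a driver.

```shell
#!/bin/sh
# stub_bind (hypothetical helper): hand one PCI device over to pci-stub.
#   $1  PCI address, e.g. 0000:04:00.0
#   $2  "vendor device" ID pair, e.g. "10de 13f1"
#   $3  sysfs root (assumption for dry runs; defaults to /sys)
# Must run as root on a real system, after "modprobe pci_stub".
stub_bind() {
    addr="$1"
    ids="$2"
    sysfs="${3:-/sys}"
    dev="$sysfs/bus/pci/devices/$addr"
    stub="$sysfs/bus/pci/drivers/pci-stub"

    [ -d "$dev" ]  || { echo "no such device: $addr" >&2; return 1; }
    [ -d "$stub" ] || { echo "pci-stub driver is not loaded" >&2; return 1; }

    # Step 3: teach pci-stub about this vendor/device ID.
    echo "$ids" > "$stub/new_id" || return 1

    # Step 4: release the device from its current driver, if it has one.
    if [ -e "$dev/driver" ]; then
        echo "$addr" > "$dev/driver/unbind" || return 1
    fi

    # Step 5: bind the now driverless device to pci-stub.
    echo "$addr" > "$stub/bind"
}
```

A call matching the steps above would be: stub_bind 0000:04:00.0 "10de 13f1".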

Approach 3: PCI pass-through with VFIO

Source: http://blog.csdn.net/halcyonbaby/article/details/37776211

1. Install the kernel module (already installed by default on my system)

2.

sudo modprobe vfio
(succeeded)

3.

sudo modprobe vfio-pci
(succeeded)

4.

cd /sys/bus/pci/devices/0000:04:00.0/
(succeeded)

5.

readlink iommu_group
(succeeded: ../../../../kernel/iommu_groups/11)

6.

ll iommu_group/devices
(succeeded:

total 0

drwxr-xr-x 2 root root 0 May 16 10:22 ./

drwxr-xr-x 3 root root 0 May 16 10:06 ../

lrwxrwxrwx 1 root root 0 May 16 10:22 0000:04:00.0 -> ../../../../devices/pci0000:00/0000:00:1c.4/0000:04:00.0/

lrwxrwxrwx 1 root root 0 May 16 10:22 0000:04:00.1 -> ../../../../devices/pci0000:00/0000:00:1c.4/0000:04:00.1/
)

7.

echo 0000:04:00.0 > /sys/bus/pci/devices/0000:04:00.0/driver/unbind
(same as step 4 of Approach 2, but surprisingly it succeeded this time)

8.

echo 10de 13f1 > /sys/bus/pci/drivers/vfio-pci/new_id
(failed, terminal hung)

9. Changed step 8 to echo "10de 13f1" > /sys/bus/pci/drivers/vfio-pci/new_id (failed, terminal hung)

10. Tried changing the file's contents with nano (failed, terminal hung)

11. Tried chmod 744/777 first and then writing to the file with echo (failed, terminal hung)

12. Rebooted and tried once more (terminal still hung)

13. Tried sh -c "echo 10de 13f1 > /sys/bus/pci/drivers/vfio-pci/new_id" (failed)

14. Since step 7 had succeeded, tried switching to Approach 2 from there; it still failed

After step 7, the files under /sys/bus/pci/devices/0000:04:00.0/ changed noticeably, which may affect the steps that follow.
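
Steps 4 to 6 inspect the IOMMU group by hand. The sketch below does the same lookup in one function; iommu_group_members is a hypothetical helper name, and the optional sysfs root argument is an assumption added so it can be pointed at a test directory. The listing matters because VFIO works per group: every device in the group (here both 0000:04:00.0 and the audio function 0000:04:00.1) has to be released from its host driver, whereas the steps above only unbind 0000:04:00.0.

```shell
#!/bin/sh
# iommu_group_members (hypothetical helper): print the PCI address of
# every device that shares an IOMMU group with the given device.
#   $1  PCI address, e.g. 0000:04:00.0
#   $2  sysfs root (assumption for dry runs; defaults to /sys)
iommu_group_members() {
    group="${2:-/sys}/bus/pci/devices/$1/iommu_group/devices"
    [ -d "$group" ] || { echo "no IOMMU group found for $1" >&2; return 1; }
    for entry in "$group"/*; do
        # Skip the unexpanded pattern if the directory is empty.
        [ -e "$entry" ] || continue
        basename "$entry"
    done
}
```

Running iommu_group_members 0000:04:00.0 on the host described here should list the same two entries that the ll output in step 6 shows.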


Approach 4: GPU Passthrough via vfio-pci with KVM on Ubuntu 15.04

1. Install Ubuntu, I'm using 15.04 Server. You should also update your kernel to v4.0.0. (Mine: Ubuntu 14.04, kernel 4.4.0)

2.

sudo apt-get install qemu-kvm
(done)

3. In /etc/default/grub:

GRUB_CMDLINE_LINUX_DEFAULT="nomodeset intel_iommu=on pcie_acs_override=downstream pci=assign-busses igb.max_vfs=2"
(succeeded)

Notice: "pci=assign-busses igb.max_vfs=2" were for my i350 SR-IOV. You can ignore them if you do not have SR-IOV in mind.

Addition: after editing the file you should also run

sudo update-grub

4. In /etc/modprobe.d/local.conf (new file): (succeeded)

options vfio-pci ids=1002:ffffffff:ffffffff:ffffffff:00030000:ffff00ff,1002:ffffffff:ffffffff:ffffffff:00040300:ffffffff,10de:ffffffff:ffffffff:ffffffff:00030000:ffff00ff,10de:ffffffff:ffffffff:ffffffff:00040300:ffffffff

Addition: the command I used was sudo visudo gedit

5. Reboot (succeeded)

However, the rest of this guide was too vague to follow, so I set it aside for now.
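
The options line in step 4 matches every AMD and NVIDIA display and audio device by class. A narrower variant lists only the vendor:device pairs reported by lspci -nn for the card in question. The sketch below builds such a line; vfio_options_line is a hypothetical helper name, and it assumes vfio-pci is built as a module that accepts the ids parameter.

```shell
#!/bin/sh
# vfio_options_line (hypothetical helper): build a modprobe.d line that
# makes vfio-pci claim the given vendor:device IDs when it loads.
# Arguments: one or more IDs in lspci notation, e.g. 10de:13f1
vfio_options_line() {
    ids=""
    for id in "$@"; do
        # Join the IDs with commas, without a leading comma.
        ids="${ids:+$ids,}$id"
    done
    printf 'options vfio-pci ids=%s\n' "$ids"
}
```

For the Quadro M4000 and its audio function, vfio_options_line 10de:13f1 10de:0fbb prints "options vfio-pci ids=10de:13f1,10de:0fbb", which could be written to /etc/modprobe.d/local.conf in place of the class-based line.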


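Approach 5: Retrying the earlier approaches

I looked into the GPU driver and found that mine had not been installed properly.

So I quietly installed it properly.

Retried Approach 1: hung at step 6.

Retried Approach 2: the system hung at step 3, which had succeeded before. After a reboot I tried again and steps 3 and 4 succeeded. The last step hung.

Retried Approach 3: hung at step 7.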


Approach 6: GPU Passthrough, VGA Passthrough in KVM

Source: https://blog.lofyer.org/pass-host-gpu-to-guest-via-qemu-ncursescurses/

1. Enable the mainboard VT-x, iommu and alter the video device to Intel HD (done)

2. Modify the kernel parameter, modprobe.d and libvirt.conf

Add the following parameters to grub.conf:

intel_iommu=on pci-stub.ids=1002:6819,1002:aab0 vfio_iommu_type1.allow_unsafe_interrupts=1
(succeeded)

3. Add modprobe.conf to /etc/modprobe.d/ with this content: (succeeded)

blacklist radeon
options kvm ignore_msrs=1
options kvm allow_unsafe_interrupts=1
options kvm-amd npt=0
options kvm_intel emulate_invalid_guest_state=0
options vfio_iommu_type1 allow_unsafe_interrupts=1

4. Change the following options in /etc/libvirt/qemu.conf: (succeeded)

# The user ID for QEMU processes run by the system instance.
user = "root"

# The group ID for QEMU processes run by the system instance.
group = "root"

......

# If clear_emulator_capabilities is enabled, libvirt will drop all
# privileged capabilities of the QEmu/KVM emulator. This is enabled by
# default.
#
# Warning: Disabling this option means that a compromised guest can
# exploit the privileges and possibly do damage to the host.
#
clear_emulator_capabilities = 0

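5. File: vfio-bind: (succeeded)

	#!/bin/bash
	# Bind every device in the IOMMU group of each given PCI address to vfio-pci.
	modprobe vfio-pci
	for var in "$@"; do
	        for dev in $(ls "/sys/bus/pci/devices/$var/iommu_group/devices"); do
	                vendor=$(cat "/sys/bus/pci/devices/$dev/vendor")
	                device=$(cat "/sys/bus/pci/devices/$dev/device")
	                # Release the device from its current driver, if it has one.
	                if [ -e "/sys/bus/pci/devices/$dev/driver" ]; then
	                        echo "$dev" > "/sys/bus/pci/devices/$dev/driver/unbind"
	                fi
	                echo "$vendor $device" > /sys/bus/pci/drivers/vfio-pci/new_id
	        done
	done

6. Run ./vfio-bind 0000:04:00.0 0000:04:00.1 (failed, terminal hung)

(Note: I ran chmod +x on the file before executing it.)

To sum up, this approach is just Approach 3 in a different wrapper.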
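
Since every failure so far happens while unbinding or rebinding, it is useful to know which host driver currently owns each function before touching it. The sketch below reads that from sysfs; current_driver is a hypothetical helper name, and the optional sysfs root argument is an assumption added so it can be exercised without real hardware.

```shell
#!/bin/sh
# current_driver (hypothetical helper): print the name of the kernel
# driver bound to a PCI device, or "none" if no driver is bound.
#   $1  PCI address, e.g. 0000:04:00.0
#   $2  sysfs root (assumption for dry runs; defaults to /sys)
current_driver() {
    link="${2:-/sys}/bus/pci/devices/$1/driver"
    if [ -L "$link" ]; then
        # The "driver" entry is a symlink into /sys/bus/pci/drivers/<name>.
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}
```

Calling current_driver 0000:04:00.0 and current_driver 0000:04:00.1 before and after running vfio-bind shows whether the unbind and the vfio-pci bind actually happened.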


Approach 7: Editing the virtual machine's XML file

Source: http://www.server110.com/kvm/201402/5547.html

sudo virt-install --name=one --ram 2048 --vcpus=4 --disk path=/root/one.img,size=40 --accelerate --cdrom /home/ubuntu/Downloads/111/cn_windows_7_ultimate_with_sp1_x64_dvd_u_677408.iso

(succeeded)

virsh edit One

Add the following to the XML file:




(succeeded)


Failed. Try again? [y,n,f,?]:

error: (domain_definition):69: Opening and ending tag mismatch: hostdev line 68 and devices

(resolved)


Failed. Try again? [y,n,f,?]:

error: XML error: Missing <source> element in hostdev device (resolved)


Domain One XML configuration edited. (succeeded)


virsh undefine One

virsh define One.xml

error: Failed to open file 'One.xml': No such file or directory
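
The XML added in this approach did not survive in these notes, but both error messages point at a hostdev element inside devices. As a hedged illustration only, the sketch below prints a PCI hostdev element in the form libvirt documents for managed PCI assignment; it is an assumption, not the snippet that was originally used, and hostdev_xml is a hypothetical helper name. As for the final error, virsh define reads a file from disk, so the definition would have to be saved first, for example with virsh dumpxml One > One.xml before running virsh undefine.

```shell
#!/bin/sh
# hostdev_xml (hypothetical helper): print a libvirt <hostdev> element
# for a PCI address written as domain:bus:slot.function (0000:04:00.0).
# Assumption: managed='yes', so libvirt detaches and reattaches the
# device itself when the guest starts and stops.
hostdev_xml() {
    addr="$1"
    domain=${addr%%:*}   # 0000
    rest=${addr#*:}      # 04:00.0
    bus=${rest%%:*}      # 04
    rest=${rest#*:}      # 00.0
    slot=${rest%%.*}     # 00
    func=${rest#*.}      # 0
    cat <<EOF
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x$domain' bus='0x$bus' slot='0x$slot' function='0x$func'/>
  </source>
</hostdev>
EOF
}
```

The output of hostdev_xml 0000:04:00.0 would go inside the devices element of the domain XML, with a second element for 0000:04:00.1 if the audio function is passed through as well.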


Approach 8: Compile the kernel first

Source: http://www.linux-kvm.org/page/How_to_assign_devices_with_VT-d_in_KVM

...Still working on this; to be updated.


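Approach 9: Asking on Baidu Zhidao

Source: https://zhidao.baidu.com/question/1823842477872037748.html

I spent some points on Baidu Zhidao to ask about my problem, and the answer I got was:

This symptom generally has three causes: 1. The hardware is overheating. 2. Some low-level system files are damaged. 3. The memory is loose. Solutions: 1. Clean out the dust and check the cooling fan. 2. Reinstall the system. 3. Reseat the memory and wipe the gold contacts.

However, this is a new computer, so excessive dust and loose memory are not an issue. (Even so, I removed, cleaned, and reseated both the GPU and the memory modules.)

I also tried reinstalling the system.

In the end, none of these three methods solved my problem.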


Approach 10: Checking the GPU driver

This approach succeeded.

For the detailed steps, see "KVM virtual machine GPU passthrough on Ubuntu 14.04".



/First edited: 2017/5/1

/Second edit: 2017/5/31

/Third edit: 2017/6/14
