This page describes how to count and trace performance events in the KVM kernel module.
Two dedicated tools, kvm_stat and kvm_trace, were previously used for these tasks. The same work can now be done with the standard Linux tracing tools, perf and ftrace.
Often you want event counts after running a benchmark:
$ sudo mount -t debugfs none /sys/kernel/debug
$ sudo ./perf stat -e 'kvm:*' -a sleep 1h
^C
 Performance counter stats for 'sleep 1h':

           8330  kvm:kvm_entry              #      0.000 M/sec
              0  kvm:kvm_hypercall          #      0.000 M/sec
           4060  kvm:kvm_pio                #      0.000 M/sec
              0  kvm:kvm_cpuid              #      0.000 M/sec
           2681  kvm:kvm_apic               #      0.000 M/sec
           8343  kvm:kvm_exit               #      0.000 M/sec
            737  kvm:kvm_inj_virq           #      0.000 M/sec
              0  kvm:kvm_page_fault         #      0.000 M/sec
              0  kvm:kvm_msr                #      0.000 M/sec
            664  kvm:kvm_cr                 #      0.000 M/sec
            872  kvm:kvm_pic_set_irq        #      0.000 M/sec
              0  kvm:kvm_apic_ipi           #      0.000 M/sec
            738  kvm:kvm_apic_accept_irq    #      0.000 M/sec
            874  kvm:kvm_set_irq            #      0.000 M/sec
            874  kvm:kvm_ioapic_set_irq     #      0.000 M/sec
              0  kvm:kvm_msi_set_irq        #      0.000 M/sec
            433  kvm:kvm_ack_irq            #      0.000 M/sec
           2685  kvm:kvm_mmio               #      0.000 M/sec

    3.493562100  seconds time elapsed
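When many counters are zero, it can help to rank the non-zero ones. The following is a minimal sketch, assuming the perf stat output was saved to a file (perf stat prints its counter summary to standard error, so a redirect such as 2> kvm-stat.txt would capture it). The name top_kvm_events and the file name are hypothetical, not part of perf.

```shell
# Hypothetical helper: list non-zero kvm:* counters, highest count first.
# Reads saved `perf stat` output from the named files, or from stdin.
top_kvm_events() {
    awk '$2 ~ /^kvm:/ {
        gsub(/,/, "", $1)            # tolerate counts printed as 8,330
        if ($1 + 0 > 0) print $1, $2
    }' "$@" | sort -k1,1nr
}

# Example use (file name is an assumption):
#   top_kvm_events kvm-stat.txt
```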
The perf tool is part of the Linux kernel tree in tools/perf.
Detailed traces can be generated using ftrace:
# mount -t debugfs none /sys/kernel/debug
# echo 1 >/sys/kernel/debug/tracing/events/kvm/enable
# cat /sys/kernel/debug/tracing/trace_pipe
[...]
             kvm-5664  [000] 11906.220178: kvm_entry: vcpu 0
             kvm-5664  [000] 11906.220181: kvm_exit: reason apic_access rip 0xc011518c
             kvm-5664  [000] 11906.220183: kvm_mmio: mmio write len 4 gpa 0xfee000b0 val 0x0
             kvm-5664  [000] 11906.220183: kvm_apic: apic_write APIC_EOI = 0x0
             kvm-5664  [000] 11906.220184: kvm_ack_irq: irqchip IOAPIC pin 11
             kvm-5664  [000] 11906.220185: kvm_entry: vcpu 0
             kvm-5664  [000] 11906.220188: kvm_exit: reason io_instruction rip 0xc01e4473
             kvm-5664  [000] 11906.220188: kvm_pio: pio_read at 0xc13e size 2 count 1
             kvm-5664  [000] 11906.220193: kvm_entry: vcpu 0
^D
# echo 0 >/sys/kernel/debug/tracing/events/kvm/enable
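A trace like this is often easier to read once the exits are tallied by reason. The sketch below assumes the trace was saved to a file and that kvm_exit lines have the "kvm_exit: reason <name> ..." layout shown above; other kernel versions may print the fields differently. count_exit_reasons is a hypothetical helper name.

```shell
# Hypothetical helper: count kvm_exit events per exit reason,
# most frequent first. Reads the named trace files, or stdin.
count_exit_reasons() {
    sed -n 's/.*kvm_exit: reason \([^ ]*\).*/\1/p' "$@" |
        sort | uniq -c | sort -rn |
        awk '{ print $1, $2 }'      # normalize the padding added by uniq -c
}

# Example use (file name is an assumption):
#   cat /sys/kernel/debug/tracing/trace_pipe > /tmp/kvm.trace   # stop with ^C
#   count_exit_reasons /tmp/kvm.trace
```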
With perf kvm, events can be recorded to a file for later reporting and analysis. Use the --host option to record events for the host, the --guest option to record events for a guest, or both options together to record events for both.
With only the --host option, the default output file is perf.data.host. With only the --guest option, it is perf.data.guest. With both options, it is perf.data.kvm. Use the -o option after the record keyword to save the output under a different file name.
# perf kvm --host --guest [kvm options] record -a -o my.perf.data
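The default file names described above can be restated as a small function, which may be handy in scripts that need to locate the data file afterwards. This is only a sketch of the rule as stated on this page; perf_kvm_default_output is a hypothetical name and is not part of perf.

```shell
# Hypothetical helper: print the default data file name that
# `perf kvm ... record` is described as using for the given options.
perf_kvm_default_output() {
    host=0
    guest=0
    for arg in "$@"; do
        case "$arg" in
            --host)  host=1 ;;
            --guest) guest=1 ;;
        esac
    done
    if [ "$host" = 1 ] && [ "$guest" = 1 ]; then
        echo perf.data.kvm
    elif [ "$host" = 1 ]; then
        echo perf.data.host
    elif [ "$guest" = 1 ]; then
        echo perf.data.guest
    else
        return 1    # neither option given: not covered on this page
    fi
}

# Example use:
#   perf_kvm_default_output --host --guest
```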
To record events for a guest, perf needs the guest's /proc/kallsyms and /proc/modules. These are passed to perf with the --guestkallsyms and --guestmodules options. The files must be on the host; an easy way to copy them is with ssh.
# ssh guest "cat /proc/kallsyms" > /tmp/guest.kallsyms
# ssh guest "cat /proc/modules" > /tmp/guest.modules
It is better to cat the files over ssh and redirect the output than to use scp. Experience has shown that scp guest:/proc/kallsyms /tmp/guest.kallsyms produces an empty file, probably because files under /proc report a size of zero.
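Because an empty symbol file is easy to miss, a script can check the result of each copy. A minimal sketch, assuming the guest is reachable over ssh under a host name such as guest; fetch_guest_proc is a hypothetical helper name.

```shell
# Hypothetical helper: copy one file from the guest's /proc to the host
# and fail if the copy came back empty.
#   $1 = guest host name, $2 = file name under /proc, $3 = destination
fetch_guest_proc() {
    ssh "$1" "cat /proc/$2" > "$3" && [ -s "$3" ]
}

# Example use:
#   fetch_guest_proc guest kallsyms /tmp/guest.kallsyms || echo "kallsyms copy failed"
#   fetch_guest_proc guest modules  /tmp/guest.modules  || echo "modules copy failed"
```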
Using these options, you can record events for both the host and a guest with the following command:
# perf kvm --host --guest --guestkallsyms=/tmp/guest.kallsyms --guestmodules=/tmp/guest.modules record -a
The order of the arguments is important. In general, the syntax for using perf to profile KVM is:
perf kvm <perf kvm args> <perf command> <perf command args>
In this case the perf kvm arguments are --host --guest --guestkallsyms=/tmp/guest.kallsyms --guestmodules=/tmp/guest.modules, the perf command is record, and the perf command argument is -a (system-wide collection on all CPUs).
perf records events until it is terminated with SIGINT. It must be SIGINT: if perf is terminated with any other signal, such as SIGTERM or SIGQUIT, the perf kvm report command (see below) will not process the generated file correctly and will list only the headers with no data.
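For unattended runs, one way to make sure perf is stopped with SIGINT rather than another signal is to let timeout deliver it after a fixed duration. This is a minimal sketch, assuming GNU coreutils timeout is installed; record_for is a hypothetical wrapper name.

```shell
# Hypothetical helper: run a command and stop it with SIGINT after
# the given number of seconds.
#   $1 = duration in seconds, remaining arguments = command to run
record_for() {
    secs=$1
    shift
    timeout -s INT "$secs" "$@"
}

# Example use (records for 60 seconds):
#   record_for 60 perf kvm --host --guest \
#       --guestkallsyms=/tmp/guest.kallsyms \
#       --guestmodules=/tmp/guest.modules record -a
```

An exit status of 124 comes from timeout itself and means the time limit was reached.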
perf kvm can also get the guest's kallsyms and modules by itself instead of you having to provide them. It uses sshfs to mount the root file system of the guest so that it can read the files directly from the guest.
sshfs depends on fuse, the Filesystem in Userspace. If your Linux distribution does not have fuse, see the fuse project page on SourceForge. If your Linux distribution does not have sshfs, see the sshfs project page on SourceForge.
Here's how it works. You create a directory, e.g., /tmp/guestmount. You then create a subdirectory named after the PID of the guest's qemu process. Then you use sshfs to mount the guest's root file system on that subdirectory. For example:
# mkdir -p /tmp/guestmount
# ps -eo pid,cmd | grep qemu | grep -v grep
24764 /usr/libexec/qemu-kvm -M pc -m 4096 -smp 4 -name guest01 -boot c -drive file=/var/lib/libvirt/images/guest01.img ...
# mkdir /tmp/guestmount/24764
# sshfs -o allow_other,direct_io guest:/ /tmp/guestmount/24764
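When several guests are running, picking the right PID by eye is error-prone. The sketch below extracts it from the ps output, assuming qemu was started with the "-name guest01" form shown above (other launchers may format the -name argument differently). qemu_pid_for_guest is a hypothetical helper name.

```shell
# Hypothetical helper: print the PID of the qemu process whose -name
# argument equals $1. Expects the output of `ps -eo pid,cmd` on stdin.
qemu_pid_for_guest() {
    awk -v name="$1" '/qemu/ {
        for (i = 2; i < NF; i++)
            if ($i == "-name" && $(i + 1) == name) print $1
    }'
}

# Example use:
#   pid=$(ps -eo pid,cmd | qemu_pid_for_guest guest01)
#   mkdir -p "/tmp/guestmount/$pid"
#   sshfs -o allow_other,direct_io guest:/ "/tmp/guestmount/$pid"
```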
Then, instead of using the --guestkallsyms and --guestmodules options, you use the --guestmount option.
# perf kvm --host --guest --guestmount=/tmp/guestmount record -a
When you are finished recording, unmount the guest file system using fusermount.
# fusermount -u /tmp/guestmount/24764
Use the perf kvm report command to generate a report from a data file created by perf kvm record. To generate a report for the host, use the --host argument.
# perf kvm --host --guestmount=/tmp/guestmount report
To generate a report for the guest, use the --guest argument.
# perf kvm --guest --guestmount=/tmp/guestmount report
By default, the perf kvm report command reads perf.data.host for a host report and perf.data.guest for a guest report. Use the -i option to read a different input file.
# perf kvm --host --guestmount=/tmp/guestmount report -i /tmp/perf.data.kvm
The perf kvm report command writes its output to standard output. To save the report, redirect the output to a file. Do not use the -o option: contrary to the help text, it does not work.
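The redirect can be wrapped in a small function so that host and guest reports are saved the same way. A minimal sketch, assuming the /tmp/guestmount directory used above; save_report and the PERF variable (which only exists here so that a different perf binary can be substituted) are hypothetical.

```shell
# Hypothetical helper: write a perf kvm report to a file by redirecting
# standard output.
#   $1 = --host or --guest, $2 = input data file, $3 = report file
save_report() {
    ${PERF:-perf} kvm "$1" --guestmount=/tmp/guestmount report -i "$2" > "$3"
}

# Example use (file names are assumptions):
#   save_report --host  /tmp/perf.data.kvm /tmp/host-report.txt
#   save_report --guest /tmp/perf.data.kvm /tmp/guest-report.txt
```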