This page describes how to count and trace performance events in the KVM kernel module.
Two tools, kvm_stat and kvm_trace, were previously used for these tasks. Both tasks can now be done with standard Linux tracing tools.
Often you want event counts after running a benchmark:
$ sudo mount -t debugfs none /sys/kernel/debug
$ sudo ./perf stat -e 'kvm:*' -a sleep 1h
^C
 Performance counter stats for 'sleep 1h':

           8330  kvm:kvm_entry            #      0.000 M/sec
              0  kvm:kvm_hypercall        #      0.000 M/sec
           4060  kvm:kvm_pio              #      0.000 M/sec
              0  kvm:kvm_cpuid            #      0.000 M/sec
           2681  kvm:kvm_apic             #      0.000 M/sec
           8343  kvm:kvm_exit             #      0.000 M/sec
            737  kvm:kvm_inj_virq         #      0.000 M/sec
              0  kvm:kvm_page_fault       #      0.000 M/sec
              0  kvm:kvm_msr              #      0.000 M/sec
            664  kvm:kvm_cr               #      0.000 M/sec
            872  kvm:kvm_pic_set_irq      #      0.000 M/sec
              0  kvm:kvm_apic_ipi         #      0.000 M/sec
            738  kvm:kvm_apic_accept_irq  #      0.000 M/sec
            874  kvm:kvm_set_irq          #      0.000 M/sec
            874  kvm:kvm_ioapic_set_irq   #      0.000 M/sec
              0  kvm:kvm_msi_set_irq      #      0.000 M/sec
            433  kvm:kvm_ack_irq          #      0.000 M/sec
           2685  kvm:kvm_mmio             #      0.000 M/sec

    3.493562100  seconds time elapsed
The perf tool is part of the Linux kernel tree in tools/perf.
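Many of the kvm counters are often zero, as in the output above. The following is a minimal sketch for reducing saved perf stat output to the non-zero kvm events, largest count first. The function name kvm_nonzero_events and the file name kvm-stat.txt are assumptions made for illustration, and the parsing assumes the "count event" layout shown above.

```shell
# Hypothetical helper: print non-zero kvm:* counters from saved perf stat
# output, largest count first. Assumes lines of the form "<count> kvm:<event> ...".
kvm_nonzero_events() {
    awk '$2 ~ /^kvm:/ {
             gsub(/,/, "", $1)          # tolerate thousands separators
             if ($1 + 0 > 0) print $1, $2
         }' "$1" | sort -k1,1nr
}

# Usage (perf stat prints its counters on standard error):
#   sudo ./perf stat -e 'kvm:*' -a sleep 10 2> kvm-stat.txt
#   kvm_nonzero_events kvm-stat.txt
```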
Detailed traces can be generated using ftrace:
# mount -t debugfs none /sys/kernel/debug
# echo 1 >/sys/kernel/debug/tracing/events/kvm/enable
# cat /sys/kernel/debug/tracing/trace_pipe
[...]
 kvm-5664  [000] 11906.220178: kvm_entry: vcpu 0
 kvm-5664  [000] 11906.220181: kvm_exit: reason apic_access rip 0xc011518c
 kvm-5664  [000] 11906.220183: kvm_mmio: mmio write len 4 gpa 0xfee000b0 val 0x0
 kvm-5664  [000] 11906.220183: kvm_apic: apic_write APIC_EOI = 0x0
 kvm-5664  [000] 11906.220184: kvm_ack_irq: irqchip IOAPIC pin 11
 kvm-5664  [000] 11906.220185: kvm_entry: vcpu 0
 kvm-5664  [000] 11906.220188: kvm_exit: reason io_instruction rip 0xc01e4473
 kvm-5664  [000] 11906.220188: kvm_pio: pio_read at 0xc13e size 2 count 1
 kvm-5664  [000] 11906.220193: kvm_entry: vcpu 0
^D
# echo 0 >/sys/kernel/debug/tracing/events/kvm/enable
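A raw trace is easier to read once it is summarized. The sketch below tallies the exit reasons found in captured trace text. The function name count_exit_reasons and the file /tmp/kvm.trace are assumptions for illustration, and the parsing assumes that each exit line contains "kvm_exit: reason <name>" as in the trace above; other kernel versions may print the fields differently.

```shell
# Hypothetical helper: tally kvm_exit reasons from ftrace text on stdin.
# Assumes each exit line contains "kvm_exit: reason <name>".
count_exit_reasons() {
    awk '{
             # Look for the token pair "kvm_exit:" "reason" and count the next field.
             for (i = 1; i < NF - 1; i++)
                 if ($i == "kvm_exit:" && $(i + 1) == "reason")
                     n[$(i + 2)]++
         }
         END { for (r in n) print n[r], r }' | sort -k1,1nr -k2,2
}

# Usage: capture a few seconds of trace output, then summarize it.
#   timeout 5 cat /sys/kernel/debug/tracing/trace_pipe > /tmp/kvm.trace
#   count_exit_reasons < /tmp/kvm.trace
```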
Events can be recorded to a file for later reporting and analysis. You can record events for the host using the --host option. You can record events for a guest using the --guest option. You can use both options at the same time to record events for both the host and a guest.
If you use just the --host option, the default output file will be perf.data.host. If you use just the --guest option, the default output file will be perf.data.guest. If you use both options, the default output file will be perf.data.kvm. Use the -o option after the record keyword to save the output to a different file name.
# perf kvm --host --guest [kvm options] record -a -o my.perf.data
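The default file names described above can be summarized in a small lookup, which is handy in scripts that later need to find the recorded data. This is only a sketch of the rules as stated on this page: the function name default_perf_kvm_file is hypothetical, and the case where neither option is given is not covered by the text, so the sketch reports an error for it.

```shell
# Hypothetical helper: echo the default data file for a given combination
# of --host / --guest, following the rules described above.
default_perf_kvm_file() {
    host=0
    guest=0
    for arg in "$@"; do
        case $arg in
            --host)  host=1 ;;
            --guest) guest=1 ;;
        esac
    done
    case "$host$guest" in
        11) echo perf.data.kvm ;;
        10) echo perf.data.host ;;
        01) echo perf.data.guest ;;
        *)  echo "neither --host nor --guest given" >&2; return 1 ;;
    esac
}

# Usage:
#   default_perf_kvm_file --host --guest
```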
In order to record events for a guest, the perf tool needs the guest's /proc/kallsyms and /proc/modules files. These are passed to perf with the --guestkallsyms and --guestmodules options. The files have to be on the host, but you can get them easily using ssh.
# ssh guest "cat /proc/kallsyms" > /tmp/guest.kallsyms
# ssh guest "cat /proc/modules" > /tmp/guest.modules
It is better to use ssh to cat the files and redirect the output than to use scp. Experience has shown that scp guest:/proc/kallsyms /tmp/guest.kallsyms will return an empty file.
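Because an empty copy is easy to miss, it can help to check the fetched files before recording. A minimal sketch, in which the helper name require_nonempty is an assumption for illustration:

```shell
# Hypothetical helper: succeed only if every named file exists and is non-empty.
require_nonempty() {
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "empty or missing: $f" >&2
            return 1
        fi
    done
}

# Usage, with "guest" as the ssh host name used above:
#   ssh guest "cat /proc/kallsyms" > /tmp/guest.kallsyms
#   ssh guest "cat /proc/modules"  > /tmp/guest.modules
#   require_nonempty /tmp/guest.kallsyms /tmp/guest.modules || exit 1
```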
Using the --guestkallsyms and --guestmodules options, you can record events for a host and a guest with the following command:
# perf kvm --host --guest --guestkallsyms=/tmp/guest.kallsyms --guestmodules=/tmp/guest.modules record -a
The order of the arguments is important. In general, the syntax for using perf to profile kvm is:
perf kvm <perf kvm args> <perf command> <perf command args>
In this case the perf kvm arguments are --host --guest --guestkallsyms=/tmp/guest.kallsyms --guestmodules=/tmp/guest.modules, the perf command is record, and the perf command argument is -a (system-wide collection from all CPUs).
perf will record events until it is terminated with SIGINT. It must be SIGINT. If perf is terminated with any other signal, such as SIGTERM or SIGQUIT, the perf kvm report command (see below) won't correctly process the generated file. It will just list the headers with no data.
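When recording from a script rather than an interactive terminal, there is no Ctrl-C to deliver the SIGINT. One possible approach, sketched below, is the timeout utility from GNU coreutils with its -s option, which selects the signal to send once the time limit is reached. The wrapper name run_until_sigint and the 60-second duration are assumptions for illustration.

```shell
# Hypothetical helper: run a command for a fixed number of seconds, then
# stop it with SIGINT (timeout would send SIGTERM by default).
run_until_sigint() {
    secs=$1
    shift
    timeout -s INT "$secs" "$@"
}

# Usage: record host and guest events for 60 seconds.
#   run_until_sigint 60 perf kvm --host --guest \
#       --guestkallsyms=/tmp/guest.kallsyms \
#       --guestmodules=/tmp/guest.modules record -a
```

GNU timeout exits with status 124 when the time limit was reached, so a non-zero status from this wrapper does not by itself mean that the recording failed.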
perf kvm has a way of getting the guest's kallsyms and modules by itself instead of you having to provide them. It makes use of sshfs to mount the root file system of the guest so that it can get the files directly from the guest.
sshfs depends on fuse, the Filesystem in Userspace. If your Linux distribution does not have fuse, see the fuse project page on SourceForge. If your Linux distribution does not have sshfs, see the sshfs project page on SourceForge.
Here's how it works. You create a directory, e.g., /tmp/guestmount. You then create a subdirectory whose name is the PID of the qemu process for the guest. Then you use sshfs to mount the guest's root file system on that subdirectory. For example:
# mkdir -p /tmp/guestmount
# ps -eo pid,cmd | grep qemu | grep -v grep
24764 /usr/libexec/qemu-kvm -M pc -m 4096 -smp 4 -name guest01 -boot c -drive file=/var/lib/libvirt/images/guest01.img ...
# mkdir /tmp/guestmount/24764
# sshfs -o allow_other,direct_io guest:/ /tmp/guestmount/24764
Then, instead of using the --guestkallsyms and --guestmodules options, you use the --guestmount option.
# perf kvm --host --guest --guestmount=/tmp/guestmount
When you are finished recording, unmount the guest file system using fusermount.
# fusermount -u /tmp/guestmount/24764
Use the perf kvm report command to generate a report from a data file created by perf kvm record. To generate a report for the host, use the --host argument.
# perf kvm --host --guestmount=/tmp/guestmount report
To generate a report for the guest, use the --guest argument.
# perf kvm --guest --guestmount=/tmp/guestmount report
By default the perf report command will read perf.data.host for a host report and perf.data.guest for a guest report. Use the -i option if you want to use a different input file.
# perf kvm --host --guestmount=/tmp/guestmount report -i /tmp/perf.data.kvm
The perf report command sends its output to standard out. If you want the output to go to a file, redirect the output to a file. Don't use the -o option. Contrary to the help text, it does not work.
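The redirection can be wrapped in a small helper when several reports are saved from a script. A minimal sketch; the helper name save_stdout and the output path are assumptions for illustration:

```shell
# Hypothetical helper: run a command and save its standard output to a file.
save_stdout() {
    out=$1
    shift
    "$@" > "$out"
}

# Usage:
#   save_stdout /tmp/host-report.txt \
#       perf kvm --host --guestmount=/tmp/guestmount report
```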
This tool is very much like xenoprof (if I remember correctly), and traces kvm events smartly. Currently, it supports vmexit/mmio/ioport events.
Usage:
- to trace kvm events:
# ./perf kvm-events record
- to show the result:
# ./perf kvm-events report
Some sample output follows:
# ./perf kvm-events report
Warning: Error: expected type 5 but read 4
Warning: Error: expected type 5 but read 0
Warning: unknown op '}'

Analyze events for all VCPUs:

             VM-EXIT    Samples  Samples%     Time%   Avg time

         APIC_ACCESS     438107    44.89%     6.20%    17.91us
  EXTERNAL_INTERRUPT     219226    22.46%     8.01%    46.20us
      IO_INSTRUCTION     122651    12.57%     1.88%    19.44us
       EPT_VIOLATION      83110     8.52%     1.36%    20.75us
   PENDING_INTERRUPT      37055     3.80%     0.16%     5.38us
               CPUID      32718     3.35%     0.08%     3.15us
       EXCEPTION_NMI      23601     2.42%     0.17%     8.87us
                 HLT      15424     1.58%    82.12%  6735.06us
           CR_ACCESS       4089     0.42%     0.02%     6.08us

Total Samples:975981, Total events handled time:126502464.88us.

The default event to be analysed is vmexit. Use --event to select a different one. For example, to analyse mmio events:
# ./perf kvm-events report --event mmio
Warning: Error: expected type 5 but read 4
Warning: Error: expected type 5 but read 0
Warning: unknown op '}'

Analyze events for all VCPUs:

         MMIO Access    Samples  Samples%     Time%   Avg time

        0xfee00380:W     196589    64.95%    70.01%     3.83us
        0xfee00310:W      35356    11.68%     6.48%     1.97us
        0xfee00300:W      35356    11.68%    16.37%     4.97us
        0xfee00300:R      35356    11.68%     7.14%     2.17us

Total Samples:302657, Total events handled time:1074746.01us.

Use --vcpu to specify which vcpu is analysed:
[root@localhost perf]# ./perf kvm-events report --event mmio --vcpu 1
Warning: Error: expected type 5 but read 4
Warning: Error: expected type 5 but read 0
Warning: unknown op '}'

Analyze events for VCPU 1:

         MMIO Access    Samples  Samples%     Time%   Avg time

        0xfee00380:W      58041    71.20%    74.90%     3.70us
        0xfee00310:W       7826     9.60%     5.28%     1.93us
        0xfee00300:W       7826     9.60%    13.82%     5.06us
        0xfee00300:R       7826     9.60%     6.01%     2.20us

Total Samples:81519, Total events handled time:286577.81us.
The --key option sorts the result. The possible values are sample (the default, which sorts by number of samples) and time (which sorts by Time%):
# ./perf kvm-events report --key time
Warning: Error: expected type 5 but read 4
Warning: Error: expected type 5 but read 0
Warning: unknown op '}'

Analyze events for all VCPUs:

             VM-EXIT    Samples  Samples%     Time%   Avg time

                 HLT      15424     1.58%    82.12%  6735.06us
  EXTERNAL_INTERRUPT     219226    22.46%     8.01%    46.20us
         APIC_ACCESS     438107    44.89%     6.20%    17.91us
      IO_INSTRUCTION     122651    12.57%     1.88%    19.44us
       EPT_VIOLATION      83110     8.52%     1.36%    20.75us
       EXCEPTION_NMI      23601     2.42%     0.17%     8.87us
   PENDING_INTERRUPT      37055     3.80%     0.16%     5.38us
               CPUID      32718     3.35%     0.08%     3.15us
           CR_ACCESS       4089     0.42%     0.02%     6.08us

Total Samples:975981, Total events handled time:126502464.88us.

I hope you will like it. Any comments are welcome! :)
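A saved report can also be post-processed. The sketch below picks the event with the largest Time% from report text on standard input. The function name top_by_time and the file /tmp/kvm-events.txt are assumptions for illustration, and the parsing assumes the five-column row layout shown in the sample output above.

```shell
# Hypothetical helper: print the event with the highest Time% from
# kvm-events report text on stdin. Assumes data rows of the form
# "<event> <samples> <samples%> <time%> <avg time>".
top_by_time() {
    awk 'NF == 5 && $3 ~ /%$/ && $4 ~ /%$/ {
             t = $4 + 0                   # numeric prefix of e.g. "82.12%"
             if (t > max) { max = t; name = $1; pct = $4 }
         }
         END { if (name != "") print name, pct }'
}

# Usage:
#   ./perf kvm-events report > /tmp/kvm-events.txt
#   top_by_time < /tmp/kvm-events.txt
```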