kvm trace performance events

This page describes how to count and trace performance events in the KVM kernel module.

Two tools, kvm_stat and kvm_trace, were previously used for these tasks. Both tasks can now be done with standard Linux tracing tools.

Contents

  • 1 Counting events
  • 2 Tracing events
  • 3 Recording events
    • 3.1 Recording events for a guest
      • 3.1.1 Using copies of guest files
      • 3.1.2 Using sshfs
  • 4 Reporting events
  • 5 Links

Counting events

Often you want event counts after running a benchmark:

$ sudo mount -t debugfs none /sys/kernel/debug
$ sudo ./perf stat -e 'kvm:*' -a sleep 1h
^C
 Performance counter stats for 'sleep 1h':

           8330  kvm:kvm_entry            #      0.000 M/sec
              0  kvm:kvm_hypercall        #      0.000 M/sec
           4060  kvm:kvm_pio              #      0.000 M/sec
              0  kvm:kvm_cpuid            #      0.000 M/sec
           2681  kvm:kvm_apic             #      0.000 M/sec
           8343  kvm:kvm_exit             #      0.000 M/sec
            737  kvm:kvm_inj_virq         #      0.000 M/sec
              0  kvm:kvm_page_fault       #      0.000 M/sec
              0  kvm:kvm_msr              #      0.000 M/sec
            664  kvm:kvm_cr               #      0.000 M/sec
            872  kvm:kvm_pic_set_irq      #      0.000 M/sec
              0  kvm:kvm_apic_ipi         #      0.000 M/sec
            738  kvm:kvm_apic_accept_irq  #      0.000 M/sec
            874  kvm:kvm_set_irq          #      0.000 M/sec
            874  kvm:kvm_ioapic_set_irq   #      0.000 M/sec
              0  kvm:kvm_msi_set_irq      #      0.000 M/sec
            433  kvm:kvm_ack_irq          #      0.000 M/sec
           2685  kvm:kvm_mmio             #      0.000 M/sec

    3.493562100  seconds time elapsed

The perf tool is part of the Linux kernel tree in tools/perf.
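
Many of the counters in a run like the one above are zero, so it can help to filter them out. The following is a minimal sketch, not part of perf: the function name nonzero_kvm_events is made up for illustration, and it assumes the two leading columns (count, event name) shown in the sample output.

```shell
#!/bin/sh
# Hypothetical helper (not part of perf): list only the KVM tracepoints
# with a nonzero count, largest first. Reads "perf stat" output on stdin.
nonzero_kvm_events() {
    awk '$2 ~ /^kvm:/ {
             gsub(/,/, "", $1)          # tolerate thousands separators
             if ($1 + 0 > 0) print $1, $2
         }' | sort -k1,1nr
}

# Example (perf stat writes its report to stderr, hence 2>&1):
#   sudo ./perf stat -e 'kvm:*' -a sleep 10 2>&1 | nonzero_kvm_events
```

With the sample output above, the first lines printed would be the kvm_exit and kvm_entry counts.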

Tracing events

Detailed traces can be generated using ftrace:

# mount -t debugfs none /sys/kernel/debug
# echo 1 >/sys/kernel/debug/tracing/events/kvm/enable
# cat /sys/kernel/debug/tracing/trace_pipe
[...]
             kvm-5664  [000] 11906.220178: kvm_entry: vcpu 0
             kvm-5664  [000] 11906.220181: kvm_exit: reason apic_access rip 0xc011518c
             kvm-5664  [000] 11906.220183: kvm_mmio: mmio write len 4 gpa 0xfee000b0 val 0x0
             kvm-5664  [000] 11906.220183: kvm_apic: apic_write APIC_EOI = 0x0
             kvm-5664  [000] 11906.220184: kvm_ack_irq: irqchip IOAPIC pin 11
             kvm-5664  [000] 11906.220185: kvm_entry: vcpu 0
             kvm-5664  [000] 11906.220188: kvm_exit: reason io_instruction rip 0xc01e4473
             kvm-5664  [000] 11906.220188: kvm_pio: pio_read at 0xc13e size 2 count 1
             kvm-5664  [000] 11906.220193: kvm_entry: vcpu 0
^C
# echo 0 >/sys/kernel/debug/tracing/events/kvm/enable
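
A raw trace gets long quickly, so a per-event tally is often a useful first look. The sketch below is an illustration only: count_trace_events is a made-up name, and it assumes the line layout shown above (task-pid, [cpu], timestamp, then the event name followed by a colon). Kernels that print extra flag columns would need a different field number.

```shell
#!/bin/sh
# Hypothetical helper: count trace lines per event name.
# Assumed layout: task-pid  [cpu]  timestamp:  event:  details...
count_trace_events() {
    awk '$2 ~ /^\[[0-9]+\]$/ { sub(/:$/, "", $4); n[$4]++ }
         END { for (e in n) print n[e], e }' | sort -k1,1nr -k2,2
}

# Example, reading the buffered trace instead of the live pipe:
#   cat /sys/kernel/debug/tracing/trace | count_trace_events
```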

Recording events

Events can be recorded to a file for later reporting and analysis. You can record events for the host using the --host option. You can record events for a guest using the --guest option. You can use both options at the same time to record events for both the host and a guest.

If you use just the --host option the default output file will be perf.data.host. If you use just the --guest option the default output file will be perf.data.guest. If you use both options the default output file will be perf.data.kvm. Use the -o option after the record keyword to save the output to a different file name.

# perf kvm --host --guest [kvm options] record -a -o my.perf.data
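
The default file names described above matter later, because the report step has to read the same file. As a small illustration only, the rule can be written down as a shell function. The name default_kvm_output is made up, and the function simply restates the text above rather than querying perf.

```shell
#!/bin/sh
# Hypothetical helper restating the rule above: print the default output
# file for a given combination of --host and --guest.
default_kvm_output() {
    host=0
    guest=0
    for arg in "$@"; do
        case "$arg" in
            --host)  host=1 ;;
            --guest) guest=1 ;;
        esac
    done
    if [ "$host" -eq 1 ] && [ "$guest" -eq 1 ]; then
        echo perf.data.kvm
    elif [ "$host" -eq 1 ]; then
        echo perf.data.host
    elif [ "$guest" -eq 1 ]; then
        echo perf.data.guest
    else
        # The text above does not cover the case with neither option.
        return 1
    fi
}

# Example:
#   default_kvm_output --host --guest    # prints perf.data.kvm
```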

Recording events for a guest

Using copies of guest files

In order to record events for a guest, the perf tool needs the /proc/kallsyms and /proc/modules for the guest. These are passed to perf with the --guestkallsyms and --guestmodules options. The files will have to be on the host, but you can get them easily using ssh.

# ssh guest "cat /proc/kallsyms" > /tmp/guest.kallsyms
# ssh guest "cat /proc/modules" > /tmp/guest.modules

It is better to use ssh to cat the files and redirect the output than to use scp. Experience has shown that scp guest:/proc/kallsyms /tmp/guest.kallsyms will return an empty file.
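
Since an empty copy is easy to miss, it may be worth checking the result right after fetching it. The following is a minimal sketch under assumptions: fetch_guest_proc is a made-up name, and "guest" is the same ssh host name used in the commands above. Note that an empty modules file can be legitimate if the guest kernel has no modules loaded, so treat the error as a prompt to look, not as proof of failure.

```shell
#!/bin/sh
# Hypothetical helper: copy one /proc file from the guest via "ssh ... cat"
# and complain if the copy on the host turns out to be empty.
fetch_guest_proc() {
    # $1 = ssh host, $2 = file name under /proc, $3 = destination on the host
    ssh "$1" "cat /proc/$2" > "$3" || return 1
    [ -s "$3" ] || { echo "error: $3 is empty" >&2; return 1; }
}

# Example:
#   fetch_guest_proc guest kallsyms /tmp/guest.kallsyms
#   fetch_guest_proc guest modules  /tmp/guest.modules
```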

Using these options you can record the events for a host and a guest with the following command:

# perf kvm --host --guest --guestkallsyms=/tmp/guest.kallsyms --guestmodules=/tmp/guest.modules record -a

The order of the arguments is important. In general, the syntax for using perf to profile kvm is:

perf kvm <perf kvm args> <perf command> <perf command args>

In this case the perf kvm arguments are --host --guest --guestkallsyms=/tmp/guest.kallsyms --guestmodules=/tmp/guest.modules, the perf command is record, and the perf command argument is -a (profile all processes).

perf will record events until it is terminated with SIGINT. It must be SIGINT. If perf is terminated with any other signal, such as SIGTERM or SIGQUIT, the perf kvm report command (see below) won't correctly process the file generated. It will just list the headers with no data.

Using sshfs

perf kvm has a way of getting the guest's kallsyms and modules by itself instead of you having to provide them. It makes use of sshfs to mount the root file system of the guest so that it can get the files directly from the guest.

sshfs depends on fuse, the Filesystem in Userspace. If your Linux distribution does not have fuse, see the fuse project page on SourceForge. If your Linux distribution does not have sshfs, see the sshfs project page on SourceForge.

Here's how it works. You create a directory, e.g., /tmp/guestmount. You then create a subdirectory that has for its name the PID of the qemu process for the guest. Then you use sshfs to mount the guest's root file system on that subdirectory. For example:

# mkdir -p /tmp/guestmount
# ps -eo pid,cmd | grep qemu | grep -v grep
24764 /usr/libexec/qemu-kvm -M pc -m 4096 -smp 4 -name guest01 -boot c -drive file=/var/lib/libvirt/images/guest01.img ...
# mkdir /tmp/guestmount/24764
# sshfs -o allow_other,direct_io guest:/ /tmp/guestmount/24764

Then, instead of using the --guestkallsyms and --guestmodules options, you use the --guestmount option.

# perf kvm --host --guest --guestmount=/tmp/guestmount record -a

When you are finished recording, unmount the guest file system using fusermount.

# fusermount -u /tmp/guestmount/24764
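
The directory layout is the part that is easy to get wrong: the mount point must be the base directory plus the qemu PID. A small sketch of that step follows, assuming nothing beyond the layout described above. The function name guest_mount_point is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: create <base>/<pid> and print the path, ready to be
# used as the sshfs mount point for the guest whose qemu process has that PID.
guest_mount_point() {
    # $1 = base directory (e.g. /tmp/guestmount), $2 = qemu PID
    case "$2" in
        ''|*[!0-9]*) echo "error: PID must be numeric: $2" >&2; return 1 ;;
    esac
    mkdir -p "$1/$2" && echo "$1/$2"
}

# Example, using the PID found with ps above:
#   mnt=$(guest_mount_point /tmp/guestmount 24764)
#   sshfs -o allow_other,direct_io guest:/ "$mnt"
#   ... record with --guestmount=/tmp/guestmount ...
#   fusermount -u "$mnt"
```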

Reporting events

Use the perf kvm report command to generate a report from a data file created by perf kvm record. To generate a report for the host, use the --host argument.

# perf kvm --host --guestmount=/tmp/guestmount report

To generate a report for the guest, use the --guest argument.

# perf kvm --guest --guestmount=/tmp/guestmount report

By default the perf report command will read perf.data.host for a host report and perf.data.guest for a guest report. Use the -i option if you want to use a different input file.

# perf kvm --host --guestmount=/tmp/guestmount report -i /tmp/perf.data.kvm

The perf report command sends its output to standard out. If you want the output to go to a file, redirect the output to a file. Don't use the -o option. Contrary to the help text, it does not work.

Links

  • ftrace.txt
  • events.txt
  • tracepoint-analysis.txt


The perf kvm-events tool is much like xenoprof (if I remember correctly) and traces
KVM events smartly. Currently, it supports vmexit, mmio, and ioport events.

Usage:
- to trace kvm events:
# ./perf kvm-events record

- to show the result:
# ./perf kvm-events report

Some sample output follows:
# ./perf kvm-events report
  Warning: Error: expected type 5 but read 4
  Warning: Error: expected type 5 but read 0
  Warning: unknown op '}'


Analyze events for all VCPUs:

             VM-EXIT    Samples      Samples%         Time%        Avg time

         APIC_ACCESS     438107        44.89%         6.20%        17.91us
  EXTERNAL_INTERRUPT     219226        22.46%         8.01%        46.20us
      IO_INSTRUCTION     122651        12.57%         1.88%        19.44us
       EPT_VIOLATION      83110         8.52%         1.36%        20.75us
   PENDING_INTERRUPT      37055         3.80%         0.16%         5.38us
               CPUID      32718         3.35%         0.08%         3.15us
       EXCEPTION_NMI      23601         2.42%         0.17%         8.87us
                 HLT      15424         1.58%        82.12%      6735.06us
           CR_ACCESS       4089         0.42%         0.02%         6.08us

Total Samples:975981, Total events handled time:126502464.88us.

The default event to be analysed is vmexit. Use --event to choose a different one.
For example, to analyse mmio events:
# ./perf kvm-events report --event mmio
  Warning: Error: expected type 5 but read 4
  Warning: Error: expected type 5 but read 0
  Warning: unknown op '}'


Analyze events for all VCPUs:

         MMIO Access    Samples      Samples%         Time%        Avg time

        0xfee00380:W     196589        64.95%        70.01%         3.83us
        0xfee00310:W      35356        11.68%         6.48%         1.97us
        0xfee00300:W      35356        11.68%        16.37%         4.97us
        0xfee00300:R      35356        11.68%         7.14%         2.17us

Total Samples:302657, Total events handled time:1074746.01us.

We can use --vcpu to specify which vcpu is traced:
# ./perf kvm-events report --event mmio --vcpu 1
  Warning: Error: expected type 5 but read 4
  Warning: Error: expected type 5 but read 0
  Warning: unknown op '}'


Analyze events for VCPU 1:

         MMIO Access    Samples      Samples%         Time%        Avg time

        0xfee00380:W      58041        71.20%        74.90%         3.70us
        0xfee00310:W       7826         9.60%         5.28%         1.93us
        0xfee00300:W       7826         9.60%        13.82%         5.06us
        0xfee00300:R       7826         9.60%         6.01%         2.20us

Total Samples:81519, Total events handled time:286577.81us.

The '--key' option sorts the result. The possible values are sample (the default,
sorts by number of samples) and time (sorts by Time%):
# ./perf kvm-events report --key time
  Warning: Error: expected type 5 but read 4
  Warning: Error: expected type 5 but read 0
  Warning: unknown op '}'


Analyze events for all VCPUs:

             VM-EXIT    Samples      Samples%         Time%        Avg time

                 HLT      15424         1.58%        82.12%      6735.06us
  EXTERNAL_INTERRUPT     219226        22.46%         8.01%        46.20us
         APIC_ACCESS     438107        44.89%         6.20%        17.91us
      IO_INSTRUCTION     122651        12.57%         1.88%        19.44us
       EPT_VIOLATION      83110         8.52%         1.36%        20.75us
       EXCEPTION_NMI      23601         2.42%         0.17%         8.87us
   PENDING_INTERRUPT      37055         3.80%         0.16%         5.38us
               CPUID      32718         3.35%         0.08%         3.15us
           CR_ACCESS       4089         0.42%         0.02%         6.08us

Total Samples:975981, Total events handled time:126502464.88us.
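
The Time% column in the VM-EXIT tables above can be cross-checked from the other columns, since Samples multiplied by Avg time gives the total time spent on each exit reason. For HLT that is 15424 x 6735.06us, about 103.9 million us, or roughly 82% of the reported total. The sketch below does this for every row. It is an illustration only: event_total_time is a made-up name, and it assumes the five-column row layout shown above with Avg time ending in "us".

```shell
#!/bin/sh
# Hypothetical helper: for each data row of a kvm-events report table,
# print the event name and Samples x Avg time (in microseconds).
# Header, warning, and total lines do not match the pattern and are skipped.
event_total_time() {
    awk 'NF == 5 && $5 ~ /^[0-9.]+us$/ {
             avg = $5
             sub(/us$/, "", avg)
             printf "%s %.2f\n", $1, $2 * avg
         }'
}

# Example:
#   ./perf kvm-events report --key time | event_total_time
```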

I hope you will like it. Any comments are welcome! :)

