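As I mentioned in an earlier post, I created a new directory under the root of the QEMU source tree specifically for reading the source and debugging.
OK, now let's open that directory: qemu/bin/debug/native
If you have read the earlier post "A few things about QEMU (4): downloading and compiling the QEMU source, and fdt", you know that I built the QEMU source in this directory. The once-empty folder is now stuffed with a pile of build output: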
[root@localhost qemu]# ls bin/debug/native/
chardev hmp.d pc-bios qemu-bridge-helper qmp-commands.h tests
accel.d config-all-devices.mak hmp.o po qemu-bridge-helper.d qmp.d tpm.d
accel.o config-all-disas.mak hw qapi qemu-bridge-helper.o qmp-introspect.c tpm.o
audio config-host.h io qapi-event.c qemu-ga qmp-introspect.d trace
backends config-host.h-timestamp iothread.d qapi-event.d qemu-img qmp-introspect.h trace-events-all
block config-host.mak iothread.o qapi-event.h qemu-img-cmds.h qmp-introspect.o trace-root.c
block.d config.log ivshmem-client qapi-event.o qemu-img.d qmp-marshal.c trace-root.c-timestamp
blockdev.d config.status ivshmem-server qapi-generated qemu-img.o qmp-marshal.d trace-root.d
blockdev-nbd.d contrib libqemustub.a qapi-types.c qemu-io qmp-marshal.o trace-root.h
blockdev-nbd.o cpus-common.d libqemuutil.a qapi-types.d qemu-io-cmds.d qmp.o trace-root.h-timestamp
blockdev.o cpus-common.o linux-headers qapi-types.h qemu-io-cmds.o qobject trace-root.o
blockjob.d crypto linux-user qapi-types.o qemu-io.d qom ui
blockjob.o device-hotplug.d Makefile qapi-visit.c qemu-io.o replay util
block.o device-hotplug.o migration qapi-visit.d qemu-nbd replication.d vl.d
disas module_block.h qapi-visit.h qemu-nbd.d replication.o vl.o
bt-host.d dma-helpers.d nbd qapi-visit.o qemu-nbd.o roms x86_64-softmmu
bt-host.o dma-helpers.o net qdev-monitor.d qemu-options.def slirp x86_64-softmmu-config-devices.mak.d
bt-vhci.d docs os-posix.d qdev-monitor.o qemu-version.h stubs
bt-vhci.o fsdev os-posix.o qdict-test-data.txt qga target
[root@localhost qemu]#
The first thing to look at is the main function in vl.c, which starts at line 2971:
2968 return 0;
2969 }
2970
2971 int main(int argc, char **argv, char **envp)
2972 {
2973 int i;
2974 int snapshot, linux_boot;
2975 const char *initrd_filename;
2976 const char *kernel_filename, *kernel_cmdline;
2977 const char *boot_order = NULL;
2978 const char *boot_once = NULL;
2979 DisplayState *ds;
2980 int cyls, heads, secs, translation;
2981 QemuOpts *opts, *machine_opts;
2982 QemuOpts *hda_opts = NULL, *icount_opts = NULL, *accel_opts = NULL;
2983 QemuOptsList *olist;
... ...
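What is the first thing to do when debugging with gdb?
Set a breakpoint, of course!
Where to put a breakpoint is something of an art. Well-placed breakpoints make debugging faster and more efficient. Enough chatter: where should our first breakpoint go?
Skimming through main in vl.c, a large part of the beginning is initialization of variables, arrays, and structs, registration of some functions, and argument parsing. Only at around line 4082 is the first pass of argument parsing finished. It is only a first pass because options that carry sub-options have not been fully parsed yet, or rather have not been processed any further. A simple example is the parsing and handling of the smp option further down, at line 4201: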
4201 smp_parse(qemu_opts_find(qemu_find_opts("smp-opts"), NULL));
4202
4203 machine_class->max_cpus = machine_class->max_cpus ?: 1; /* Default to UP */
4204 if (max_cpus > machine_class->max_cpus) {
4205 error_report("Number of SMP CPUs requested (%d) exceeds max CPUs "
4206 "supported by machine '%s' (%d)", max_cpus,
4207 machine_class->name, machine_class->max_cpus);
4208 exit(1);
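The check on lines 4203 to 4208 boils down to "treat an unset machine limit as 1, then reject a request that exceeds it". Here is a minimal sketch of that logic in portable C. The helper names (effective_max_cpus, smp_request_ok) are my own and do not exist in QEMU.

```c
#include <stdbool.h>

/* Hypothetical helper: an unset (0) machine limit means uniprocessor.
 * QEMU writes the same thing with the GNU `a ?: b` shorthand. */
static int effective_max_cpus(int machine_max_cpus)
{
    return machine_max_cpus ? machine_max_cpus : 1;
}

/* Hypothetical helper: true when the requested CPU count fits the machine. */
static bool smp_request_ok(int requested_max_cpus, int machine_max_cpus)
{
    return requested_max_cpus <= effective_max_cpus(machine_max_cpus);
}
```

With these helpers, a request for 2 CPUs on a machine type that never set a limit is rejected, which corresponds to the error_report and exit(1) path in the listing.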
Reading on, at around line 4400 we see some code related to booting the guest OS:
4399 machine_opts = qemu_get_machine_opts();
4400 kernel_filename = qemu_opt_get(machine_opts, "kernel");
4401 initrd_filename = qemu_opt_get(machine_opts, "initrd");
4402 kernel_cmdline = qemu_opt_get(machine_opts, "append");
4403 bios_name = qemu_opt_get(machine_opts, "firmware");
4404
4405 opts = qemu_opts_find(qemu_find_opts("boot-opts"), NULL);
4406 if (opts) {
4407 boot_order = qemu_opt_get(opts, "order");
4408 if (boot_order) {
4409 validate_bootdevices(boot_order, &error_fatal);
4410 }
... ....
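This means we are getting close to what we are looking for, so keep going. Around line 4587 the setup for the guest is basically complete and the machine is about to be created; the lines after that initialize some hardware devices: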
4587 current_machine->ram_size = ram_size;
4588 current_machine->maxram_size = maxram_size;
4589 current_machine->ram_slots = ram_slots;
4590 current_machine->boot_order = boot_order;
4591 current_machine->cpu_model = cpu_model;
4592
4593 machine_run_board_init(current_machine);
4594
4595 realtime_init();
4596
... ...
And then line 4701:
4700
4701 qdev_machine_creation_done();
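So we can be fairly sure that vcpu creation and initialization happen somewhere inside the call on line 4593. Set a breakpoint there, and let's get going with gdb for real.
Now run about the simplest command line possible inside gdb: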
r --enable-kvm -smp 2 -m 2048M -cpu host -hda /root/test/rhel7.qcow -monitor stdio
Then continue with c until the breakpoint is hit:
4593 machine_run_board_init(current_machine);
Stepping in, this is what we see:
736 void machine_run_board_init(MachineState *machine)
737 {
738 MachineClass *machine_class = MACHINE_GET_CLASS(machine);
739
740 if (nb_numa_nodes) {
741 machine_numa_validate(machine);
742 }
743 machine_class->init(machine);
744 }
Stepping into machine_class->init(machine) on line 743, we see:
(gdb) s
pc_init_v2_10 (machine=0x555556720000) at /root/qemu-2017-0531/qemu/hw/i386/pc_piix.c:449
449 DEFINE_I440FX_MACHINE(v2_10, "pc-i440fx-2.10", NULL,
(gdb) l
444 m->alias = "pc";
445 m->is_default = 1;
446 m->numa_auto_assign_ram = numa_legacy_auto_assign_ram;
447 }
448
449 DEFINE_I440FX_MACHINE(v2_10, "pc-i440fx-2.10", NULL,
450 pc_i440fx_2_10_machine_options);
451
Let's see what the macro on line 449 actually is:
419
420 #define DEFINE_I440FX_MACHINE(suffix, name, compatfn, optionfn) \
421 static void pc_init_##suffix(MachineState *machine) \
422 { \
423 void (*compat)(MachineState *m) = (compatfn); \
424 if (compat) { \
425 compat(machine); \
426 } \
427 pc_init1(machine, TYPE_I440FX_PCI_HOST_BRIDGE, \
428 TYPE_I440FX_PCI_DEVICE); \
429 } \
430 DEFINE_PC_MACHINE(suffix, name, pc_init_##suffix, optionfn)
431
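The reason gdb reports pc_init_v2_10 at line 449 is that the function only exists as the expansion of this macro. A minimal sketch of the same token-pasting trick, with made-up names (Machine, demo_init1, DEFINE_DEMO_MACHINE) standing in for the QEMU ones:

```c
#include <string.h>

/* Hypothetical stand-ins for QEMU's MachineState and pc_init1(). */
typedef struct Machine {
    const char *host_bridge;   /* records what the shared init was given */
    int compat_applied;        /* set by an optional compat hook */
} Machine;

static void demo_init1(Machine *m, const char *host_bridge)
{
    m->host_bridge = host_bridge;
}

static void demo_compat_old(Machine *m)
{
    m->compat_applied = 1;
}

/* `##` pastes the suffix into the function name, so each use of the macro
 * defines a distinct demo_init_<suffix>() that forwards to demo_init1(). */
#define DEFINE_DEMO_MACHINE(suffix, compatfn)            \
    static void demo_init_##suffix(Machine *m)           \
    {                                                    \
        void (*compat)(Machine *) = (compatfn);          \
        if (compat) {                                    \
            compat(m);                                   \
        }                                                \
        demo_init1(m, "demo-host-bridge");               \
    }

DEFINE_DEMO_MACHINE(v2_10, NULL)
DEFINE_DEMO_MACHINE(v2_9, demo_compat_old)
```

Each macro use produces one per-version entry point, and all of them funnel into the same shared init function, optionally after a compatibility hook.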
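So pc_init_v2_10 is generated by this macro and calls pc_init1, which in turn calls pc_cpus_init in hw/i386/pc.c:
1137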
1138 void pc_cpus_init(PCMachineState *pcms)
1139 {
1140 int i;
1141 CPUClass *cc;
1142 ObjectClass *oc;
1143 const char *typename;
1144 gchar **model_pieces;
1145 const CPUArchIdList *possible_cpus;
1146 MachineState *machine = MACHINE(pcms);
1147 MachineClass *mc = MACHINE_GET_CLASS(pcms);
1148
1149 /* init CPUs */
1150 if (machine->cpu_model == NULL) {
1151 #ifdef TARGET_X86_64
1152 machine->cpu_model = "qemu64";
... ...
...
...
1182 possible_cpus = mc->possible_cpu_arch_ids(machine);
1183 for (i = 0; i < smp_cpus; i++) {
1184 pc_new_cpu(typename, possible_cpus->cpus[i].arch_id, &error_fatal);
1185 }
What we are looking for is inside pc_new_cpu. Everything before it is just vcpu parameter and type configuration. Stepping in at line 1184 with gdb, we see:
pc_new_cpu (typename=0x555556699690 "qemu64-x86_64-cpu", apic_id=0, errp=0x555556683790 ) at /root/qemu-2017-0531/qemu/hw/i386/pc.c:1097
1096 static void pc_new_cpu(const char *typename, int64_t apic_id, Error **errp)
1097 {
1098 Object *cpu = NULL;
1099 Error *local_err = NULL;
1100
1101 cpu = object_new(typename);
(gdb)
1102
1103 object_property_set_int(cpu, apic_id, "apic-id", &local_err);
1104 object_property_set_bool(cpu, true, "realized", &local_err);
1105
1106 object_unref(cpu);
1107 error_propagate(errp, local_err);
1108 }
1109
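pc_new_cpu collects failures in a local_err and hands them to the caller's errp only at the end, after the cleanup has run. A simplified sketch of that pattern follows. DemoError and the demo_ functions are hypothetical and much cruder than QEMU's real Error API.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for QEMU's Error object. */
typedef struct DemoError {
    char msg[64];
} DemoError;

/* Store a new error in *errp; a NULL errp means the caller does not care. */
static void demo_error_set(DemoError **errp, const char *msg)
{
    if (errp == NULL) {
        return;
    }
    DemoError *e = calloc(1, sizeof(*e));
    if (e == NULL) {
        abort();
    }
    snprintf(e->msg, sizeof(e->msg), "%s", msg);
    *errp = e;
}

/* Hand a locally collected error to the caller, or drop it if unwanted. */
static void demo_error_propagate(DemoError **dst, DemoError *local)
{
    if (local == NULL) {
        return;
    }
    if (dst != NULL && *dst == NULL) {
        *dst = local;
    } else {
        free(local);
    }
}

static void demo_set_apic_id(long id, DemoError **errp)
{
    if (id < 0) {
        demo_error_set(errp, "apic-id must be non-negative");
    }
}

/* Same shape as pc_new_cpu(): work against local_err, propagate last. */
static void demo_new_cpu(long apic_id, DemoError **errp)
{
    DemoError *local_err = NULL;

    demo_set_apic_id(apic_id, &local_err);
    /* cleanup that must run on both paths would go here */
    demo_error_propagate(errp, local_err);
}
```

The point of the local variable is that the function can always inspect or clean up after an error itself, even when the caller passed NULL for errp.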
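Single-stepping further in gdb, nothing much changes after line 1103 executes, but after line 1104 executes a new thread shows up. QEMU itself is a userspace program, and it talks to KVM through ioctl calls on /dev/kvm (wrapped by kvm_ioctl and its relatives). So a virtual machine started by QEMU is really just a process, and each vcpu is a thread inside that process.
So we have good reason to believe that vcpu creation and initialization are done on line 1104. Step into it with gdb: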
object_property_set_bool (obj=0x555556779580, value=true, name=0x555555c453d0 "realized", errp=0x7fffffffdd68) at /root/qemu-2017-0531/qemu/qom/object.c:1162
1162 QBool *qbool = qbool_from_bool(value);
1158
1159 void object_property_set_bool(Object *obj, bool value,
1160 const char *name, Error **errp)
1161 {
1162 QBool *qbool = qbool_from_bool(value);
1163 object_property_set_qobject(obj, QOBJECT(qbool), name, errp);
1164
1165 QDECREF(qbool);
1166 }
Step into line 1163:
object_property_set_qobject (obj=0x555556779580, value=0x555556794a10, name=0x555555c453d0 "realized", errp=0x7fffffffdd68)
at /root/qemu-2017-0531/qemu/qom/qom-qobject.c:26
26 v = qobject_input_visitor_new(value);
(gdb) l
21 void object_property_set_qobject(Object *obj, QObject *value,
22 const char *name, Error **errp)
23 {
24 Visitor *v;
25
26 v = qobject_input_visitor_new(value);
27 object_property_set(obj, v, name, errp);
28 visit_free(v);
29 }
Continue into line 27:
object_property_set (obj=0x555556779580, v=0x555556796150, name=0x555555c453d0 "realized", errp=0x7fffffffdd68) at /root/qemu-2017-0531/qemu/qom/object.c:1086
1086 ObjectProperty *prop = object_property_find(obj, name, errp);
1083 void object_property_set(Object *obj, Visitor *v, const char *name,
1084 Error **errp)
1085 {
1086 ObjectProperty *prop = object_property_find(obj, name, errp);
1087 if (prop == NULL) {
1088 return;
1089 }
1090
(gdb)
1091 if (!prop->set) {
1092 error_setg(errp, QERR_PERMISSION_DENIED);
1093 } else {
1094 prop->set(obj, v, name, prop->opaque, errp);
1095 }
1096 }
Then into line 1094:
property_set_bool (obj=0x555556779580, v=0x555556796150, name=0x555555c453d0 "realized", opaque=0x55555673e240, errp=0x7fffffffdd68)
at /root/qemu-2017-0531/qemu/qom/object.c:1849
1849 {
(gdb) l
1844 visit_type_bool(v, name, &value, errp);
1845 }
1846
1847 static void property_set_bool(Object *obj, Visitor *v, const char *name,
1848 void *opaque, Error **errp)
1849 {
1850 BoolProperty *prop = opaque;
1851 bool value;
1852 Error *local_err = NULL;
1853
(gdb)
1854 visit_type_bool(v, name, &value, &local_err);
1855 if (local_err) {
1856 error_propagate(errp, local_err);
1857 return;
1858 }
1859
1860 prop->set(obj, value, errp);
1861 }
Into line 1860:
device_set_realized (obj=0x555556779580, value=true, errp=0x7fffffffdd68) at /root/qemu-2017-0531/qemu/hw/core/qdev.c:879
879 {
(gdb) l
874
875 return true;
876 }
877
878 static void device_set_realized(Object *obj, bool value, Error **errp)
879 {
880 DeviceState *dev = DEVICE(obj);
881 DeviceClass *dc = DEVICE_GET_CLASS(dev);
882 HotplugHandler *hotplug_ctrl;
883 BusState *bus;
(gdb)
Keep stepping down to line 917:
915
916 if (dc->realize) {
917 dc->realize(dev, &local_err);
918 }
919
920 if (local_err != NULL) {
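The path we just walked (set a bool property by name, find its setter, and let the "realized" setter call the class's realize hook) can be modeled in a few lines. This is only a sketch with hypothetical types (DemoDevice, DemoDeviceClass, DemoBoolProperty). The real QOM code goes through visitors and per-object property tables, which I leave out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef struct DemoDevice DemoDevice;

/* Hypothetical per-class table, standing in for DeviceClass. */
typedef struct DemoDeviceClass {
    void (*realize)(DemoDevice *dev);
} DemoDeviceClass;

struct DemoDevice {
    const DemoDeviceClass *klass;
    bool realized;
    int realize_calls;
};

/* Hypothetical property record: a name plus a typed setter. */
typedef struct DemoBoolProperty {
    const char *name;
    void (*set)(DemoDevice *dev, bool value);
} DemoBoolProperty;

/* Plays the role of device_set_realized(): flipping the property to true
 * on a not-yet-realized device invokes the class's realize hook. */
static void demo_set_realized(DemoDevice *dev, bool value)
{
    if (value && !dev->realized && dev->klass->realize) {
        dev->klass->realize(dev);
    }
    dev->realized = value;
}

static const DemoBoolProperty demo_props[] = {
    { "realized", demo_set_realized },
};

/* Plays the role of object_property_set_bool(): look the property up by
 * name, then call its setter. Returns false for an unknown property. */
static bool demo_property_set_bool(DemoDevice *dev, const char *name,
                                   bool value)
{
    for (size_t i = 0; i < sizeof(demo_props) / sizeof(demo_props[0]); i++) {
        if (strcmp(demo_props[i].name, name) == 0) {
            demo_props[i].set(dev, value);
            return true;
        }
    }
    return false;
}

static void demo_cpu_realize(DemoDevice *dev)
{
    dev->realize_calls++;   /* a real CPU would start its vcpu thread here */
}

static const DemoDeviceClass demo_cpu_class = { demo_cpu_realize };
```

Seen this way, "setting a property" on line 1104 of pc.c is really an indirect function call, which is why a harmless-looking assignment ends up creating a thread.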
Step into line 917:
x86_cpu_realizefn (dev=0x555556779580, errp=0x7fffffffdbb0) at /root/qemu-2017-0531/qemu/target/i386/cpu.c:3487
3487 {
(gdb) l
3482 (env)->cpuid_vendor3 == CPUID_VENDOR_INTEL_3)
3483 #define IS_AMD_CPU(env) ((env)->cpuid_vendor1 == CPUID_VENDOR_AMD_1 && \
3484 (env)->cpuid_vendor2 == CPUID_VENDOR_AMD_2 && \
3485 (env)->cpuid_vendor3 == CPUID_VENDOR_AMD_3)
3486 static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
3487 {
3488 CPUState *cs = CPU(dev);
3489 X86CPU *cpu = X86_CPU(dev);
3490 X86CPUClass *xcc = X86_CPU_GET_CLASS(dev);
3491 CPUX86State *env = &cpu->env;
(gdb)
Now we are in x86_cpu_realizefn. Further down in this function, at line 3648, the way QEMU creates a vcpu finally shows itself:
3648 qemu_init_vcpu(cs);
(gdb) s
qemu_init_vcpu (cpu=0x555556779580) at /root/qemu-2017-0531/qemu/cpus.c:1750
1750 cpu->nr_cores = smp_cores;
1748 void qemu_init_vcpu(CPUState *cpu)
1749 {
1750 cpu->nr_cores = smp_cores;
1751 cpu->nr_threads = smp_threads;
1752 cpu->stopped = true;
1753
1754 if (!cpu->as) {
(gdb)
1755 /* If the target cpu hasn't set up any address spaces itself,
1756 * give it the default one.
1757 */
1758 AddressSpace *as = address_space_init_shareable(cpu->memory,
1759 "cpu-memory");
1760 cpu->num_ases = 1;
1761 cpu_address_space_init(cpu, as, 0);
1762 }
1763
1764 if (kvm_enabled()) {
(gdb)
1765 qemu_kvm_start_vcpu(cpu);
1766 } else if (hax_enabled()) {
1767 qemu_hax_start_vcpu(cpu);
1768 } else if (tcg_enabled()) {
1769 qemu_tcg_init_vcpu(cpu);
1770 } else {
1771 qemu_dummy_start_vcpu(cpu);
1772 }
1773 }
The vcpu creation proper starts at line 1764. With KVM enabled, qemu_kvm_start_vcpu on line 1765 is called, so let's take a look at that function:
qemu_kvm_start_vcpu (cpu=0x555556779580) at /root/qemu-2017-0531/qemu/cpus.c:1717
1715
1716 static void qemu_kvm_start_vcpu(CPUState *cpu)
1717 {
1718 char thread_name[VCPU_THREAD_NAME_SIZE];
1719
1720 cpu->thread = g_malloc0(sizeof(QemuThread));
1721 cpu->halt_cond = g_malloc0(sizeof(QemuCond));
(gdb)
1722 qemu_cond_init(cpu->halt_cond);
1723 snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/KVM",
1724 cpu->cpu_index);
1725 qemu_thread_create(cpu->thread, thread_name, qemu_kvm_cpu_thread_fn,
1726 cpu, QEMU_THREAD_JOINABLE);
1727 while (!cpu->created) {
1728 qemu_cond_wait(&qemu_cpu_cond, &qemu_global_mutex);
1729 }
1730 }
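Lines 1725 to 1729 are a classic handshake: start the thread, then sleep on a condition variable until the new thread reports that it is set up. A minimal pthread sketch of that handshake, using my own names (DemoCpu, demo_start_vcpu). I am assuming the caller already holds the mutex, which is what the qemu_cond_wait call in the listing implies.

```c
#include <pthread.h>
#include <stdbool.h>

typedef struct DemoCpu {
    pthread_t thread;
    bool created;       /* set by the new thread once it is initialised */
} DemoCpu;

static pthread_mutex_t demo_global_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t demo_cpu_cond = PTHREAD_COND_INITIALIZER;

/* Thread body: take the global mutex, mark the CPU created, wake the
 * waiting creator. A real vcpu thread would then enter its run loop. */
static void *demo_cpu_thread_fn(void *arg)
{
    DemoCpu *cpu = arg;

    pthread_mutex_lock(&demo_global_mutex);
    cpu->created = true;
    pthread_cond_signal(&demo_cpu_cond);
    pthread_mutex_unlock(&demo_global_mutex);
    return NULL;
}

/* Caller must hold demo_global_mutex. pthread_cond_wait() releases it
 * while sleeping, which is what lets the new thread make progress. */
static int demo_start_vcpu(DemoCpu *cpu)
{
    int err = pthread_create(&cpu->thread, NULL, demo_cpu_thread_fn, cpu);
    if (err) {
        return err;
    }
    while (!cpu->created) {
        pthread_cond_wait(&demo_cpu_cond, &demo_global_mutex);
    }
    return 0;
}
```

The while loop around the wait matters: condition variables can wake spuriously, so the flag is rechecked every time.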
There, now it's clear: a vcpu is just a thread. Let's step into qemu_thread_create on line 1725 as well:
qemu_thread_create (thread=0x5555567a2210, name=0x7fffffffdaa0 "CPU 0/KVM", start_routine=0x555555791756 , arg=0x555556779580, mode=0)
at /root/qemu-2017-0531/qemu/util/qemu-thread-posix.c:468
465 void qemu_thread_create(QemuThread *thread, const char *name,
466 void *(*start_routine)(void*),
467 void *arg, int mode)
468 {
469 sigset_t set, oldset;
470 int err;
471 pthread_attr_t attr;
472
473 err = pthread_attr_init(&attr);
474 if (err) {
475 error_exit(err, __func__);
476 }
477
478 /* Leave signal handling to the iothread. */
479 sigfillset(&set);
480 pthread_sigmask(SIG_SETMASK, &set, &oldset);
481 err = pthread_create(&thread->thread, &attr, start_routine, arg);
482 if (err)
(gdb)
483 error_exit(err, __func__);
484
485 if (name_threads) {
486 qemu_thread_set_name(thread, name);
487 }
488
489 if (mode == QEMU_THREAD_DETACHED) {
490 err = pthread_detach(thread->thread);
491 if (err) {
492 error_exit(err, __func__);
(gdb)
493 }
494 }
495 pthread_sigmask(SIG_SETMASK, &oldset, NULL);
496
497 pthread_attr_destroy(&attr);
498 }
Next, let's look at qemu_kvm_cpu_thread_fn. The kvm_init_vcpu call inside it is the part that, with KVM enabled, is ultimately carried out by KVM:
1092 static void *qemu_kvm_cpu_thread_fn(void *arg)
1093 {
1094 CPUState *cpu = arg;
1095 int r;
1096
1097 rcu_register_thread();
1098
1099 qemu_mutex_lock_iothread();
1100 qemu_thread_get_self(cpu->thread);
1101 cpu->thread_id = qemu_get_thread_id();
1102 cpu->can_do_io = 1;
1103 current_cpu = cpu;
1104
1105 r = kvm_init_vcpu(cpu);
1106 if (r < 0) {
1107 fprintf(stderr, "kvm_init_vcpu failed: %s\n", strerror(-r));
1108 exit(1);
1109 }
1110
1111 kvm_init_cpu_signals(cpu);
I won't expand on kvm_init_vcpu here.
Then we let the program run on, and see:
[New Thread 0x7fffeffff700 (LWP 16755)]
Continuing.
[New Thread 0x7fffeffff700 (LWP 16756)]
[New Thread 0x7fffee1ff700 (LWP 16758)]
[New Thread 0x7fffed9fe700 (LWP 16759)]
VNC server running on ::1:5900
(qemu) info cpus
* CPU #0: pc=0x00000000000082ea thread_id=16555
CPU #1: pc=0x00000000000fd406 (halted) thread_id=16756
(qemu) [Thread 0x7fffee1ff700 (LWP 16758) exited]
[root@localhost ~]# ps -ef | grep qemu
root 13695 13557 0 Jun07 pts/1 00:00:10 gdb x86_64-softmmu/qemu-system-x86_64
root 15616 13695 0 00:31 pts/1 00:00:16 /root/qemu/bin/debug/native/x86_64-softmmu/qemu-system-x86_64 --enable-kvm -smp 2 -m 2048M -hda /root/test/rhel7_cpu2006.qcow -monitor stdio
root 16779 14422 0 01:25 pts/5 00:00:00 grep --color=auto qemu
[root@localhost ~]# pstree -p 15616
qemu-system-x86(15616)─┬─{qemu-system-x86}(15617)
├─{qemu-system-x86}(16555)
├─{qemu-system-x86}(16756)
└─{qemu-system-x86}(16759)
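The thread_id values printed by info cpus are the same numbers pstree shows under the QEMU process, for example 16756 for CPU #1: they are kernel thread IDs. As far as I know QEMU obtains them with the gettid system call on Linux. The sketch below (demo_ names are mine) shows that a thread's ID differs from the process ID while the main thread's ID equals it.

```c
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Kernel thread ID of the calling thread (the LWP number gdb prints). */
static pid_t demo_gettid(void)
{
    return (pid_t)syscall(SYS_gettid);
}

static void *demo_record_tid(void *arg)
{
    *(pid_t *)arg = demo_gettid();
    return NULL;
}

/* Spawn a thread, let it record its own thread ID, and wait for it. */
static int demo_spawn_and_get_tid(pid_t *tid_out)
{
    pthread_t t;
    int err = pthread_create(&t, NULL, demo_record_tid, tid_out);

    if (err) {
        return err;
    }
    return pthread_join(t, NULL);
}
```

That is why one ps line for the whole VM and several entries in pstree are two views of the same thing: one process, one thread per vcpu plus a few helpers.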
Putting it all together, the call chain is:
main(...) ==> machine_run_board_init(current_machine) ==> pc_init_v2_10(...) ==> pc_init1(...) ==> pc_cpus_init(...) ==> pc_new_cpu(...)
==> object_property_set_bool(...) ==> object_property_set_qobject(...) ==> object_property_set(...) ==> property_set_bool ==> device_set_realized
==> x86_cpu_realizefn ==> qemu_init_vcpu ==> qemu_kvm_start_vcpu ==> qemu_thread_create ==> qemu_kvm_cpu_thread_fn ==> kvm_init_vcpu
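As for how dc->realize comes to point at x86_cpu_realizefn, that is wired up when the type is registered:
==>type_init(x86_cpu_register_types)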
==>x86_cpu_register_types(void)
==> type_register_static(&x86_cpu_type_info);
==> static const TypeInfo x86_cpu_type_info = {}
==> .class_init = x86_cpu_common_class_init,
==> x86_cpu_common_class_init(ObjectClass *oc, void *data)
==> dc->realize = x86_cpu_realizefn;
==> x86_cpu_realizefn(DeviceState *dev, Error **errp)
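The registration chain above can be imitated in miniature: a function that runs at startup registers a static type description, and the class_init callback in that description fills in the realize pointer. In this sketch I use the GCC/Clang constructor attribute directly as a simplification. As far as I understand, QEMU's type_init only queues the registration function, which is then run during startup. All names here (DemoTypeInfo, demo_type_register_static and so on) are hypothetical.

```c
#include <stddef.h>
#include <string.h>

typedef struct DemoClass {
    void (*realize)(void);
} DemoClass;

typedef struct DemoTypeInfo {
    const char *name;
    void (*class_init)(DemoClass *klass);
} DemoTypeInfo;

#define DEMO_MAX_TYPES 8
static const DemoTypeInfo *demo_types[DEMO_MAX_TYPES];
static size_t demo_type_count;

static void demo_type_register_static(const DemoTypeInfo *info)
{
    if (demo_type_count < DEMO_MAX_TYPES) {
        demo_types[demo_type_count++] = info;
    }
}

static int demo_realize_calls;

static void demo_cpu_realizefn(void)
{
    demo_realize_calls++;
}

/* The "constructor" of the class: fills in its virtual methods. */
static void demo_cpu_class_init(DemoClass *klass)
{
    klass->realize = demo_cpu_realizefn;
}

static const DemoTypeInfo demo_cpu_type_info = {
    .name = "demo-cpu",
    .class_init = demo_cpu_class_init,
};

/* GCC/Clang extension: run this before main() to register the type. */
__attribute__((constructor))
static void demo_cpu_register_types(void)
{
    demo_type_register_static(&demo_cpu_type_info);
}

/* Find a registered type by name and build its class via class_init.
 * Returns 1 on success, 0 if no such type was registered. */
static int demo_class_for(const char *name, DemoClass *out)
{
    for (size_t i = 0; i < demo_type_count; i++) {
        if (strcmp(demo_types[i]->name, name) == 0) {
            out->realize = NULL;
            demo_types[i]->class_init(out);
            return 1;
        }
    }
    return 0;
}
```

Nothing in this sketch calls demo_cpu_realizefn by name, just as nothing in device_set_realized names x86_cpu_realizefn. The link exists only through the pointer set in class_init.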
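Look closely at the source and you will find that the QEMU folks have painstakingly implemented a whole bunch of classes in plain C, complete with constructors and a pile of template-like machinery. All I want to say is: couldn't you just have used C++?
After such a long detour the code is really awkward to read. If I find the time later, I'll write about how KVM implements the vcpu.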