Fuchsia Source Code Analysis: The System Call Flow, Using zx_channel_create as an Example

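    • How Fuchsia system calls are defined
    • How the syscall definition files are compiled
    • The userspace call path of a Fuchsia system call
      • The declaration of zx_channel_create
      • The implementation of zx_channel_create
      • How kazoo processes the syscall definition files
    • The kernel-space implementation of Fuchsia system calls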

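I have been working hard at reading the Fuchsia source code lately. To be honest, Fuchsia is written largely in modern C++ (C++11 and later) style, and it is still a struggle to read, much like when I first started reading Java or Python code. I hope that with some persistence I will get used to this newer C++ style.
  I kept wondering where to start reading the Fuchsia source as a whole. After trying quite a few entry points, I settled on system calls, with the goal of getting the complete syscall flow straight first. So for now I start from a single system call, zx_channel_create, and trace it from the application layer all the way down to the kernel. This article walks through Fuchsia system calls from the following angles.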

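  1. How Fuchsia system calls are defined
  2. How the syscall definition files are compiled
  3. The userspace call path of a Fuchsia system call
  4. The kernel-space implementation of Fuchsia system calls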

How Fuchsia system calls are defined

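All of the system's syscall definitions live in the fuchsia/zircon/vdso/ directory.
This directory contains a large number of files with the .fidl extension. The zx_channel_create call we want to trace is defined in channel.fidl, so let's open that file: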

library zx;

using ObjType = uint32;

// TODO(scottmg): ZX_OBJ_TYPE_xyz here.

using HandleOp = uint32;

// TODO(scottmg): ZX_HANDLE_OP_xyz here.

struct HandleInfo {
    handle handle;
    ObjType type;
    rights rights;
    uint32 unused;
};

struct ChannelCallArgs {
    vector wr_bytes;
    vector wr_handles;
    // TODO(scottmg): mutable_vector_void
    vector rd_bytes;
    // TODO(scottmg): mutable_vector_handle
    vector rd_handles;
};

struct HandleDisposition {
    HandleOp operation;
    handle handle;
    ObjType type;
    rights rights;
    status result;
};

[Transport = "Syscall"]
protocol channel {
    /// Create a channel.
    channel_create(uint32 options) -> (status status, handle out0, handle out1);

    /// Read a message from a channel.
    /// Rights: handle must be of type ZX_OBJ_TYPE_CHANNEL and have ZX_RIGHT_READ.
    [ArgReorder = "handle, options, bytes, handles, num_bytes, num_handles, actual_bytes, actual_handles",
    HandleUnchecked]
    channel_read(handle handle,
                 uint32 options)
        -> (status status,
            vector_void_u32size bytes,
            vector_handle_u32size handles,
            optional_uint32 actual_bytes,
            optional_uint32 actual_handles);
........

The important definition in this FIDL file is [Transport = "Syscall"]. The file also defines a method named channel_create, which is not quite the zx_channel_create we set out to find. Don't worry, let's keep looking.

How the syscall definition files are compiled

The page https://fuchsia.dev/fuchsia-src/reference/syscalls has the following description:
Syscall support is generated from //zircon/syscalls. The FIDL files in that directory are first run through fidlc which produces an intermediate format. That intermediate format is consumed by kazoo which produces output for both the kernel and userspace in a variety of languages. This output includes C or C++ headers for both the kernel and userspace, syscall entry points, other language bindings, and so on.

This tool is invoked as a part of the build, rather than checking in its output.

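In other words, the FIDL files in zircon/syscalls are first turned by fidlc into an intermediate format. Because these FIDL files carry [Transport = "Syscall"], they are treated as syscall definitions, so kazoo turns that intermediate format into the C/C++ headers used by the kernel and by userspace. For now we won't worry about which files kazoo generates or how fidlc consumes the FIDL files. We will come back to those points later.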

The userspace call path of a Fuchsia system call

In userspace we can pick almost any file as a starting point, for example src/lib/fsl/io/device_watcher.cc, and trace from there.

  zx::channel client, server;
  if (zx::channel::create(0, &client, &server) != ZX_OK) {
    return nullptr;
  }

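The call goes through zx::channel::create. channel is a class that derives from object, which fits the character of Fuchsia as a system whose kernel is implemented in C++: everything is an object.
Next, let's find the implementation of the channel class in zircon/system/ulib/zx/channel.cc: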

#include <lib/zx/channel.h>

#include <zircon/syscalls.h>

namespace zx {

zx_status_t channel::create(uint32_t flags, channel* endpoint0, channel* endpoint1) {
  // Ensure aliasing of both out parameters to the same container
  // has a well-defined result, and does not leak.
  channel h0;
  channel h1;
  zx_status_t status =
      zx_channel_create(flags, h0.reset_and_get_address(), h1.reset_and_get_address());
  endpoint0->reset(h0.release());
  endpoint1->reset(h1.release());
  return status;
}

}  // namespace zx
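The comment in channel::create about aliased out parameters is easy to skim past, so here is a minimal sketch of why the two temporaries matter. Everything in it is a hypothetical stand-in (fake_handle_t, fake_channel_create, the Handle class and the g_open_handles counter are not part of the Zircon API). It only mimics the reset / release / reset_and_get_address pattern used above, assuming those methods behave the way their names suggest.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical stand-ins; none of these names are the real Zircon API.
using fake_handle_t = uint32_t;
constexpr fake_handle_t kInvalidHandle = 0;

static int g_open_handles = 0;         // how many fake handles are currently open
static fake_handle_t g_next_handle = 1;

// Stand-in for the zx_channel_create syscall: hands out two raw handles.
int fake_channel_create(fake_handle_t* out0, fake_handle_t* out1) {
  *out0 = g_next_handle++;
  *out1 = g_next_handle++;
  g_open_handles += 2;
  return 0;
}

// Minimal RAII owner, loosely in the spirit of zx::object.
class Handle {
 public:
  Handle() = default;
  Handle(const Handle&) = delete;
  Handle& operator=(const Handle&) = delete;
  ~Handle() { reset(); }

  // Close whatever is currently held, then take ownership of `value`.
  void reset(fake_handle_t value = kInvalidHandle) {
    if (value_ != kInvalidHandle) --g_open_handles;
    value_ = value;
  }
  // Give up ownership without closing.
  fake_handle_t release() { return std::exchange(value_, kInvalidHandle); }
  // Close the current handle and expose the slot for a syscall to fill in.
  fake_handle_t* reset_and_get_address() {
    reset();
    return &value_;
  }
  bool is_valid() const { return value_ != kInvalidHandle; }

 private:
  fake_handle_t value_ = kInvalidHandle;
};

// Same shape as zx::channel::create: fill two temporaries, then hand them over.
int create_pair(Handle* endpoint0, Handle* endpoint1) {
  Handle h0;
  Handle h1;
  int status = fake_channel_create(h0.reset_and_get_address(), h1.reset_and_get_address());
  endpoint0->reset(h0.release());
  endpoint1->reset(h1.release());  // if both pointers alias, this closes h0's handle
  return status;
}
```

If a caller passes the same object for both endpoints, the second reset closes the first handle, so the object ends up owning exactly one endpoint and nothing leaks. Writing the syscall results straight into the aliased object would instead overwrite one raw handle with the other.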

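Here we finally see the zx_channel_create function for the first time. Next we will look at three questions:
1) Where zx_channel_create is declared.
2) Where zx_channel_create is implemented.
3) How zx_channel_create relates to the channel_create defined in channel.fidl.

Let's start with the first question:

The declaration of zx_channel_create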

Notice that zircon/system/ulib/zx/channel.cc includes this header:

#include <zircon/syscalls.h>

The header is located at ./zircon/system/public/zircon/syscalls.h
and its contents are as follows:


#ifndef SYSROOT_ZIRCON_SYSCALLS_H_
#define SYSROOT_ZIRCON_SYSCALLS_H_

#include 
#include 
#include 
#include 
#include 
#include 

__BEGIN_CDECLS

#define _ZX_SYSCALL_DECL(name, type, attrs, nargs, arglist, prototype) \
  extern attrs type zx_##name prototype;                               \
  extern attrs type _zx_##name prototype;

#ifdef __clang__
#define _ZX_SYSCALL_ANNO(attr) __attribute__((attr))
#else
#define _ZX_SYSCALL_ANNO(attr)  // Nothing for compilers without the support.
#endif

#include <zircon/syscalls/internal/cdecls.inc>

#undef _ZX_SYSCALL_ANNO
#undef _ZX_SYSCALL_DECL

// Compatibility wrappers for deprecated syscalls also go here, when
// there are any.

// This DEPRECATED interface is replaced by zx_system_get_version_string.
zx_status_t zx_system_get_version(char* version, size_t version_size) __LEAF_FN;
zx_status_t _zx_system_get_version(char* version, size_t version_size) __LEAF_FN;

__END_CDECLS

#endif  // SYSROOT_ZIRCON_SYSCALLS_H_

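This header defines a macro named _ZX_SYSCALL_DECL and then includes an .inc file:

#include <zircon/syscalls/internal/cdecls.inc>

That file lives at ./out/default.zircon/gen/include/zircon/syscalls/internal/cdecls.inc
so it is clearly generated during the build.
Opening it, we find an entry for channel_create: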

_ZX_SYSCALL_DECL(channel_create, zx_status_t, /* no attributes */, 3,
    (options, out0, out1), (
    uint32_t options,
    _ZX_SYSCALL_ANNO(acquire_handle("Fuchsia")) zx_handle_t* out0,
    _ZX_SYSCALL_ANNO(acquire_handle("Fuchsia")) zx_handle_t* out1))

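The _ZX_SYSCALL_DECL macro is defined in ./zircon/system/public/zircon/syscalls.h, just before the #include <zircon/syscalls/internal/cdecls.inc> line. During compilation the contents of cdecls.inc are therefore pulled into syscalls.h, and macro expansion turns the entry into a declaration of zx_channel_create. Here is the definition of _ZX_SYSCALL_DECL again: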

#define _ZX_SYSCALL_DECL(name, type, attrs, nargs, arglist, prototype) \
  extern attrs type zx_##name prototype;                               \
  extern attrs type _zx_##name prototype;

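Applying the macro to the cdecls.inc entry confirms that this is where zx_channel_create is declared: the entry expands to extern declarations of both zx_channel_create and _zx_channel_create, each taking (uint32_t options, zx_handle_t* out0, zx_handle_t* out1) and returning zx_status_t.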
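To check the expansion by hand, the sketch below feeds the channel_create entry through a copy of the macro outside the Fuchsia tree. The typedefs are stand-ins for the real ones in the Zircon headers, the _ZX_SYSCALL_ANNO annotation is defined away, the extern "C" wrapping of the real header is omitted, and the two function bodies are fake ones added only so the example links. On Fuchsia those symbols are provided by the vDSO.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Stand-in types so the sketch compiles outside the Fuchsia tree (assumption).
typedef int32_t zx_status_t;
typedef uint32_t zx_handle_t;

// Same shape as the macro in syscalls.h.
#define _ZX_SYSCALL_DECL(name, type, attrs, nargs, arglist, prototype) \
  extern attrs type zx_##name prototype;                               \
  extern attrs type _zx_##name prototype;

// Annotations are dropped in this sketch.
#define _ZX_SYSCALL_ANNO(attr)

// The entry as it appears in cdecls.inc.
_ZX_SYSCALL_DECL(channel_create, zx_status_t, /* no attributes */, 3,
    (options, out0, out1), (
    uint32_t options,
    _ZX_SYSCALL_ANNO(acquire_handle("Fuchsia")) zx_handle_t* out0,
    _ZX_SYSCALL_ANNO(acquire_handle("Fuchsia")) zx_handle_t* out1))

#undef _ZX_SYSCALL_ANNO
#undef _ZX_SYSCALL_DECL

// The macro produced a declaration with exactly this signature.
static_assert(std::is_same<decltype(&zx_channel_create),
                           zx_status_t (*)(uint32_t, zx_handle_t*, zx_handle_t*)>::value,
              "zx_channel_create is declared by the macro expansion");

// Fake bodies so the example links; these are NOT the real implementation.
zx_status_t zx_channel_create(uint32_t options, zx_handle_t* out0, zx_handle_t* out1) {
  (void)options;
  *out0 = 1;
  *out1 = 2;
  return 0;
}
zx_status_t _zx_channel_create(uint32_t options, zx_handle_t* out0, zx_handle_t* out1) {
  return zx_channel_create(options, out0, out1);
}
```

The nargs and arglist parameters are ignored by this particular macro. Other consumers of the same generated entries use them, which is why every entry carries them.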

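Now let's see where zx_channel_create is implemented:

The implementation of zx_channel_create

Finding the implementation took some effort, mainly because of the clever way it is done. It is worth reading slowly. I had not expected code to be written this way.

First we look at ./zircon/system/ulib/zircon/syscalls-x86.S, since I am tracing the Fuchsia source on x86. On ARM, look at the ARM version in the same directory.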

// Copyright 2016 The Fuchsia Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.

/* define and implement the zircon syscall wrappers for x86-64 */

#include "syscall-entry.h"
#include "zircon-syscall-x86.S"

.text

.cfi_sections .eh_frame, .debug_frame

// The following assembly code converts arguments from the x86-64 SysV
// ABI's function calling conventions to the conventions used for Zircon
// syscalls:
//
//   arg 1: stays in %rdi
//   arg 2: stays in %rsi
//   arg 3: stays in %rdx
//   arg 4: moved from %rcx to %r10
//   arg 5: stays in %r8
//   arg 6: stays in %r9
//   arg 7: moved from 8(%rsp) to %r12
//   arg 8: moved from 16(%rsp) to %r13

.macro m_syscall name, num, nargs, public
syscall_entry_begin \name
    .cfi_same_value %r12
    .cfi_same_value %r13
.if \nargs <= 3
    zircon_syscall \num, \name, \name
    ret
.elseif \nargs <= 6
    mov      %rcx, %r10  // Argument 4
    zircon_syscall \num, \name, \name
    ret
.elseif \nargs == 7
    push_reg %r12
    mov      0x10(%rsp), %r12  // Argument 7
    mov      %rcx, %r10  // Argument 4
    zircon_syscall \num, \name, \name
    pop_reg  %r12
    ret
.elseif \nargs == 8
    push_reg %r12
    push_reg %r13
    mov      0x18(%rsp), %r12  // Argument 7
    mov      0x20(%rsp), %r13  // Argument 8
    mov      %rcx, %r10  // Argument 4
    zircon_syscall \num, \name, \name
    pop_reg  %r13
    pop_reg  %r12
    ret
.endif
syscall_entry_end \name \public
.endm

#include "syscalls-stubs.S"

At the top, this assembly file includes:

#include "syscall-entry.h"
#include "zircon-syscall-x86.S"

and at the bottom it includes:

#include "syscalls-stubs.S"

Looking closely at ./zircon/system/ulib/zircon/syscalls-x86.S, we see that it defines one macro:

.macro m_syscall name, num, nargs, public
syscall_entry_begin \name
    .cfi_same_value %r12
    .cfi_same_value %r13
.if \nargs <= 3
    zircon_syscall \num, \name, \name
    ret
.elseif \nargs <= 6
    mov      %rcx, %r10  // Argument 4
    zircon_syscall \num, \name, \name
    ret
.elseif \nargs == 7
    push_reg %r12
    mov      0x10(%rsp), %r12  // Argument 7
    mov      %rcx, %r10  // Argument 4
    zircon_syscall \num, \name, \name
    pop_reg  %r12
    ret
.elseif \nargs == 8
    push_reg %r12
    push_reg %r13
    mov      0x18(%rsp), %r12  // Argument 7
    mov      0x20(%rsp), %r13  // Argument 8
    mov      %rcx, %r10  // Argument 4
    zircon_syscall \num, \name, \name
    pop_reg  %r13
    pop_reg  %r12
    ret
.endif
syscall_entry_end \name \public
.endm

This macro is essentially the body of the file. It in turn uses three other macros: syscall_entry_begin, syscall_entry_end and zircon_syscall. syscall_entry_begin and syscall_entry_end are defined in ./zircon/system/ulib/zircon/syscall-entry.h:

.macro syscall_entry_begin name
.globl SYSCALL_\name
.hidden SYSCALL_\name
.type SYSCALL_\name,STT_FUNC
SYSCALL_\name:
.cfi_startproc
.endm

.macro syscall_entry_end name public=1
.cfi_endproc
.size SYSCALL_\name, . - SYSCALL_\name

// Create a hidden alias for the syscall which is prefixed with CODE_.  This
// allows the macros which perform redirection in the kernel to redirect a VDSO
// entry to either an explicit CODE_ alternate, or to another syscall if needed.
.globl CODE_SYSCALL_\name
.hidden CODE_SYSCALL_\name
CODE_SYSCALL_\name = SYSCALL_\name
.size CODE_SYSCALL_\name, . - SYSCALL_\name

// For wrapper functions, aliasing is handled by the generator.
.if \public
.globl _\name
.type _\name,STT_FUNC
_\name = SYSCALL_\name
.size _\name, . - SYSCALL_\name

.weak \name
.type \name,STT_FUNC
\name = SYSCALL_\name
.size \name, . - SYSCALL_\name

.globl VDSO_\name
.hidden VDSO_\name
.type VDSO_\name,STT_FUNC
VDSO_\name = SYSCALL_\name
.size VDSO_\name, . - SYSCALL_\name
.endif

.endm

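syscall_entry_begin takes the name passed in and emits the symbol SYSCALL_name, marked as a function. syscall_entry_end then emits further function symbols as aliases of SYSCALL_name: a hidden CODE_SYSCALL_name, and for public syscalls a global _name, a weak name and a hidden VDSO_name.

The zircon_syscall macro is defined in ./zircon/system/ulib/zircon/zircon-syscall-x86.S: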

.macro zircon_syscall num, name, caller
    mov $\num, %eax
    syscall
// This symbol at the return address identifies this as an approved call site.
    .hidden CODE_SYSRET_\name\()_VIA_\caller
CODE_SYSRET_\name\()_VIA_\caller\():
.endm

// CFI aware push and pop macros.
.macro push_reg reg
    push \reg
    .cfi_adjust_cfa_offset 8
    .cfi_rel_offset \reg, 0
.endm
.macro pop_reg reg
    pop \reg
    .cfi_adjust_cfa_offset -8
    .cfi_same_value \reg
.endm

Let's set these aside for a moment and look at the file included at the bottom of the assembly file,
./zircon/system/ulib/zircon/syscalls-stubs.S:

// Copyright 2016 The Fuchsia Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.

#include 

// One of these macros is invoked by syscalls.inc for each syscall.

// These don't have kernel entry points.
#define VDSO_SYSCALL(...)

// These are the direct kernel entry points.
#define KERNEL_SYSCALL(name, type, attrs, nargs, arglist, prototype) \
  m_syscall zx_##name, ZX_SYS_##name, nargs, 1

// These are internal kernel entry points called by other vDSO functions.
#define INTERNAL_SYSCALL(name, type, attrs, nargs, arglist, prototype) \
  m_syscall zx_##name, ZX_SYS_##name, nargs, 0
#define BLOCKING_SYSCALL(...) INTERNAL_SYSCALL(__VA_ARGS__)

#include <lib/syscalls/syscalls.inc>

#undef VDSO_SYSCALL
#undef KERNEL_SYSCALL
#undef INTERNAL_SYSCALL
#undef BLOCKING_SYSCALL

// Compatibility aliases

#define ALIAS(oldname, newname) \
.globl oldname ;\
.type oldname,STT_FUNC ;\
oldname = newname ;\

This assembly file defines a new macro, KERNEL_SYSCALL, whose expansion invokes the m_syscall macro defined above. The file also includes the generated syscalls.inc:
./out/default.zircon/gen/vdso/include/lib/syscalls/syscalls.inc

KERNEL_SYSCALL(channel_create, zx_status_t, /* no attributes */, 3,
    (options, out0, out1), (
    uint32_t options,
    _ZX_SYSCALL_ANNO(acquire_handle("Fuchsia")) zx_handle_t* out0,
    _ZX_SYSCALL_ANNO(acquire_handle("Fuchsia")) zx_handle_t* out1))

Now picture ./zircon/system/ulib/zircon/syscalls-x86.S after preprocessing. Each KERNEL_SYSCALL entry expands into an m_syscall invocation:

// These are the direct kernel entry points.
#define KERNEL_SYSCALL(name, type, attrs, nargs, arglist, prototype) \
  m_syscall zx_##name, ZX_SYS_##name, nargs, 1

The original name channel_create has become zx_channel_create. Looks familiar now, right?
When m_syscall is expanded, syscall_entry_begin defines the symbol SYSCALL_zx_channel_create and marks it as a function:

.macro syscall_entry_begin name
.globl SYSCALL_\name
.hidden SYSCALL_\name
.type SYSCALL_\name,STT_FUNC
SYSCALL_\name:
.cfi_startproc
.endm

The zircon_syscall macro then supplies the body of SYSCALL_zx_channel_create:

.macro zircon_syscall num, name, caller
    mov $\num, %eax
    syscall
// This symbol at the return address identifies this as an approved call site.
    .hidden CODE_SYSRET_\name\()_VIA_\caller
CODE_SYSRET_\name\()_VIA_\caller\():
.endm

// CFI aware push and pop macros.
.macro push_reg reg
    push \reg
    .cfi_adjust_cfa_offset 8
    .cfi_rel_offset \reg, 0
.endm
.macro pop_reg reg
    pop \reg
    .cfi_adjust_cfa_offset -8
    .cfi_same_value \reg
.endm

Finally, m_syscall uses syscall_entry_end to define aliases for SYSCALL_zx_channel_create:

// For wrapper functions, aliasing is handled by the generator.
.if \public
.globl _\name
.type _\name,STT_FUNC
_\name = SYSCALL_\name
.size _\name, . - SYSCALL_\name

.weak \name
.type \name,STT_FUNC
\name = SYSCALL_\name
.size \name, . - SYSCALL_\name

That alias is zx_channel_create, so we have found the implementation. Neat, isn't it? In the end the function is nothing more than the x86 syscall instruction, with the syscall number passed in eax.
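The aliasing done by syscall_entry_end can be imitated from C++ with compiler attributes. The sketch below is only an analogy: it assumes an ELF target and GCC or Clang (the alias attribute is not available on every platform), and SYSCALL_zx_demo, _zx_demo and zx_demo are hypothetical names, not real Zircon symbols. One function body is given a strong alias with a leading underscore and a weak alias without it, mirroring the `_\name = SYSCALL_\name` and `.weak \name` lines above.

```cpp
#include <cassert>

// The single real body, playing the role of SYSCALL_\name.
// extern "C" keeps the symbol name unmangled so the alias attribute can refer to it.
extern "C" int SYSCALL_zx_demo(int x) { return x + 1; }

// Strong alias, like `_\name = SYSCALL_\name`.
extern "C" int _zx_demo(int x) __attribute__((alias("SYSCALL_zx_demo")));

// Weak alias, like `.weak \name` followed by `\name = SYSCALL_\name`.
// Being weak, it can be overridden by another definition of zx_demo at link time.
extern "C" int zx_demo(int x) __attribute__((weak, alias("SYSCALL_zx_demo")));
```

Calling zx_demo(1), _zx_demo(1) or SYSCALL_zx_demo(1) runs the same code, because all three symbols resolve to one address. No extra function body or jump is emitted for the aliases.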

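How kazoo processes the syscall definition files

As mentioned earlier, kazoo generates the syscall-related files. They include at least these two:
1) ./out/default.zircon/gen/vdso/include/lib/syscalls/syscalls.inc
2) ./out/default.zircon/gen/vdso/include/lib/syscalls/kernel.inc
The first was covered above and the second comes up in the kernel section below, so I won't go into detail here. My guess is that kazoo generates these headers from the fidl files in zircon/vdso/.

That completes the userspace side of the syscall flow: the declaration and implementation of the syscall function, and how it is invoked. Next, let's see how kernel space responds to a system call made from userspace.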

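The kernel-space implementation of Fuchsia system calls

On the kernel side we start with ./zircon/kernel/arch/x86/mp.cc.
The code there runs during system boot, and the line relevant to system calls is this one: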

  /* load the syscall entry point */
  write_msr(X86_MSR_IA32_LSTAR, (uint64_t)&x86_syscall);

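For how x86_init_percpu comes to be called, see my earlier article on the Fuchsia x86 kernel boot code.

This line writes the address of x86_syscall, the syscall handler, into the IA32_LSTAR MSR, which is where the CPU jumps when userspace executes the syscall instruction.
zircon/kernel/arch/x86/syscall.S includes ./out/default.zircon/gen/vdso/include/lib/syscalls/kernel.inc.

That file contains entries like this: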

KERNEL_SYSCALL(channel_create, zx_status_t, /* no attributes */, 3,
    (options, out0, out1), (
    uint32_t options,
    _ZX_SYSCALL_ANNO(acquire_handle("Fuchsia")) zx_handle_t* out0,
    _ZX_SYSCALL_ANNO(acquire_handle("Fuchsia")) zx_handle_t* out1))

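We met KERNEL_SYSCALL earlier, when looking at how the userspace syscall functions are declared and implemented. Here, however, it cannot be expanded with the earlier definition, because in kernel space the macro is defined differently: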

// These are the direct kernel entry points.
#define KERNEL_SYSCALL(name, type, attrs, nargs, arglist, prototype) \
  syscall_dispatch nargs, name
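Redefining one macro before including the same generated list is the classic X-macro technique, and it is what lets userspace and the kernel share a single syscall description. The sketch below shows the idea in a self-contained form. DEMO_SYSCALLS is a hypothetical inline list standing in for the generated .inc file, and the DEMO_ names are illustrative, not what kazoo actually emits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical syscall list standing in for a generated .inc file.
// Each entry is (name, number of arguments).
#define DEMO_SYSCALLS(X) \
  X(channel_create, 3)   \
  X(channel_read, 8)     \
  X(handle_close, 1)

// Expansion 1: assign each syscall a number, in list order.
enum DemoSyscallNumber {
#define X(name, nargs) DEMO_SYS_##name,
  DEMO_SYSCALLS(X)
#undef X
  DEMO_SYS_COUNT
};

// Expansion 2: build a table of user-facing names (with the zx_ prefix)
// and argument counts from the very same list.
struct DemoEntry {
  const char* stub_name;
  int nargs;
};

static const DemoEntry kDemoTable[] = {
#define X(name, nargs) {"zx_" #name, nargs},
  DEMO_SYSCALLS(X)
#undef X
};

// Both expansions walk the same list, so numbers and table rows stay in sync.
static_assert(sizeof(kDemoTable) / sizeof(kDemoTable[0]) == DEMO_SYS_COUNT,
              "one table row per syscall number");
```

Adding a syscall means adding one line to the list. Every consumer picks it up on the next build, and the numbering and the table cannot drift apart.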

syscall_dispatch expands as follows:

// Adds a label for making the syscall and adds it to the jump table.
.macro syscall_dispatch nargs, syscall
    .pushsection .text.syscall-dispatch,"ax",%progbits
    LOCAL_FUNCTION(.Lcall_\syscall\())
        // See x86_syscall for why this is here.
        cfi_outermost_frame
        pre_\nargs\()_args
        call wrapper_\syscall
        post_\nargs\()_args
    END_FUNCTION(.Lcall_\syscall\())
    .popsection
    .pushsection .rodata.syscall-table,"a",%progbits
        .quad .Lcall_\syscall
    .popsection
.endm

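After expansion this defines a local function .Lcall_channel_create (the kernel-side macro passes the name through without the zx_ prefix). The function is placed in the .text.syscall-dispatch section and its address is appended to the .rodata.syscall-table section of the ELF file. The function itself calls wrapper_channel_create.
Before it includes kernel.inc, zircon/kernel/arch/x86/syscall.S expands the start_syscall_dispatch macro to open the .rodata.syscall-table section: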

// Adds the label for the jump table.
.macro start_syscall_dispatch
    .pushsection .rodata.syscall-table,"a",%progbits
    .balign 8
    .Lcall_wrapper_table:
    .popsection
.endm

This ELF section acts as a table of function pointers, one per syscall handler, laid out in exactly the same order as the syscall numbers.
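A table of function pointers indexed by syscall number can be modelled in a few lines of C++. The sketch below is a plain analogy for the idea, not kernel code: the handler functions, kDemoSysCount and kDemoErrBadSyscall are hypothetical, and it omits the speculation barrier that the real assembly needs.

```cpp
#include <cassert>
#include <cstdint>

// Every handler has the same signature so they can share one table.
using Handler = int64_t (*)(uint64_t arg);

// Hypothetical handlers; the return values just make them distinguishable.
static int64_t demo_channel_create(uint64_t arg) { return 100 + static_cast<int64_t>(arg); }
static int64_t demo_handle_close(uint64_t arg) { return 200 + static_cast<int64_t>(arg); }

constexpr uint64_t kDemoSysCount = 2;        // plays the role of ZX_SYS_COUNT
constexpr int64_t kDemoErrBadSyscall = -1;   // hypothetical error code

// Entry i is the handler for syscall number i.
static const Handler kCallTable[kDemoSysCount] = {
    demo_channel_create,  // syscall number 0
    demo_handle_close,    // syscall number 1
};

int64_t demo_dispatch(uint64_t syscall_num, uint64_t arg) {
  // Unsigned bounds check first: any number >= the count is unknown.
  if (syscall_num >= kDemoSysCount) {
    return kDemoErrBadSyscall;
  }
  // Index the table and call through the pointer.
  return kCallTable[syscall_num](arg);
}
```

Because the index is unsigned, a single comparison rejects every out-of-range number, including values that would be negative if read as signed.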
When a system call arrives:

    cmp     $ZX_SYS_COUNT, %rax
    jae     .Lunknown_syscall
    leaq    .Lcall_wrapper_table(%rip), %r11
    movq    (%r11,%rax,8), %r11
    // LFENCE stalls dispatch until outcome of bounds check is resolved and system call handler
    // is known, preventing a Spectre V1 and V2 attack at this site.
    lfence
    jmp     *%r11

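rax holds the syscall number. If it is greater than or equal to ZX_SYS_COUNT, the syscall is unknown. Otherwise the code takes the address of .Lcall_wrapper_table and adds 8 times the syscall number (pointers are 8 bytes on a 64-bit system) to locate the handler for that number. That handler calls the wrapper function, which in our case is wrapper_channel_create:
./out/default.zircon/gen/vdso/include/lib/syscalls/kernel-wrappers.inc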

syscall_result wrapper_channel_create(uint32_t options, zx_handle_t* out0, zx_handle_t* out1, uint64_t pc) {
    return do_syscall(ZX_SYS_channel_create, pc, &VDso::ValidSyscallPC::channel_create, [&](ProcessDispatcher* current_process) -> uint64_t {
        user_out_handle out_handle_out0;
        user_out_handle out_handle_out1;
        auto result = sys_channel_create(options, &out_handle_out0, &out_handle_out1);
        if (result != ZX_OK)
            return result;
        if (out_handle_out0.begin_copyout(current_process, make_user_out_ptr(out0)))
            return ZX_ERR_INVALID_ARGS;
        if (out_handle_out1.begin_copyout(current_process, make_user_out_ptr(out1)))
            return ZX_ERR_INVALID_ARGS;
        out_handle_out0.finish_copyout(current_process);
        out_handle_out1.finish_copyout(current_process);
        return result;
    });
}

Finally, sys_channel_create is implemented in ./zircon/kernel/lib/syscalls/channel.cc, which completes the trace.
