Linux System Calls

POSIX APIs and System Calls

The difference between an API and a system call

API: Function definition that specifies how to obtain a given service

System Call: an explicit request to the kernel made via a software interrupt

 

Wrapper routine: routine whose only purpose is to issue a system call

 

The POSIX standard refers to a set of APIs, not to system calls.
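To make the distinction concrete, here is a minimal sketch, assuming Linux with glibc. The helper names write_via_api and write_via_syscall are made up for these notes. write() is the POSIX API (implemented as a wrapper routine), while syscall() is the generic libc entry point that takes the system call number explicitly.

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/syscall.h>  /* SYS_write */
#include <unistd.h>       /* write(), syscall() */

/* Hypothetical helper: go through the POSIX API (the libc wrapper routine). */
ssize_t write_via_api(int fd, const char *msg)
{
    return write(fd, msg, strlen(msg));
}

/* Hypothetical helper: name the system call by number through syscall(). */
ssize_t write_via_syscall(int fd, const char *msg)
{
    return (ssize_t)syscall(SYS_write, fd, msg, strlen(msg));
}
```

Both helpers should end up in the same kernel service routine; only the first one is something POSIX says anything about.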

 

 

System Call Handler and Service Routines

The conventions for return values of system calls are different from those of wrapper routines.

A return value of 0 or a positive integer indicates that the system call completed successfully, while a negative integer indicates a failure. The kernel itself does not set errno; it is the wrapper routine’s responsibility to set it from the negative return value.
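The conversion done by a wrapper routine can be sketched as follows. This is an illustration, not the actual libc code: wrapper_result is a hypothetical name, and the range [-4095, -1] for error codes is an assumption based on the usual Linux convention.

```c
#include <errno.h>

/* Hypothetical helper mimicking what a wrapper routine does with the raw
 * value handed back by the kernel. */
long wrapper_result(long raw)
{
    /* Assumption: raw values in [-4095, -1] encode an error code. */
    if (raw < 0 && raw >= -4095) {
        errno = (int)-raw;   /* e.g. -EBADF becomes errno == EBADF */
        return -1;           /* the API-level failure value */
    }
    return raw;              /* success: pass the value through unchanged */
}
```

So a raw return of -EBADF reaches the application as -1 with errno set to EBADF, while a successful return such as a byte count is passed through as is.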

 

System call number: identifies the system call to invoke

System call handler: similar to exception handlers

Save registers -> call system call service routine -> load registers and switch back to user mode

 

Naming rules:

System call: xxxx()

System service routine: sys_xxxx()

 

System call dispatch table

System call number <-> service routine
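A dispatch table is essentially an array of function pointers indexed by the system call number. The sketch below is a user-space model only; every name in it (demo_call_table, demo_dispatch, the DEMO_NR_ numbers and the demo_sys_ routines) is hypothetical. Returning -ENOSYS for an unknown number follows the kernel's convention.

```c
#include <errno.h>
#include <stddef.h>

/* All names below are hypothetical; this is a user-space model, not kernel code. */
typedef long (*service_routine_t)(long a, long b, long c);

static long demo_sys_getanswer(long a, long b, long c)
{
    (void)a; (void)b; (void)c;
    return 42;
}

static long demo_sys_add(long a, long b, long c)
{
    (void)c;
    return a + b;
}

enum { DEMO_NR_getanswer = 0, DEMO_NR_add = 1, DEMO_NR_syscalls = 2 };

/* The table: entry n holds the service routine for system call number n. */
static const service_routine_t demo_call_table[DEMO_NR_syscalls] = {
    [DEMO_NR_getanswer] = demo_sys_getanswer,
    [DEMO_NR_add]       = demo_sys_add,
};

/* Model of the handler's lookup step: validate the number, then call. */
long demo_dispatch(unsigned long nr, long a, long b, long c)
{
    if (nr >= DEMO_NR_syscalls || demo_call_table[nr] == NULL)
        return -ENOSYS;   /* no such system call */
    return demo_call_table[nr](a, b, c);
}
```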

 

 

Entering and Exiting a system call

Two ways to enter and exit a system call

Enter: int $0x80                 Exit: iret

Enter: sysenter                  Exit: sysexit

 

0x80 (128) in Interrupt Descriptor Table

set_system_gate(0x80, &system_call);

(See interrupt, trap and system gates in chapter 4)

 

 

 

Parameter passing

System call numbers: (example)

#define __NR_restart_syscall    (__NR_SYSCALL_BASE+  0)
#define __NR_exit               (__NR_SYSCALL_BASE+  1)
#define __NR_fork               (__NR_SYSCALL_BASE+  2)
#define __NR_read               (__NR_SYSCALL_BASE+  3)
#define __NR_write              (__NR_SYSCALL_BASE+  4)
#define __NR_open               (__NR_SYSCALL_BASE+  5)
#define __NR_close              (__NR_SYSCALL_BASE+  6)
                                /* 7 was sys_waitpid */
#define __NR_creat              (__NR_SYSCALL_BASE+  8)
#define __NR_link               (__NR_SYSCALL_BASE+  9)
#define __NR_unlink             (__NR_SYSCALL_BASE+ 10)
#define __NR_execve             (__NR_SYSCALL_BASE+ 11)
#define __NR_chdir              (__NR_SYSCALL_BASE+ 12)
#define __NR_time               (__NR_SYSCALL_BASE+ 13)
#define __NR_mknod              (__NR_SYSCALL_BASE+ 14)
#define __NR_chmod              (__NR_SYSCALL_BASE+ 15)

 

The system call number is set by wrapper routines, so the programmer usually does not need to care about it.

-> Set the eax register to the system call number

-> Write the parameters into CPU registers (we cannot pass them on the stack as usual, because a system call crosses from the User Mode stack to the Kernel Mode stack)

-> Issue int $0x80 or sysenter

-> The kernel copies the parameters from the registers onto the Kernel Mode stack before invoking the service routine
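The register-based sequence can be sketched with GCC-style inline assembly. This is a minimal illustration under assumptions: raw_syscall3 is a hypothetical helper, the int $0x80 path is only compiled on 32-bit x86 (where ebx, ecx and edx carry the first three parameters), and on any other architecture the sketch falls back to the generic libc syscall() function.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: invoke a three-argument system call "by hand". */
long raw_syscall3(long nr, long a1, long a2, long a3)
{
#if defined(__i386__)
    long ret;
    /* eax = system call number; ebx, ecx, edx = first three parameters. */
    __asm__ volatile ("int $0x80"
                      : "=a"(ret)
                      : "a"(nr), "b"(a1), "c"(a2), "d"(a3)
                      : "memory");
    return ret;   /* raw kernel value: a negative number means failure */
#else
    /* Other architectures use different registers and instructions, so
     * fall back to libc; this path returns -1 and sets errno on failure. */
    return syscall(nr, a1, a2, a3);
#endif
}
```

For example, raw_syscall3(SYS_getpid, 0, 0, 0) should return the same value as getpid(); the unused parameters are simply ignored by the kernel.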

 

(Ordinary C functions receive their parameters on a stack, either the Kernel Mode stack or the User Mode stack.)

 

Because we use registers to pass parameters, two conditions must be satisfied:

1. The length of each parameter cannot exceed the length of a register.

2. The number of parameters cannot exceed six, not counting the system call number passed in eax.

 

These two conditions have two consequences. First, large parameters must be passed by reference. Second, if more than six parameters are needed, a single register is used to point to a memory area in the process address space that holds the parameter values. Of course, the programmer need not care about this workaround: the wrapper routine finds the appropriate way to pass the parameters to the kernel.
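The "one register points to a memory area" workaround can be modeled in plain C. Everything here is hypothetical (struct demo_args, demo_sys_sum7, demo_sum7) and runs entirely in user space; a real service routine would also have to validate the pointer and copy the block into kernel memory before using it.

```c
/* Hypothetical argument block: seven values, one more than fits in registers. */
struct demo_args {
    long v[7];
};

/* Hypothetical service routine: receives a single pointer-sized parameter. */
long demo_sys_sum7(const struct demo_args *args)
{
    long sum = 0;
    for (int i = 0; i < 7; i++)
        sum += args->v[i];
    return sum;
}

/* Hypothetical wrapper: exposes seven parameters and hides the packing. */
long demo_sum7(long a, long b, long c, long d, long e, long f, long g)
{
    struct demo_args args = { { a, b, c, d, e, f, g } };
    return demo_sys_sum7(&args);   /* only one pointer crosses the boundary */
}
```

The caller of demo_sum7 never sees the struct, which is the point made above: the packing is the wrapper routine's business, not the programmer's.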

Kernel Wrapper Routines

Although system calls are mainly used by User Mode processes, they can also be invoked by kernel threads, which cannot use library functions.

 

 

References:

Daniel P. Bovet and Marco Cesati, Understanding the Linux Kernel, 3rd edition (O'Reilly)

 

 
