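Chapter 9. Blocking Processes
9.1. Blocking Processes
What do you do when somebody asks you for something you can't do right away? If you're a human being and you're bothered by another human being, all you can do is say: "Not right now, I'm busy. Go away!" But if you're a kernel module and you're bothered by a process, you have another option: you can put the process to sleep until you can service it. After all, the kernel puts processes to sleep and wakes them up all the time (that is how multiple processes appear to run at the same time on a single CPU).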
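This kernel module is an example of that. The file (called /proc/sleep) can only be opened by one process at a time. If the file is already open, the kernel module calls wait_event_interruptible. This function changes the state of the task (a task is the kernel data structure that holds information about a process and the system call it is in, if any) to TASK_INTERRUPTIBLE, which means the task will not run until it is woken up, and adds it to WaitQ, the queue of tasks waiting to access the file. The function then calls the scheduler to switch to a different process.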
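When a process is done with the file, it closes it, and module_close is called. That function wakes up all the processes in the queue (there is no mechanism to wake up only one of them). It then returns, and the process that just closed the file continues to run. In time, the scheduler decides that this process has run long enough and gives control of the CPU to another process. Eventually, one of the processes that was in the queue is given control of the CPU by the scheduler. It resumes right after the call to module_interruptible_sleep_on (wait_event_interruptible in the listing below). It then sets a global variable to tell all the other processes that the file is still open, and goes on with its own work. When the other processes get their share of the CPU, they see that global variable and go back to sleep.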
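The wake-everyone-and-recheck pattern described above can be sketched in user space. This is only an analogy, not the kernel API: it assumes POSIX threads, the names exclusive_open, exclusive_try_open and exclusive_close are hypothetical, and the mutex is something a condition variable requires rather than something taken from the module. The try variant exists mainly so the sketch can be exercised from a single thread.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical user-space stand-ins for Already_Open and WaitQ. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t wait_q = PTHREAD_COND_INITIALIZER;
static int already_open = 0;

/* Sleep until the resource is free, then claim it. */
int exclusive_open(void)
{
    pthread_mutex_lock(&lock);
    /*
     * Every woken waiter re-checks the flag. The first one to run
     * claims the resource; the others see the flag set again and
     * go back to sleep.
     */
    while (already_open)
        pthread_cond_wait(&wait_q, &lock);
    already_open = 1;
    pthread_mutex_unlock(&lock);
    return 0;
}

/* Claim the resource only if that needs no waiting. */
int exclusive_try_open(void)
{
    int ret = 0;

    pthread_mutex_lock(&lock);
    if (already_open)
        ret = -EAGAIN;
    else
        already_open = 1;
    pthread_mutex_unlock(&lock);
    return ret;
}

/* Release the resource and wake all waiters, not just one. */
void exclusive_close(void)
{
    pthread_mutex_lock(&lock);
    already_open = 0;
    pthread_mutex_unlock(&lock);
    pthread_cond_broadcast(&wait_q);
}
```

The while loop around the wait is the important part: being woken up is only a hint that the flag may have changed, so each waiter has to test it again before proceeding.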
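So we'll use tail -f to keep the file open in the background while trying to access it with another process (again in the background, so that we don't need to switch to a different vt, or virtual terminal). As soon as the first background process is killed with kill %1, the second one is woken up, is able to access the file, and finally terminates.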
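To make our life more interesting, module_close doesn't have a monopoly on waking up the processes that wait to access the file. A signal, such as Ctrl+c (SIGINT), can also wake up a process. In that case, we want to return with -EINTR immediately. This is important so that users can, for example, kill the process before it gets access to the file.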
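Example 9-1 tells a signal wake-up apart from an ordinary one by masking the pending signals against the blocked ones, word by word. The standalone sketch below shows just that test on plain arrays; has_deliverable_signal and its parameters are hypothetical names, and the arrays only stand in for the per-task bitmasks the module reads.

```c
#include <stddef.h>

/*
 * Return nonzero if any signal is pending and not blocked.
 * pending[] and blocked[] are hypothetical bitmask arrays of nwords
 * words each, standing in for current->pending.signal.sig and
 * current->blocked.sig in the module.
 */
int has_deliverable_signal(const unsigned long *pending,
                           const unsigned long *blocked,
                           size_t nwords)
{
    size_t i;

    for (i = 0; i < nwords; i++) {
        /* A set bit survives only if it is pending and not blocked. */
        if (pending[i] & ~blocked[i])
            return 1;
    }
    return 0;
}
```

Looping over every word, rather than hard-coding two checks, is what keeps the test independent of how many words the signal set occupies.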
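There is one more point to remember. Sometimes processes don't want to sleep; they want either to get what they asked for immediately or to be told that it cannot be done. Such processes use the O_NONBLOCK flag when opening the file. The kernel is supposed to respond by returning the error code -EAGAIN from operations that would otherwise block, such as opening the file in this example. The program cat_noblock, available in the source directory for this chapter, can be used to open a file with O_NONBLOCK.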
hostname:~/lkmpg-examples/09-BlockingProcesses# insmod sleep.ko
hostname:~/lkmpg-examples/09-BlockingProcesses# cat_noblock /proc/sleep
Last input:
hostname:~/lkmpg-examples/09-BlockingProcesses# tail -f /proc/sleep &
Last input:
Last input:
Last input:
Last input:
Last input:
Last input:
Last input:
tail: /proc/sleep: file truncated
[1] 6540
hostname:~/lkmpg-examples/09-BlockingProcesses# cat_noblock /proc/sleep
Open would block
hostname:~/lkmpg-examples/09-BlockingProcesses# kill %1
[1]+ Terminated tail -f /proc/sleep
hostname:~/lkmpg-examples/09-BlockingProcesses# cat_noblock /proc/sleep
Last input:
hostname:~/lkmpg-examples/09-BlockingProcesses#
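The "Open would block" line in the transcript above is the caller's view of -EAGAIN. A minimal caller-side sketch follows, assuming a POSIX system; open_no_wait and describe_open_result are hypothetical helper names. Note that errno is compared with ==, not assigned.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Try to open path read-only without waiting.
 * Returns a file descriptor (>= 0) on success, or a negative errno value.
 */
int open_no_wait(const char *path)
{
    int fd = open(path, O_RDONLY | O_NONBLOCK);

    if (fd == -1)
        return -errno;
    return fd;
}

/* Map the result of open_no_wait() to a short message. */
const char *describe_open_result(int result)
{
    if (result >= 0)
        return "opened";
    if (result == -EAGAIN)
        return "open would block";
    return "open failed";
}
```

A caller would pass the result of open_no_wait() to describe_open_result() and close the descriptor with close() when the result is non-negative.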
Example 9-1. sleep.c
/*
* sleep.c - create a /proc file, and if several processes try to open it at
* the same time, put all but one to sleep
*/
#include <linux/kernel.h> /* We're doing kernel work */
#include <linux/module.h> /* Specifically, a module */
#include <linux/proc_fs.h> /* Necessary because we use proc fs */
#include <linux/sched.h> /* For putting processes to sleep and
waking them up */
#include <asm/uaccess.h> /* for get_user and put_user */
/*
* The module's file functions
*/
/*
* Here we keep the last message received, to prove that we can process our
* input
*/
#define MESSAGE_LENGTH 80
static char Message[MESSAGE_LENGTH];
static struct proc_dir_entry *Our_Proc_File;
#define PROC_ENTRY_FILENAME "sleep"
/*
* Since we use the file operations struct, we can't use the special proc
* output provisions - we have to use a standard read function, which is this
* function
*/
static ssize_t module_output(struct file *file, /* see include/linux/fs.h */
char *buf, /* The buffer to put data to
(in the user segment) */
size_t len, /* The length of the buffer */
loff_t * offset)
{
static int finished = 0;
int i;
char message[MESSAGE_LENGTH + 30];
/*
* Return 0 to signify end of file - that we have nothing
* more to say at this point.
*/
if (finished) {
finished = 0;
return 0;
}
/*
* If you don't understand this by now, you're hopeless as a kernel
* programmer.
*/
sprintf(message, "Last input:%s\n", Message);
for (i = 0; i < len && message[i]; i++)
put_user(message[i], buf + i);
finished = 1;
return i; /* Return the number of bytes "read" */
}
/*
* This function receives input from the user when the user writes to the /proc
* file.
*/
static ssize_t module_input(struct file *file, /* The file itself */
const char *buf, /* The buffer with input */
size_t length, /* The buffer's length */
loff_t * offset)
{ /* offset to file - ignore */
int i;
/*
* Put the input into Message, where module_output will later be
* able to use it
*/
for (i = 0; i < MESSAGE_LENGTH - 1 && i < length; i++)
get_user(Message[i], buf + i);
/*
* we want a standard, zero terminated string
*/
Message[i] = '\0';
/*
* We need to return the number of input characters used
*/
return i;
}
/*
* 1 if the file is currently open by somebody
*/
int Already_Open = 0;
/*
* Queue of processes who want our file
*/
DECLARE_WAIT_QUEUE_HEAD(WaitQ);
/*
* Called when the /proc file is opened
*/
static int module_open(struct inode *inode, struct file *file)
{
/*
* If the file's flags include O_NONBLOCK, it means the process doesn't
* want to wait for the file. In this case, if the file is already
* open, we should fail with -EAGAIN, meaning "you'll have to try
* again", instead of blocking a process which would rather stay awake.
*/
if ((file->f_flags & O_NONBLOCK) && Already_Open)
return -EAGAIN;
/*
* This is the correct place for try_module_get(THIS_MODULE) because
* if a process is in the loop, which is within the kernel module,
* the kernel module must not be removed.
*/
try_module_get(THIS_MODULE);
/*
* If the file is already open, wait until it isn't
*/
while (Already_Open) {
int i, is_sig = 0;
/*
* This function puts the current process, including any system
* calls, such as us, to sleep. Execution will be resumed right
* after the function call, either because somebody called
* wake_up(&WaitQ) (only module_close does that, when the file
* is closed) or when a signal, such as Ctrl-C, is sent
* to the process
*/
wait_event_interruptible(WaitQ, !Already_Open);
/*
* If we woke up because we got a signal we're not blocking,
* return -EINTR (fail the system call). This allows processes
* to be killed or stopped.
*/
/*
* Emmanuel Papirakis:
*
* This is a little update to work with 2.2.*. Signals now are contained in
* two words (64 bits) and are stored in a structure that contains an array of
* two unsigned longs. We now have to make 2 checks in our if.
*
* Ori Pomerantz:
*
* Nobody promised me they'll never use more than 64 bits, or that this book
* won't be used for a version of Linux with a word size of 16 bits. This code
* would work in any case.
*/
for (i = 0; i < _NSIG_WORDS && !is_sig; i++)
is_sig =
current->pending.signal.sig[i] & ~current->
blocked.sig[i];
if (is_sig) {
/*
* It's important to put module_put(THIS_MODULE) here,
* because for processes where the open is interrupted
* there will never be a corresponding close. If we
* don't decrement the usage count here, we will be
* left with a positive usage count which we'll have no
* way to bring down to zero, giving us an immortal
* module, which can only be killed by rebooting
* the machine.
*/
module_put(THIS_MODULE);
return -EINTR;
}
}
/*
* If we got here, Already_Open must be zero
*/
/*
* Open the file
*/
Already_Open = 1;
return 0; /* Allow the access */
}
/*
* Called when the /proc file is closed
*/
int module_close(struct inode *inode, struct file *file)
{
/*
* Set Already_Open to zero, so one of the processes in the WaitQ will
* be able to set Already_Open back to one and to open the file. All
* the other processes will be called when Already_Open is back to one,
* so they'll go back to sleep.
*/
Already_Open = 0;
/*
* Wake up all the processes in WaitQ, so if anybody is waiting for the
* file, they can have it.
*/
wake_up(&WaitQ);
module_put(THIS_MODULE);
return 0; /* success */
}
/*
* This function decides whether to allow an operation (return zero) or not
* allow it (return a non-zero which indicates why it is not allowed).
*
* The operation can be one of the following values:
* 0 - Execute (run the "file" - meaningless in our case)
* 2 - Write (input to the kernel module)
* 4 - Read (output from the kernel module)
*
* This is the real function that checks file permissions. The permissions
* returned by ls -l are for reference only, and can be overridden here.
*/
static int module_permission(struct inode *inode, int op, struct nameidata *nd)
{
/*
* We allow everybody to read from our module, but only root (uid 0)
* may write to it
*/
if (op == 4 || (op == 2 && current->euid == 0))
return 0;
/*
* If it's anything else, access is denied
*/
return -EACCES;
}
/*
* Structures to register as the /proc file, with pointers to all the relevant
* functions.
*/
/*
* File operations for our proc file. This is where we place pointers to all
* the functions called when somebody tries to do something to our file. NULL
* means we don't want to deal with something.
*/
static struct file_operations File_Ops_4_Our_Proc_File = {
.read = module_output, /* "read" from the file */
.write = module_input, /* "write" to the file */
.open = module_open, /* called when the /proc file is opened */
.release = module_close, /* called when it's closed */
};
/*
* Inode operations for our proc file. We need it so we'll have somewhere to
* specify the file operations structure we want to use, and the function we
* use for permissions. It's also possible to specify functions to be called
* for anything else which could be done to an inode (although we don't bother,
* we just put NULL).
*/
static struct inode_operations Inode_Ops_4_Our_Proc_File = {
.permission = module_permission, /* check for permissions */
};
/*
* Module initialization and cleanup
*/
/*
* Initialize the module - register the proc file
*/
int init_module()
{
Our_Proc_File = create_proc_entry(PROC_ENTRY_FILENAME, 0644, NULL);
if (Our_Proc_File == NULL) {
remove_proc_entry(PROC_ENTRY_FILENAME, &proc_root);
printk(KERN_ALERT "Error: Could not initialize /proc/sleep\n");
return -ENOMEM;
}
Our_Proc_File->owner = THIS_MODULE;
Our_Proc_File->proc_iops = &Inode_Ops_4_Our_Proc_File;
Our_Proc_File->proc_fops = &File_Ops_4_Our_Proc_File;
Our_Proc_File->mode = S_IFREG | S_IRUGO | S_IWUSR;
Our_Proc_File->uid = 0;
Our_Proc_File->gid = 0;
Our_Proc_File->size = 80;
printk(KERN_INFO "/proc/sleep created\n");
return 0;
}
/*
* Cleanup - unregister our file from /proc. This could get dangerous if
* there are still processes waiting in WaitQ, because they are inside our
* open function, which will get unloaded. I'll explain how to avoid removal
* of a kernel module in such a case in chapter 10.
*/
void cleanup_module()
{
remove_proc_entry(PROC_ENTRY_FILENAME, &proc_root);
printk(KERN_INFO "/proc/sleep removed\n");
}
Example 9-2. cat_noblock.c
/*
 * cat_noblock.c - open a file and display its contents, but exit rather than
 * wait for input
 */
/* Copyright (C) 1998 by Ori Pomerantz */
#include <stdio.h>      /* standard I/O */
#include <fcntl.h>      /* for open */
#include <unistd.h>     /* for read */
#include <stdlib.h>     /* for exit */
#include <errno.h>      /* for errno */

#define MAX_BYTES (1024 * 4)

int main(int argc, char *argv[])
{
    int fd;                     /* The file descriptor for the file to read */
    ssize_t bytes;              /* The number of bytes read, or -1 on error */
    char buffer[MAX_BYTES];     /* The buffer for the bytes */

    /* Usage */
    if (argc != 2) {
        printf("Usage: %s <filename>\n", argv[0]);
        puts("Reads the content of a file, but doesn't wait for input");
        exit(-1);
    }

    /* Open the file for reading in non blocking mode */
    fd = open(argv[1], O_RDONLY | O_NONBLOCK);

    /* If open failed */
    if (fd == -1) {
        /* Compare errno; an assignment here would always be true */
        if (errno == EAGAIN)
            puts("Open would block");
        else
            puts("Open failed");
        exit(-1);
    }

    /* Read the file and output its contents */
    do {
        ssize_t i;

        /* Read characters from the file */
        bytes = read(fd, buffer, MAX_BYTES);

        /* If there's an error, report it and die */
        if (bytes == -1) {
            if (errno == EAGAIN)
                puts("Normally I'd block, but you told me not to");
            else
                puts("Another read error");
            exit(-1);
        }

        /* Print the characters */
        if (bytes > 0) {
            for (i = 0; i < bytes; i++)
                putchar(buffer[i]);
        }

        /* While there are no errors and the file isn't over */
    } while (bytes > 0);

    return 0;
}
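Chapter 10. Replacing Printks
10.1. Replacing printk
In Section 1.2.1.2, I said that X and kernel module programming don't mix. That's true for developing kernel modules, but in actual use you want to be able to send messages to whichever tty the command to load the module came from.
This is done by using current, a pointer to the currently running task, to get the current task's tty structure. We then look inside that tty structure to find a pointer to a string write function, which we use to write a string to the tty.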
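The idea of finding a write function through a pointer stored in a structure can be shown outside the kernel. The sketch below is a user-space analogy only: struct fake_tty, struct fake_tty_ops, buffer_ops and print_line are hypothetical names, and the "terminal" is just a memory buffer. It also sends a carriage return and line feed after the string, as the listing that follows does.

```c
#include <stddef.h>
#include <string.h>

struct fake_tty;

/* Hypothetical, much reduced analogue of a driver's operations table. */
struct fake_tty_ops {
    int (*write)(struct fake_tty *tty, const char *buf, size_t count);
};

/* Hypothetical terminal that collects its output in a buffer. */
struct fake_tty {
    const struct fake_tty_ops *ops;
    char out[128];
    size_t len;
};

/* Append up to count bytes to the terminal's buffer. */
static int buffer_write(struct fake_tty *tty, const char *buf, size_t count)
{
    size_t room = sizeof(tty->out) - 1 - tty->len;

    if (count > room)
        count = room;
    memcpy(tty->out + tty->len, buf, count);
    tty->len += count;
    tty->out[tty->len] = '\0';
    return (int)count;
}

const struct fake_tty_ops buffer_ops = { .write = buffer_write };

/* Write str followed by CR LF through whatever write function is set. */
void print_line(struct fake_tty *tty, const char *str)
{
    if (tty == NULL)
        return;     /* no terminal to print to */
    tty->ops->write(tty, str, strlen(str));
    tty->ops->write(tty, "\r\n", 2);
}
```

The caller never names buffer_write directly; it only follows the pointer in the structure, so a different kind of terminal could supply a different write function without print_line changing.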
Example 10-1. print_string.c
/*
* print_string.c - Send output to the tty we're running on, regardless if it's
* through X11, telnet, etc. We do this by printing the string to the tty
* associated with the current task.
*/
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/init.h>
#include <linux/sched.h> /* For current */
#include <linux/tty.h> /* For the tty declarations */
#include <linux/version.h> /* For LINUX_VERSION_CODE */
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Peter Jay Salzman");
static void print_string(char *str)
{
struct tty_struct *my_tty;
/*
* tty struct went into signal struct in 2.6.6
*/
#if ( LINUX_VERSION_CODE <= KERNEL_VERSION(2,6,5) )
/*
* The tty for the current task
*/
my_tty = current->tty;
#else
/*
* The tty for the current task, for 2.6.6+ kernels
*/
my_tty = current->signal->tty;
#endif
/*
* If my_tty is NULL, the current task has no tty you can print to
* (ie, if it's a daemon). If so, there's nothing we can do.
*/
if (my_tty != NULL) {
/*
* my_tty->driver is a struct which holds the tty's functions,
* one of which (write) is used to write strings to the tty.
* It can be used to take a string either from the user's or
* kernel's memory segment.
*
* The function's 1st parameter is the tty to write to,
* because the same function would normally be used for all
* tty's of a certain type. The 2nd parameter controls
* whether the function receives a string from kernel
* memory (false, 0) or from user memory (true, non zero).
* BTW: this param has been removed in Kernels > 2.6.9
* The (2nd) 3rd parameter is a pointer to a string.
* The (3rd) 4th parameter is the length of the string.
*
* As you will see below, sometimes it's necessary to use
* preprocessor stuff to create code that works for different
* kernel versions. The (naive) approach we've taken here
* does not scale well. The right way to deal with this
* is described in section 2 of
* linux/Documentation/SubmittingPatches
*/
((my_tty->driver)->write) (my_tty, /* The tty itself */
#if ( LINUX_VERSION_CODE <= KERNEL_VERSION(2,6,9) )
0, /* Don't take the string
from user space */
#endif
str, /* String */
strlen(str)); /* Length */
/*
* ttys were originally hardware devices, which (usually)
* strictly followed the ASCII standard. In ASCII, to move to
* a new line you need two characters, a carriage return and a
* line feed. On Unix, the ASCII line feed is used for both
* purposes - so we can't just use \n, because it wouldn't have
* a carriage return and the next line will start at the
* column right after the line feed.
*
* This is why text files are different between Unix and
* MS Windows. In CP/M and derivatives, like MS-DOS and
* MS Windows, the ASCII standard was strictly adhered to,
* and therefore a newline requires both a LF and a CR.
*/
#if ( LINUX_VERSION_CODE <= KERNEL_VERSION(2,6,9) )
((my_tty->driver)->write) (my_tty, 0, "\015\012", 2);
#else
((my_tty->driver)->write) (my_tty, "\015\012", 2);
#endif
}
}
static int __init print_string_init(void)
{
print_string("The module has been inserted. Hello world!");
return 0;
}
static void __exit print_string_exit(void)
{
print_string("The module has been removed. Farewell world!");
}
module_init(print_string_init);
module_exit(print_string_exit);
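10.2. Flashing keyboard LEDs
In certain situations, you may want a simpler and more direct way to communicate with the outside world. Flashing the keyboard LEDs can be such a solution: it is an immediate way to attract attention or to display a status condition. Keyboard LEDs are present on all hardware, they are always visible, they need no setup, and using them is simple and unobtrusive compared to writing to a tty or a file.
The following source code shows a minimal kernel module which, once loaded, blinks the keyboard LEDs until it is unloaded.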
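The blinking itself comes down to alternating between two values on every timer tick. A minimal sketch of that toggle, separated from the timer and the ioctl call, might look like this; next_led_state is a hypothetical helper name, and the two constants are the ones the listing defines.

```c
#define ALL_LEDS_ON  0x07   /* value the listing uses to light all LEDs */
#define RESTORE_LEDS 0xFF   /* value the listing uses to restore them */

/* Return the value to send on the next tick, given the previous one. */
int next_led_state(int state)
{
    return (state == ALL_LEDS_ON) ? RESTORE_LEDS : ALL_LEDS_ON;
}
```

Starting from 0, the sequence is 0x07, 0xFF, 0x07, and so on, so the first tick after loading lights the LEDs.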
Example 10-2. kbleds.c
/*
* kbleds.c - Blink keyboard leds until the module is unloaded.
*/
#include <linux/module.h>
#include <linux/config.h>
#include <linux/init.h>
#include <linux/tty.h> /* For fg_console, MAX_NR_CONSOLES */
#include <linux/kd.h> /* For KDSETLED */
#include <linux/vt.h>
#include <linux/console_struct.h> /* For vc_cons */
MODULE_DESCRIPTION("Example module illustrating the use of Keyboard LEDs.");
MODULE_AUTHOR("Daniele Paolo Scarpazza");
MODULE_LICENSE("GPL");
struct timer_list my_timer;
struct tty_driver *my_driver;
int kbledstatus = 0;	/* int, because my_timer_func accesses it through an int pointer */
#define BLINK_DELAY HZ/5
#define ALL_LEDS_ON 0x07
#define RESTORE_LEDS 0xFF
/*
* Function my_timer_func blinks the keyboard LEDs periodically by invoking
* command KDSETLED of ioctl() on the keyboard driver. To learn more on virtual
* terminal ioctl operations, please see file:
* /usr/src/linux/drivers/char/vt_ioctl.c, function vt_ioctl().
*
* The argument to KDSETLED is alternatively set to 7 (thus causing the led
* mode to be set to LED_SHOW_IOCTL, and all the leds are lit) and to 0xFF
* (any value above 7 switches back the led mode to LED_SHOW_FLAGS, thus
* the LEDs reflect the actual keyboard status). To learn more on this,
* please see file:
* /usr/src/linux/drivers/char/keyboard.c, function setledstate().
*
*/
static void my_timer_func(unsigned long ptr)
{
int *pstatus = (int *)ptr;
if (*pstatus == ALL_LEDS_ON)
*pstatus = RESTORE_LEDS;
else
*pstatus = ALL_LEDS_ON;
(my_driver->ioctl) (vc_cons[fg_console].d->vc_tty, NULL, KDSETLED,
*pstatus);
my_timer.expires = jiffies + BLINK_DELAY;
add_timer(&my_timer);
}
static int __init kbleds_init(void)
{
int i;
printk(KERN_INFO "kbleds: loading\n");
printk(KERN_INFO "kbleds: fgconsole is %x\n", fg_console);
for (i = 0; i < MAX_NR_CONSOLES; i++) {
if (!vc_cons[i].d)
break;
printk(KERN_INFO "kbleds: console[%i/%i] #%i, tty %lx\n", i,
MAX_NR_CONSOLES, vc_cons[i].d->vc_num,
(unsigned long)vc_cons[i].d->vc_tty);
}
printk(KERN_INFO "kbleds: finished scanning consoles\n");
my_driver = vc_cons[fg_console].d->vc_tty->driver;
printk(KERN_INFO "kbleds: tty driver magic %x\n", my_driver->magic);
/*
* Set up the LED blink timer the first time
*/
init_timer(&my_timer);
my_timer.function = my_timer_func;
my_timer.data = (unsigned long)&kbledstatus;
my_timer.expires = jiffies + BLINK_DELAY;
add_timer(&my_timer);
return 0;
}
static void __exit kbleds_cleanup(void)
{
printk(KERN_INFO "kbleds: unloading...\n");
del_timer(&my_timer);
(my_driver->ioctl) (vc_cons[fg_console].d->vc_tty, NULL, KDSETLED,
RESTORE_LEDS);
}
module_init(kbleds_init);
module_exit(kbleds_cleanup);
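If none of the examples in this chapter fit your debugging needs, there may be other tricks to try. Ever wondered what CONFIG_LL_DEBUG in make menuconfig is good for? If you activate it, you get low-level access to the serial port. While this may not sound very powerful by itself, you can patch kernel/printk.c or any other essential syscall to use printascii, making it possible to trace virtually everything your code does over a serial line. If you find yourself porting the kernel to a new and previously unsupported architecture, this is usually among the first things that should be implemented. Logging over netconsole might also be worth a try.
While you have seen many things here that can aid debugging, there are some points to be aware of. Debugging is almost always intrusive. Adding debug code can change the situation enough to make the bug seem to disappear. You should therefore keep debug code to a minimum and make sure it does not show up in production code.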
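One common way to keep debug output out of production builds is to hide it behind a macro that compiles to nothing unless a debug symbol is defined. The following is a user-space sketch under that assumption; dbg_printf and sum_to are hypothetical names, and DEBUG is assumed to be supplied at build time (for example with -DDEBUG).

```c
#include <stdio.h>

/* With DEBUG defined, trace to stderr; otherwise compile to nothing. */
#ifdef DEBUG
#define dbg_printf(...) fprintf(stderr, __VA_ARGS__)
#else
#define dbg_printf(...) ((void)0)
#endif

/* Sum the integers 1..n, tracing each step in debug builds only. */
int sum_to(int n)
{
    int i, total = 0;

    for (i = 1; i <= n; i++) {
        total += i;
        dbg_printf("i=%d total=%d\n", i, total);
    }
    return total;
}
```

Because the arguments are not evaluated when the macro is empty, they should not have side effects the rest of the code relies on; otherwise the debug and production builds would behave differently, which is exactly the intrusiveness warned about above.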