====================================================================
IPC ---- Signals
Question:
Why is it unsafe to call printf() in a signal handler?
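A: A signal handler can be interrupted partway through and entered again, for example when another signal arrives while it is running. It can also interrupt the main program in the middle of the very function that the handler then calls. So a signal handler must be re-entrant, not just recursive. printf() is not re-entrant: it updates shared stdio buffers and may take internal locks, so calling it from a handler can corrupt output or deadlock. In general, any function that is not re-entrant (not async-signal-safe) is unsafe to call inside a signal handler.
For the functions that are safe to call inside a signal handler, refer to page 524 in Book "Linux Programming".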
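A common way to stay within these limits is to let the handler do almost nothing: set a flag of type volatile sig_atomic_t and, if output is really needed, use write(), which is async-signal-safe. The following is only a minimal sketch of that pattern; the names got_usr1, on_usr1 and install_usr1_handler are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Set by the handler, read by the main flow (hypothetical name). */
static volatile sig_atomic_t got_usr1 = 0;

/* Handler restricted to async-signal-safe operations. */
static void on_usr1(int sig)
{
    static const char msg[] = "SIGUSR1 received\n";
    int saved_errno = errno; /* write() may overwrite errno */
    ssize_t n;

    (void)sig;
    got_usr1 = 1;
    /* write() is async-signal-safe; printf() is not */
    n = write(STDERR_FILENO, msg, sizeof(msg) - 1);
    (void)n;
    errno = saved_errno;
}

/* Install on_usr1 for SIGUSR1; returns 0 on success, -1 on error. */
static int install_usr1_handler(void)
{
    struct sigaction act;

    memset(&act, 0, sizeof(act));
    act.sa_handler = on_usr1;
    sigemptyset(&act.sa_mask);
    return sigaction(SIGUSR1, &act, NULL);
}
```

The main flow can then test got_usr1 (for example after pause() returns) and call printf() from ordinary, non-handler context.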
Related system calls:
signal (discouraged: its behavior varies across systems)
kill
pause
sigaction (recommended)
sigemptyset
sigdelset
sigfillset
sigismember
sigpending
sigsuspend
Note:
When programming with signals, we should manipulate the signal set carefully.
sigaction offers additional control through a signal set (sigset_t), e.g. masking out other signals while the handler runs.
That's why sigaction is more robust than signal.
struct sigaction {
void (*sa_handler)(int);
void (*sa_sigaction)(int, siginfo_t *, void *);
sigset_t sa_mask;
int sa_flags;
void (*sa_restorer)(void);
};
The field sa_mask specifies the signals to block while the handler runs, in addition to the signal being handled.
The field sa_flags is used to modify the signal behavior. For example, many system calls are interruptible: if a signal is handled while such a call is blocked, the call fails and sets errno to EINTR. If sa_flags includes SA_RESTART, most such calls are restarted automatically instead of failing with EINTR.
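The effect of sa_flags can be observed with a blocking read(). The sketch below installs a SIGALRM handler without SA_RESTART and then reads from a pipe that never receives data, so the read() is expected to fail with EINTR after about one second. The helper names install_handler, noop_handler and errno_after_interrupted_read are hypothetical, and using alarm() to deliver the signal is just one convenient choice.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static void noop_handler(int sig)
{
    (void)sig; /* nothing to do: being handled is enough to interrupt read() */
}

/* Install handler for signo; a non-zero restart requests SA_RESTART. */
static int install_handler(int signo, void (*handler)(int), int restart)
{
    struct sigaction act;

    memset(&act, 0, sizeof(act));
    act.sa_handler = handler;
    sigemptyset(&act.sa_mask);
    act.sa_flags = restart ? SA_RESTART : 0;
    return sigaction(signo, &act, NULL);
}

/*
 * Returns the errno left by a read() interrupted by SIGALRM,
 * 0 if read() was not interrupted, -1 if the setup failed.
 */
static int errno_after_interrupted_read(void)
{
    int fds[2];
    char byte;
    int err = 0;

    if (pipe(fds) == -1)
        return -1;
    if (install_handler(SIGALRM, noop_handler, 0) == -1)
    {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    alarm(1);                          /* SIGALRM in about one second */
    if (read(fds[0], &byte, 1) == -1)  /* blocks: nothing is ever written */
        err = errno;
    alarm(0);                          /* cancel the alarm if still armed */
    close(fds[0]);
    close(fds[1]);
    return err;
}
```

If the handler were installed with restart set to 1, the same read() would be restarted after the handler returns and would keep blocking until data arrives.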
Analysis:
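1. Signals are a very basic and relatively weak IPC mechanism. The weakness shows mainly in the following respects.
First, if processes A and B want to communicate, the only information they can pass is a number, signum. So the amount of information is very small.
Second, the set of signum values is limited, and some of them have predefined behavior. If you change these predefined behaviors, the program becomes hard to understand. We can learn more about this from the NPTL design document. Implementing threads as LWPs made signal-based communication between them very inconvenient and left the whole signal system fragmented, which is why NPTL support was introduced later.
Third, using signals between A and B couples the two processes very tightly. To begin with, A and B must first agree on a common behavior for each SIGxxx. In addition, if A wants to send a signal to B, it must know B's PID, which is somewhat troublesome if A and B are not parent and child. The standard C library has no function like get_pid_by_name(), so you must implement it yourself. A simple approach is to work through the system's /proc directory.
Fourth, both obtaining the pid and maintaining the signal communication itself carry extra overhead, which makes programming harder.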
Example:
#include <unistd.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>

void handler_usr1(int sig)
{
    (void)sig;
    printf("[%s] is running ... \n", __func__); /* unsafe, should not be called in a practical program */
    sleep(10);
    printf("[%s] exited \n", __func__);
}

int main(void)
{
    struct sigaction act;

    memset(&act, 0, sizeof(act)); /* clear every field before use */
    act.sa_handler = handler_usr1;
    sigfillset(&act.sa_mask); /* block all signals while the handler runs */
    act.sa_flags = 0;
    if (sigaction(SIGUSR1, &act, NULL) == -1)
    {
        perror("sigaction");
        return 1;
    }
    while (1)
    {
        printf("[%s] is running ... \n", __func__);
        sleep(2);
    }
}
To get a pid from a process name, we can use /proc, as follows:
/proc/23697 $ cat status
Name: sigaction_test
State: S (sleeping)
Tgid: 23697
Pid: 23697
PPid: 2309
TracerPid: 0
Uid: 1000 1000 1000 1000
Gid: 1000 1000 1000 1000
FDSize: 256
Groups: 4 20 24 46 105 119 122 1000
The PID can then be obtained with string operations.
#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <dirent.h>
#include <signal.h>
#define MAX_INFO_LEN 4096
/**
* function: get_pid_by_name
* @return: -1 if failed, pid of process[name] if succeeded
**/
pid_t get_pid_by_name(const char *name)
{
    DIR *dir;
    struct dirent *ent;
    char *endptr;
    char buf[MAX_INFO_LEN];

    if (!(dir = opendir("/proc")))
    {
        perror("can not open /proc");
        exit(EXIT_FAILURE);
    }
    while ((ent = readdir(dir)) != NULL)
    {
        long lpid = strtol(ent->d_name, &endptr, 10);
        if (endptr == ent->d_name || *endptr != '\0') /* not a proc dir indicating a running process */
        {
            continue;
        }
        snprintf(buf, sizeof(buf), "/proc/%ld/status", lpid);
        FILE *fp = fopen(buf, "r");
        if (!fp) /* the process may have exited since readdir() */
        {
            continue;
        }
        /* the first line looks like "Name:\t<program name>\n";
           the kernel truncates the name to 15 characters */
        if (fgets(buf, sizeof(buf), fp) != NULL)
        {
            char *first = strtok(buf, "\t");   /* "Name:" */
            char *second = strtok(NULL, "\n"); /* the program name */
            if (first && second && !strcmp(second, name))
            {
                fclose(fp);
                closedir(dir);
                return (pid_t)lpid;
            }
        }
        fclose(fp);
    } /* while */
    /* no corresponding process in /proc */
    closedir(dir);
    return -1;
}
int main(void)
{
    const char *name = "processB";
    pid_t pid = get_pid_by_name(name);

    if (pid == -1)
    {
        printf("no process found with name [%s] \n", name);
        return EXIT_FAILURE;
    }
    printf("sending [%s] SIGUSR1 \n", name);
    if (kill(pid, SIGUSR1) == -1)
    {
        perror("kill");
        return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
}
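Because the target process may exit between the /proc lookup and the kill() call, it can be useful to probe a PID first. A minimal sketch, assuming the usual convention that kill() with signal number 0 performs only the existence and permission checks without sending anything; the helper name process_exists is hypothetical. The probe narrows the race but does not remove it.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Returns 1 if a process with this pid exists, 0 if it does not,
 * -1 for an invalid pid or an unexpected error.
 */
static int process_exists(pid_t pid)
{
    if (pid <= 0)          /* 0 and negative values address process groups */
        return -1;
    if (kill(pid, 0) == 0) /* signal 0: checks only, nothing is sent */
        return 1;
    if (errno == EPERM)    /* exists, but we may not signal it */
        return 1;
    if (errno == ESRCH)    /* no such process */
        return 0;
    return -1;
}
```

For example, process_exists(getpid()) returns 1 for the calling process.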