Reentrancy, Thread Safety, and Async-Signal-Safe

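Shou-ge says you only learn something when you run into a problem.

There is a second half he left unsaid: provided that you calmly dig into the problem instead of getting frustrated and avoiding it.

My program had a problem. It used a file as a mutex around shared memory and sent a signal whenever data was written, but the GUI, written in Qt, froze when it received the signal. Reading the sigqueue man page, I came across the term async-signal-safe, searched for it, and found this article: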


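    First, reentrancy and thread safety are not equivalent concepts. A function may be reentrant, may be thread-safe, may be both, or may be neither (strictly speaking this description has a loophole; see the second point).

    Second, in terms of sets and logic, reentrant functions are a subset of thread-safe functions: reentrancy is a sufficient but not necessary condition for thread safety. A reentrant function is necessarily thread-safe, but the converse does not hold.

    Third, the POSIX definitions of reentrancy and thread safety: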
   
    Reentrant Function:

    A function whose effect, when called by two or more threads, is guaranteed to be as if the threads each executed the function one after another in an undefined order, even if the actual execution is interleaved.

    From IEEE Std 1003.1-2001 (POSIX 1003.1), Base Definitions, Issue 6

    Thread-Safe Function
     
    A function that may be safely invoked concurrently by multiple threads.

    There is also the concept of Async-Signal-Safe:

    Async-Signal-Safe Function:

    A function that may be invoked, without restriction, from signal-catching functions. No function is async-signal-safe unless explicitly described as such.

    The relationship among the three is:

    A Reentrant Function is necessarily both a Thread-Safe Function and an Async-Signal-Safe Function.
 
  
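    The difference between reentrancy and thread safety shows up in whether a function can be called from a signal handler. A reentrant function can be called safely from a signal handler, so it is also an Async-Signal-Safe Function. A thread-safe function is not guaranteed to be safely callable from a signal handler. If some mechanism, such as setting the blocked-signal set, guarantees that a non-reentrant function is never interrupted by a signal, then that function is also an Async-Signal-Safe Function.

    It is worth noting that the System Interfaces of POSIX 1003.1 are thread-safe by default but not async-signal-safe. Async-signal-safe functions have to be identified explicitly, for example fork() and signal().

    Finally, let us construct a function that is thread-safe but not reentrant:

    Suppose the function func() needs to access some shared resource while it runs. To be thread-safe, it takes a lock before using the resource and releases the lock once the resource is no longer needed.

    Suppose that during one execution, after the function has acquired the resource lock, an asynchronous signal arrives and control passes to the corresponding signal handler. Suppose further that the signal handler also needs to call func(). In that invocation func() again tries to acquire the resource lock before touching the shared resource, but the earlier func() instance already holds it, so the signal handler blocks. Meanwhile, the thread that was interrupted by the signal cannot resume until the signal handler finishes, so it never gets the chance to release the resource. The result is a deadlock between the thread and the signal handler.

    Therefore, although func() achieves thread safety by locking, it is non-reentrant because its body accesses a shared resource.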
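The scenario above can be sketched in C. This is a minimal illustration under stated assumptions, not the original author's code: a plain flag (lock_held) stands in for the resource lock so that the re-entry can be detected and reported instead of hanging the program, and the interrupt_inside parameter and the names on_usr1 and demo_reentry are hypothetical. With a real blocking lock, the point where func() returns -1 is where the handler would wait forever.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t lock_held = 0;       /* stand-in for the resource lock */
static volatile sig_atomic_t shared_value = 0;    /* the shared resource */
static volatile sig_atomic_t handler_blocked = 0; /* set when the handler finds the lock taken */

/* Returns 0 on success, or -1 at the point where a real blocking lock
   would wait forever. interrupt_inside is a hypothetical switch that
   forces the signal to arrive inside the critical section. */
int func(int interrupt_inside) {
    if (lock_held)
        return -1;
    lock_held = 1;                 /* "lock" */
    if (interrupt_inside)
        raise(SIGUSR1);            /* the handler runs here, before raise() returns */
    shared_value++;
    lock_held = 0;                 /* "unlock" */
    return 0;
}

static void on_usr1(int signo) {
    (void)signo;
    if (func(0) == -1)             /* re-enters func() while the lock is held */
        handler_blocked = 1;
}

/* Returns 1 if the handler found the lock already held, -1 on setup error. */
int demo_reentry(void) {
    struct sigaction sa;
    int rc;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) == -1)
        return -1;
    rc = func(1);
    assert(rc == 0);               /* the interrupted call itself completes */
    return handler_blocked;
}
```

Calling demo_reentry() from main() in a single-threaded program reports that the handler ran into the held lock. The flag is only a teaching device: its check-then-set is not atomic, so it would not protect the resource between real threads.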


Comments

# manio  posted on 2008-08-02 22:08:28  IP: 10.62.79.*
"Reentrancy is a sufficient but not necessary condition for thread safety"

class Counter
{
public:
    Counter() { n = 0; }

    void increment() { ++n; }
    void decrement() { --n; }
    int value() const { return n; }

private:
    int n;
};
This is an example that is reentrant but not thread-safe.

Most Qt classes are reentrant and not thread-safe, to avoid the overhead of repeatedly locking and unlocking a QMutex. For example, QString is reentrant, meaning that you can use it in different threads, but you can't access the same QString object from different threads simultaneously (unless you protect it with a mutex yourself). A few classes and functions are thread-safe; these are mainly thread-related classes such as QMutex, or fundamental functions such as QCoreApplication::postEvent().

=================================================

Let's start with the definitions. POSIX defines the terms as follows:

Reentrant Function

A function whose effect, when called by two or more threads, is guaranteed to be as if the threads each executed the function one after another in an undefined order, even if the actual execution is interleaved.

Thread-Safe

A function that may be safely invoked concurrently by multiple threads. Each function defined in the System Interfaces volume of IEEE Std 1003.1-2001 is thread-safe unless explicitly stated otherwise.

Async-Signal-Safe Function

A function that may be invoked, without restriction, from signal-catching functions. No function is async-signal-safe unless explicitly described as such.

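We all know what reentrancy is. As the name suggests, it means a function can be entered again; going a step further, it means that with the same input, every call to the function is certain to return the same result. That is reentrancy. Wikipedia has a more rigorous definition: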

* Must hold no static (global) non-constant data.
* Must not return the address to static (global) non-constant data.
* Must work only on the data provided to it by the caller.
* Must not rely on locks to singleton resources.
* Must not call non-reentrant computer programs or routines.
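As a small illustration of the first two rules, here is a sketch of a function that keeps its result in a static buffer next to a variant that works only on caller-provided data. The names shout_static, shout_r, upper_ascii and check_shout are made up for this example and do not come from any library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* ASCII-only upper-casing, so the example depends on no global locale state. */
static char upper_ascii(char c) {
    return (c >= 'a' && c <= 'z') ? (char)(c - 'a' + 'A') : c;
}

/* Not reentrant: holds static non-constant data and returns its address. */
const char *shout_static(const char *s) {
    static char buf[32];
    size_t i = 0;
    for (; s[i] != '\0' && i + 1 < sizeof buf; i++)
        buf[i] = upper_ascii(s[i]);
    buf[i] = '\0';
    return buf;
}

/* Reentrant: works only on the data provided by the caller. */
char *shout_r(const char *s, char *out, size_t cap) {
    size_t i = 0;
    for (; s[i] != '\0' && i + 1 < cap; i++)
        out[i] = upper_ascii(s[i]);
    out[i] = '\0';
    return out;
}

/* Usage check: the static version loses the first result, the _r version does not. */
void check_shout(void) {
    char a[8], b[8];
    const char *first = shout_static("abc");
    const char *second = shout_static("xyz");

    assert(first == second);               /* both point at the same buffer */
    assert(strcmp(first, "XYZ") == 0);     /* "ABC" has been overwritten */

    shout_r("abc", a, sizeof a);
    shout_r("xyz", b, sizeof b);
    assert(strcmp(a, "ABC") == 0 && strcmp(b, "XYZ") == 0);
}
```

The _r suffix follows the convention of strtok_r(3): the state moves out of the function and into storage owned by the caller.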

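Next is thread safety. By definition, it only requires that a function can be safely executed concurrently by multiple threads. This is a comparatively weak requirement: the function may access global or static variables internally as long as it takes a lock, which means that, so long as things stay under the threads' control, it does not matter if each call returns a different result. From this we can see that a reentrant function is necessarily thread-safe, while the converse need not hold. Wikipedia also says: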

Therefore, reentrancy is a more fundamental property than thread-safety and by definition, leads to thread-safety: Every reentrant function is thread-safe, however, not every thread-safe function is reentrant.

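There are many examples. The best known is strtok(3), which is where many of us first met the concept of reentrancy: it uses a static variable internally, so it is clearly not reentrant (its reentrant version is strtok_r(3)). Next would be malloc(3); heh, that one is fairly obvious too, so I won't say more. However, strtok(3) is not thread-safe, whereas malloc(3) is.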
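A short sketch of why the caller-held state of strtok_r(3) matters: two tokenizations can be nested, which strtok(3) cannot do because its single static cursor would be overwritten by the inner loop. The function name count_fields and the record format are made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>

/* Counts the comma-separated fields across all ';'-separated records.
   The input buffer is modified in place, as with any strtok-style API. */
int count_fields(char *text) {
    char *outer_state;   /* cursor of the record loop */
    char *inner_state;   /* cursor of the field loop */
    int fields = 0;

    assert(text != NULL);
    for (char *record = strtok_r(text, ";", &outer_state);
         record != NULL;
         record = strtok_r(NULL, ";", &outer_state)) {
        for (char *field = strtok_r(record, ",", &inner_state);
             field != NULL;
             field = strtok_r(NULL, ",", &inner_state)) {
            fields++;
        }
    }
    return fields;
}
```

For a writable buffer holding "a,b;c,d,e;f" this returns 6. Each loop keeps its own cursor, so neither disturbs the other.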

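There is one more concept that we do not run into often: async-signal safety. It is actually quite simple: a function is async-signal-safe if it can be called safely from a signal handler. It looks similar to thread safety, but it is not. Signals are generated asynchronously, but a signal handler runs by interrupting the main function (main relative to the handler), and control returns to the main function once the handler finishes. In other words, the two are not concurrent!

A function that accesses a global variable is not reentrant. We can make it thread-safe by adding a lock, but that does not make it async-signal-safe. In fact it is almost certain that anything that uses a lock is not signal-safe (unless the signal is blocked, obviously): the signal can arrive after the lock is taken and before it is released, and a deadlock is then very likely... There is a good example here. So a reentrant function is also necessarily async-signal-safe, while the converse need not hold. See also a good article on IBM.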
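The exception mentioned above, blocking the signal, can be sketched with sigprocmask(2). This is a minimal single-threaded illustration; the names on_usr1 and run_masked_section are made up, raise() stands in for a signal arriving from outside, and a multithreaded program would use pthread_sigmask() instead.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t handler_runs = 0;

static void on_usr1(int signo) {
    (void)signo;
    handler_runs++;
}

/* Returns how many times the handler ran inside the critical section
   (expected: 0), or -1 on setup error. */
int run_masked_section(void) {
    struct sigaction sa;
    sigset_t block, previous;
    int runs_before, runs_inside;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) == -1)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, &previous) == -1)
        return -1;

    /* Critical section: a lock could be held here without risk of the
       handler re-entering, because SIGUSR1 only becomes pending. */
    runs_before = handler_runs;
    raise(SIGUSR1);
    runs_inside = handler_runs - runs_before;

    /* Restoring the old mask delivers the pending signal before returning. */
    sigprocmask(SIG_SETMASK, &previous, NULL);
    assert(handler_runs == runs_before + 1);
    return runs_inside;
}
```

The signal is not lost, only postponed until the mask is restored, so the handler still runs exactly once.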

For a list of async-signal-safe functions, see man 7 signal; for a list of thread-safe functions, see APUE section 12.5; for a list of reentrant functions, see APUE section 10.6.

 

==========================================================================

How a signal handler can cause a deadlock, explained very clearly:

Signals are delivered asynchronously, much like interrupts, and so there are a great many restrictions placed on the code which runs. Many of these restrictions are much like ThreadSafety, in that you have to account for the fact that your signal handler could run at any moment while other code is running, causing weird problems.

Here is an example of code which is safe:

 

#include <signal.h>
#include <stdio.h>
#include <unistd.h>

volatile int x = 0;

void handler(int signal) {
    x++;
}

int main(void) {
    signal(SIGHUP, handler);
    while(1) {
        sleep(1);
        printf("x = %d\n", x);
    }
}

Just like with threads, if the signal handler is writing to a primitive and the main function is only reading, everything is safe. Well, as long as no one else sends a SIGHUP: if that were to happen, then the above example has the potential of a race condition.

Now for a broken example:

 

#include <signal.h>
#include <stdio.h>
#include <unistd.h>

volatile int x = 0;

void handler(int signal) {
    x++;
}

int main(void) {
    signal(SIGHUP, handler);
    while(1) {
        sleep(1);
        x++;    /* races with the x++ in the handler */
        printf("x = %d\n", x);
    }
}

This is broken for the same reason that it would be broken with two threads. The operation x++ is not necessarily atomic, but can be divided into several pieces:

 

read x from memory into a register
increment the register
write x back into memory

If the signal handler is invoked while the main function is in the middle of this update process, an increment will be lost. Worse, the read and write of x is not guaranteed to be atomic (for that we would need to use sig_atomic_t). Let's try to fix this by adding a simple lock. We'll assume that we have a TestAndSet function which atomically sets a variable to 1 and returns its old value. Then we write our new program like this:

 

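#include <signal.h>
#include <stdio.h>
#include <unistd.h>

/* Assumed primitive, not defined here: atomically sets *p to 1
   and returns its old value. */
int TestAndSet(volatile sig_atomic_t *p);

volatile sig_atomic_t lock = 0;
volatile int x = 0;

void handler(int signal) {
    while(TestAndSet(&lock)) ;    /* spin until the lock is free */
    x++;
    lock = 0;
}

int main(void) {
    signal(SIGHUP, handler);
    while(1) {
        sleep(1);

        while(TestAndSet(&lock)) ;
        x++;
        lock = 0;

        printf("x = %d\n", x);
    }
}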
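The TestAndSet primitive assumed above could be sketched with the C11 atomic_flag type. This is one possible shape rather than the article's own definition: it changes the lock's type from sig_atomic_t to atomic_flag, and the names Unlock and check_test_and_set are made up for the example. It only supplies the primitive and does nothing about the problem with the program above.

```c
#include <assert.h>
#include <stdatomic.h>

/* Atomically sets the flag and returns its previous value (0 or 1). */
int TestAndSet(atomic_flag *flag) {
    return atomic_flag_test_and_set(flag) ? 1 : 0;
}

/* Releases a lock taken by a TestAndSet() call that returned 0. */
void Unlock(atomic_flag *flag) {
    atomic_flag_clear(flag);
}

/* Usage check: the first caller gets the lock, later callers see it taken. */
void check_test_and_set(void) {
    atomic_flag lock = ATOMIC_FLAG_INIT;

    assert(TestAndSet(&lock) == 0);   /* was free, now held */
    assert(TestAndSet(&lock) == 1);   /* already held */
    Unlock(&lock);
    assert(TestAndSet(&lock) == 0);   /* free again after Unlock() */
}
```

atomic_flag is the one type the C11 standard guarantees to be lock-free, which is why it is a reasonable candidate for code that may run in a signal handler.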

If we were working with threads, then everything would work as expected. But with signals, this not only fails to solve the problem, but it actually makes it worse. Why?

Threads run more or less simultaneously. On a multi-processor system they might really run simultaneously, but even on a single-processor system, the OS makes sure that every thread gets a chance to run. So while a thread might get stuck in the while loop for a while, eventually the other thread will get a chance to run, and it will unlock the lock.

Signals, however, don't run simultaneously. While the signal handler is running, the main program is completely stopped. If the handler is invoked while the main program has locked the lock, the handler will spin forever waiting for the lock to be unlocked, while at the same time the main program is stuck waiting for the handler to end. Deadlock! If this situation ever happens, your program will completely freeze. So now we see that signal-safe code is even more restricted than thread-safe code.

How do we fix it? For our example program, we can fix it by adding an auxiliary variable, like so:

 

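#include <signal.h>
#include <stdio.h>
#include <unistd.h>

volatile sig_atomic_t lock = 0;
volatile int x = 0;
volatile int y = 0;

void handler(int signal) {
    if(!lock)
        x++;    /* main is not touching x right now */
    else
        y++;    /* main owns x, so count in y instead */
}

int main(void) {
    int temp;
    signal(SIGHUP, handler);
    while(1) {
        sleep(1);

        lock = 1;
        x++;
        lock = 0;

        /* lock is 0 here, so the handler cannot modify y */
        temp = y;
        y = 0;

        lock = 1;
        x += temp;
        lock = 0;

        printf("x = %d\n", x);
    }
}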

Here, we have a sort of asymmetric lock. Instead of waiting for the lock to be free, the signal handler uses it to decide which variable to increment. The main program then manipulates the lock to ensure that it can always reliably read or modify the variable it's interested in while it performs the sum of the two.

But you say, this counter is nonsense. I just want to print out a notice that my signal was received, nothing more. I'll just write this code:

 

void handler(int signal) {
    printf("got signal %d\n", signal);
}

This code is fine, right? None of this nonsense with locks or counters or anything. No! It's not safe because you're calling printf(), and who knows what it does inside. In fact printf() probably does some locking internally on the file stream, so if the signal is delivered while your program is in the middle of another call to printf(), kaboom, deadlock!

What do we do? We have to do everything manually using functions which we know to be safe. Here we take advantage of the fact that write() is safe:

 

#include <unistd.h>

void handler(int signal) {
    char text[] = "got signal 00\n";
    text[11] += (signal / 10) % 10;    /* tens digit */
    text[12] += signal % 10;           /* ones digit */
    write(STDOUT_FILENO, text, 14);    /* 14 bytes, without the trailing NUL */
}

The key word is "async-signal safe". If you see a function documented as being "thread safe", you know that you can call it simultaneously from multiple threads. If you see a function documented as being "async-signal safe", then you know that you can call it from a signal handler without blowing up your program. A fairly complete list of signal safe functions can be found in man sigaction.
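The two-digit trick in the handler above can be generalized a little. The following is a sketch, not part of the original article: format_signal_message, report_signal and check_format are hypothetical names, and the formatter avoids stdio and even the string library so that the handler itself calls nothing but write().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Builds "got signal N\n" in buf using only plain loops.
   Returns the number of bytes written, or 0 if buf is too small. */
size_t format_signal_message(int signo, char *buf, size_t cap) {
    static const char prefix[] = "got signal ";
    char digits[16];
    size_t ndigits = 0, len = 0;
    unsigned int n = signo < 0 ? 0u : (unsigned int)signo;

    do {                                  /* least significant digit first */
        digits[ndigits++] = (char)('0' + n % 10);
        n /= 10;
    } while (n != 0);

    if (cap < (sizeof prefix - 1) + ndigits + 1)
        return 0;
    for (size_t i = 0; i < sizeof prefix - 1; i++)
        buf[len++] = prefix[i];
    while (ndigits > 0)                   /* emit digits in reading order */
        buf[len++] = digits[--ndigits];
    buf[len++] = '\n';
    return len;
}

/* Signal handler: write() may change errno, so save and restore it. */
void report_signal(int signo) {
    int saved_errno = errno;
    char text[32];
    size_t len = format_signal_message(signo, text, sizeof text);
    if (len > 0) {
        ssize_t ignored = write(STDOUT_FILENO, text, len);
        (void)ignored;
    }
    errno = saved_errno;
}

/* Usage check, meant for normal code, not for the handler. */
void check_format(void) {
    char buf[32];
    assert(format_signal_message(15, buf, sizeof buf) == 14);
    assert(memcmp(buf, "got signal 15\n", 14) == 0);
}
```

Saving and restoring errno matters because the handler may interrupt code that is about to inspect errno after its own failed call.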

The trick is that a lot of code is not async-signal safe. Since it's so much harder to write, very little code is async-signal safe. For example, objc_msgSend() uses locks and so is not async-signal safe, meaning that you cannot use any Objective-C code in a signal handler. You can't use Objective-C, you can't call malloc(), you can't touch CoreFoundation or most of libc.

How do you get anything accomplished in a signal handler, then? The best bet is usually to do as little as possible in the handler itself, but somehow signal the rest of your program that something needs to be done. For example, let's say you want to reload your configuration file when sent a SIGHUP. If your program never blocks for long, we could write our program like this:

 

volatile sig_atomic_t gReloadConfigFile = 0;

void handler(int signal) {
    gReloadConfigFile = 1;
}
...
while(!done) {
    DoPeriodicProcessing();
    if(gReloadConfigFile) {
        gReloadConfigFile = 0;
        ReloadConfigFile();
    }
}

What if your program often blocks on input or a socket or something of that nature? All is not lost, however, because delivering a signal will automatically unblock your program if it's in the middle of a blocking read(), select(), or similar. You could write your program like this:

 

volatile sig_atomic_t gReloadConfigFile = 0;

void handler(int signal) {
    gReloadConfigFile = 1;
}
...
while(!done) {
    if(gReloadConfigFile) {
        gReloadConfigFile = 0;
        ReloadConfigFile();
    }
    // X
    select(...);
    ProcessData(...);
}

Looks good, right? If your program is blocked in the select(), the signal will dislodge it and the app will reload its configuration file. If the program is busy processing data when it comes in, then the configuration file will be reloaded before you get back to the select(). But... you guessed it! This code isn't completely safe!

The key is the // X comment. If the signal is delivered in that spot, after the check but before entering the select(), the select() will still block, and the configuration file won't be reloaded until some input arrives, which could be much later.

What do we do about this? The best way is to use a signaling mechanism which can be integrated into the select() call, namely a pipe. You can create a pipe using the pipe() system call, then read from it in the select() and write to it in your signal handler. By using the pipe instead of a global variable, you ensure that the select() never blocks when a condition needs to be taken care of. Implementation of this scheme is left as an exercise to the reader.

If you do use this scheme, you have to be careful. (I can hear you saying, "What now???") Make sure to set your pipe to be non-blocking, otherwise your pipe could fill up in the signal handler before it's emptied in your main program, and you'll deadlock. If the write fails because the pipe is full then that's fine anyway, because that means there's already a signal sitting in the pipe ready to be received the next time you go through your app's main loop.
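One way the pipe scheme could look is sketched below. It is an illustration under stated assumptions, not the article's own solution: the names wake_pipe, on_sighup, install_wakeup and wait_for_wakeup are made up, and a real main loop would add its sockets to the same select() call.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <unistd.h>

static int wake_pipe[2] = { -1, -1 };    /* [0] read end, [1] write end */

static void on_sighup(int signo) {
    int saved_errno = errno;
    unsigned char byte = 1;
    ssize_t ignored;
    (void)signo;
    /* If the pipe is full, write() fails with EAGAIN; a wakeup is already queued. */
    ignored = write(wake_pipe[1], &byte, 1);
    (void)ignored;
    errno = saved_errno;
}

static int set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL);
    return flags == -1 ? -1 : fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Creates the non-blocking pipe and installs the SIGHUP handler. */
int install_wakeup(void) {
    struct sigaction sa;
    if (pipe(wake_pipe) == -1)
        return -1;
    if (set_nonblocking(wake_pipe[0]) == -1 || set_nonblocking(wake_pipe[1]) == -1)
        return -1;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sighup;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGHUP, &sa, NULL);
}

/* Waits up to timeout_ms. Returns 1 if a SIGHUP arrived since the last
   call, 0 on timeout, -1 on error. */
int wait_for_wakeup(int timeout_ms) {
    char drain[64];
    int ready;

    assert(wake_pipe[0] >= 0 && wake_pipe[0] < FD_SETSIZE);
    for (;;) {
        fd_set readable;
        struct timeval tv;
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        FD_ZERO(&readable);
        FD_SET(wake_pipe[0], &readable);
        ready = select(wake_pipe[0] + 1, &readable, NULL, NULL, &tv);
        if (ready >= 0 || errno != EINTR)
            break;                        /* on EINTR the byte is in the pipe: retry */
    }
    if (ready <= 0)
        return ready;
    while (read(wake_pipe[0], drain, sizeof drain) > 0)
        ;                                 /* empty the pipe; stops on EAGAIN */
    return 1;
}
```

A main loop would call install_wakeup() once and reload its configuration whenever wait_for_wakeup() returns 1. Because the wakeup is a byte sitting in the pipe rather than a flag checked before select(), a signal that lands at the // X spot still makes the next select() return immediately.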
