• OS implements scheduler – determines which threads execute
• Scheduling may execute threads in arbitrary order
• Without proper synchronization, code can execute non-deterministically
• Suppose we have two threads: thread 1 reads a variable, thread 2 modifies that variable
• Scheduler may execute thread 1 first, then thread 2, or thread 2 first, then thread 1
• Non-determinism creates a race condition – where the behavior/result depends on the order of execution
Threads are controlled by the operating system, not by the program itself. Processes are also controlled by the OS and can be interrupted during execution, but because each process has its own memory space, the shared-resource access problems seen with threads generally do not arise there. A race condition is a situation in which the result depends on the order of execution.
Consider the following program race.c:
#include <pthread.h>
#include <stdio.h>

unsigned int cnt = 0;

void *count(void *arg) { /* thread body */
    int i;
    for (i = 0; i < 100000000; i++)
        cnt++;
    return NULL;
}

int main(void) {
    pthread_t tids[4];
    int i;
    for (i = 0; i < 4; i++)
        pthread_create(&tids[i], NULL, count, NULL);
    for (i = 0; i < 4; i++)
        pthread_join(tids[i], NULL);
    printf("cnt=%u\n", cnt);
    return 0;
}
What is the value of cnt?
[Bryant and O'Hallaron. Computer Systems: A Programmer's Perspective. Prentice Hall, 2003.]
Ideally, the threads should increment cnt 4 × 100000000 times in total, so cnt = 400000000. However, running our code gives:
athena% ./race.o
cnt=137131900
athena% ./race.o
cnt=163688698
athena% ./race.o
cnt=169695163
So, what happened?
• C not designed for multithreading
• No notion of atomic operations in C
• Increment cnt++; maps to three assembly operations:
1. load cnt into a register
2. increment value in register
3. save new register value as new cnt
• So what happens if thread interrupted in the middle?
• Race condition!
The result of the program above is surprising. A key reason is that I had assumed ++ was an atomic operation; it turns out it is not. To truly understand a program, it seems one really needs to study compilers. How do we fix this? Simple: add a lock so the increment becomes atomic. Locking looks like a convenient approach, but as programs grow more complex, locks bring many problems of their own, especially when locking and unlocking are managed by hand, where mistakes are easy to make. As noted above, an important root of the problem is that C was not designed with multithreading in mind. Languages such as Java provide facilities like the synchronized keyword to handle this automatically, greatly easing the programmer's burden.
Let’s fix our code:
#include <pthread.h>
#include <stdio.h>

pthread_mutex_t mutex;
unsigned int cnt = 0;

void *count(void *arg) { /* thread body */
    int i;
    for (i = 0; i < 100000000; i++) {
        pthread_mutex_lock(&mutex);
        cnt++;
        pthread_mutex_unlock(&mutex);
    }
    return NULL;
}

int main(void) {
    pthread_t tids[4];
    int i;
    pthread_mutex_init(&mutex, NULL);
    for (i = 0; i < 4; i++)
        pthread_create(&tids[i], NULL, count, NULL);
    for (i = 0; i < 4; i++)
        pthread_join(tids[i], NULL);
    pthread_mutex_destroy(&mutex);
    printf("cnt=%u\n", cnt);
    return 0;
}
As the code above shows, locking the operations that read and write shared data solves the problem. Of course, lock and unlock must themselves be atomic operations; otherwise the locking process itself would be a race condition.
• Note that new code functions correctly, but is much slower
• C statements not atomic – threads may be interrupted at assembly level, in the middle of a C statement
• Atomic operations like mutex locking must be specified as atomic using special assembly instructions
• Ensure that all statements accessing/modifying shared variables are synchronized
I think the slowdown comes mainly from thread switching; locking and unlocking are atomic operations and should not take much time. That C statements are not atomic is an important source of problems in multithreaded programs; I imagine someone must have invented a thread-safe variant of C. As for using synchronization in every statement that reads or writes shared variables, this is effectively required in Java, which warns you otherwise.
• Semaphore – special nonnegative integer variable s, initially 1, which implements two atomic operations:
• P(s) – wait until s > 0, then decrement s and return
• V(s) – increment s by 1, unblocking a waiting thread
• Mutex – locking calls P(s) and unlocking calls V(s)
• Implemented in <semaphore.h>, part of library rt, not pthread
Semaphores are especially well suited to the producer–consumer model: the producer can keep calling V(s) as it produces data, while the consumer is governed by P(s) and may only proceed to consume when the count is greater than 0, i.e. when there is a "product".
• Initialize semaphore to value:
int sem_init(sem_t *sem, int pshared, unsigned int value);
• Destroy semaphore:
int sem_destroy(sem_t *sem);
• Wait to lock, blocking:
int sem_wait(sem_t *sem);
• Try to lock, returning immediately (0 if now locked, −1 otherwise):
int sem_trywait(sem_t *sem);
• Increment semaphore, unblocking a waiting thread:
int sem_post(sem_t *sem);
• Use a semaphore to track available slots in shared buffer
• Use a semaphore to track items in shared buffer
• Use a semaphore/mutex to make buffer operations synchronous
With semaphores, the producer–consumer model becomes much clearer. Honestly, I never fully understood the earlier lock-based producer–consumer implementation, but this one is easy to follow. The mutex makes the buffer operations atomic, while the ordering imposed by the slot and item semaphores makes production and consumption alternate.
• Function is thread safe if it always behaves correctly when called from multiple concurrent threads
• Unsafe functions fall in several categories:
• accesses/modifies unsynchronized shared variables
• functions that maintain state using static variables – like rand(), strtok()
• functions that return pointers to static memory – like gethostbyname()
• functions that call unsafe functions may be unsafe
I remember that thread safety was an important concept when learning Java; just hearing the term brings a sense of relief. Thread-safe functions are actually fairly easy to identify: a function only needs to avoid using shared data itself, and every function it calls must also be thread safe. Thread safety is like the GPL in having strong "transitivity", so a good practice is to encapsulate the parts that operate on shared data; a classic example of this approach is the database.
• Reentrant function – does not reference any shared data when used by multiple threads
• All reentrant functions are thread-safe (are all thread-safe functions reentrant?)
• Reentrant versions of many unsafe C standard library functions exist:
Unsafe function     Reentrant version
rand()              rand_r()
strtok()            strtok_r()
asctime()           asctime_r()
ctime()             ctime_r()
gethostbyaddr()     gethostbyaddr_r()
gethostbyname()     gethostbyname_r()
inet_ntoa()         (none)
localtime()         localtime_r()
As technology has advanced, many libraries that originally did not consider threads have gained thread-safe functions, so thread-safe functions should be the first choice when designing a program. I suspect not all thread-safe functions are reentrant; for instance, functions that only touch read-only data should also be thread safe. Some functions can also achieve data safety through internal mechanisms, such as atomic operations.
To make your code thread-safe:
• Use synchronization objects around shared variables
• Use reentrant functions
• Use synchronization around functions returning pointers to shared memory (lock-and-copy):
1. lock mutex for function
2. call unsafe function
3. dynamically allocate memory for result; (deep) copy result into new memory
4. unlock mutex
• Defeating deadlock extremely difficult in general
• When using only mutexes, can use the “mutex lock ordering rule” to avoid deadlock scenarios:
A program is deadlock-free if, for each pair of mutexes (s, t) in the program, each thread that uses both s and t simultaneously locks them in the same order.
• Socket – abstraction to enable communication across a network in a manner similar to file I/O
• Uses header <sys/socket.h> (extension of C standard library)
• Network I/O, due to latency, usually implemented asynchronously, using multithreading
• Sockets use client/server model of establishing connections
Network communication is an important application area for threads. Network transmission is unreliable and expensive, so programs need to be optimized through asynchrony, and multithreading is the best way to achieve it. Network transmission is also complex, involving many protocols; to achieve encapsulation and isolation, C and UNIX abstract it as a file, using a file descriptor to represent a network connection, which turns network communication into local file operations.
• Create a socket, getting the file descriptor for that socket:
int socket(int domain, int type, int protocol);
• domain – use constant AF_INET, so we're using the internet; might also use AF_INET6 for IPv6 addresses
• type – use constant SOCK_STREAM for connection-based protocols like TCP/IP; use SOCK_DGRAM for connectionless datagram protocols like UDP (we'll concentrate on the former)
• protocol – specify 0 to use default protocol for the socket type (e.g. TCP)
• returns nonnegative integer for file descriptor, or −1 if couldn't create socket
• Don’t forget to close the file descriptor when you’re done!
The return value is a file descriptor; from then on the socket can be operated on like a local file, even though the file's contents may live on another machine reached over the network connection.
• Using created socket, we connect to server using:
int connect(int fd, struct sockaddr *addr, int addr_len);
• fd – the socket's file descriptor
• addr – the address and port of the server to connect to; for internet addresses, cast data of type struct sockaddr_in, which has the following members:
• sin_family – address family; always AF_INET
• sin_port – port in network byte order (use htons() to convert to network byte order)
• sin_addr.s_addr – IP address in network byte order (use htonl() to convert to network byte order)
• addr_len – size of sockaddr_in structure
• returns 0 if successful
• Using created socket, we bind to the port using:
int bind(int fd, struct sockaddr *addr, int addr_len);
• fd, addr, addr_len – same as for connect()
• note that address should be IP address of desired interface (e.g. eth0) on local machine
• ensure that port for server is not taken (or you may get "address already in use" errors)
• returns 0 if socket successfully bound to port
• Using the bound socket, start listening:
int listen(int fd, int backlog);
• fd – bound socket file descriptor
• backlog – length of queue for pending TCP/IP connections; normally set to a large number, like 1024
• returns 0 if successful
• Wait for a client's connection request (may already be queued):
int accept(int fd, struct sockaddr *addr, int *addr_len);
• fd – socket's file descriptor
• addr – pointer to structure to be filled with client address info (can be NULL)
• addr_len – pointer to int that specifies length of structure pointed to by addr; on output, specifies the length of the stored address (stored address may be truncated if bigger than supplied structure)
• returns (nonnegative) file descriptor for connected client socket if successful
• Send data using the following functions:
int write(int fd, const void *buf, size_t len);
int send(int fd, const void *buf, size_t len, int flags);
• Receive data using the following functions:
int read(int fd, void *buf, size_t len);
int recv(int fd, void *buf, size_t len, int flags);
• fd – socket's file descriptor
• buf – buffer of data to read or write
• len – length of buffer in bytes
• flags – special flags; we'll just use 0
• all these return the number of bytes read/written (if successful)
The fd here should be the one returned by accept().
• Up to now, all I/O has been synchronous – functions do not return until operation has been performed
• Multithreading allows us to read/write a file or socket without blocking our main program code (just put I/O functions in a separate thread)
• Multiplexed I/O – use select() or poll() with multiple file descriptors
• To check if multiple files/sockets have data to read/write/etc.: (include <sys/select.h>)
int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *errorfds, struct timeval *timeout);
• nfds – specifies the total range of file descriptors to be tested (0 up to nfds−1)
• readfds, writefds, errorfds – if not NULL, pointer to set of file descriptors to be tested for being ready to read, write, or having an error; on output, set will contain a list of only those file descriptors that are ready
• timeout – if no file descriptors are ready immediately, maximum time to wait for a file descriptor to be ready
• returns the total number of set file descriptor bits in all the sets
• Note that select() is a blocking function
• fd_set – a mask for file descriptors; bits are set (“1”) if in the set, or unset (“0”) otherwise
• Use the following functions to set up the structure:
• FD_ZERO(&fdset) – initialize the set to have bits unset for all file descriptors
• FD_SET(fd,&fdset) – set the bit for file descriptor fd in the set
• FD_CLR(fd,&fdset) – clear the bit for file descriptor fd in the set
• FD_ISSET(fd,&fdset) – returns nonzero if bit for file descriptor fd is set in the set
select() is like an overbearing butler: every time it checks for activity it not only blocks, it also has to scan all of the descriptors. Powerful, but not very flexible.
• Similar to select(), but specifies file descriptors differently: (include <poll.h>)
int poll(struct pollfd fds[], nfds_t nfds, int timeout);
• fds – an array of struct pollfd structures, whose members fd, events, and revents are the file descriptor, the events to check (OR-ed combination of flags like POLLIN, POLLOUT, POLLERR, POLLHUP), and the result of polling that file descriptor for those events, respectively
• nfds – number of structures in the array
• timeout – number of milliseconds to wait; use 0 to return immediately, or −1 to block indefinitely
Compared with select(), poll() is much more flexible: it checks only the descriptors you specify.