6.087 Practical Programming in C, lec12

Multithreading and concurrency

Preliminaries: Parallel computing

• Parallelism: Multiple computations are done simultaneously.

• Instruction level (pipelining)

• Data parallelism (SIMD)

• Task parallelism (embarrassingly parallel)

• Concurrency: Multiple computations that may be done in parallel.

• Concurrency vs. Parallelism

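Concurrency and parallelism are not the same thing (the original post links to an English article on the difference). Take two simple computing tasks, T1 and T2. Put simply, concurrency means the order in which T1 and T2 execute may be indeterminate, while parallelism means T1 and T2 can run at the same time on multiple CPUs.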

Process vs. Threads

• Process: An instance of a program that is being executed in its own address space. In POSIX systems, each process maintains its own heap, stack, registers, file descriptors, etc.

Communication:

• Shared memory

• Network

• Pipes, Queues

• Thread: A lightweight process that shares its address space with others. In POSIX systems, each thread maintains the bare essentials: registers, stack, signals.

Communication:

• shared address space.

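Each thread must have its own stack because threads do not execute in one linear order; if they all used a single process stack, their call frames would get mixed up. Threads do not need their own heap or file descriptors. I think this is because the heap is comparatively static, that is, its structure is not used to control the flow of execution. File descriptors are indices into the process's table of open files, so they are process-wide as well.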
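The point about shared heap and private stacks can be checked with a small experiment. This is only a sketch: `shared_counter`, `struct probe`, `probe_worker` and `threads_share_heap_not_stack` are hypothetical names made up for illustration, not part of the lecture.

```c
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* All names below are hypothetical, for illustration only. */
static int shared_counter = 0;        /* global: one copy seen by every thread */

struct probe {
    int *heap_cell;                   /* points into the heap shared by all threads */
    uintptr_t stack_addr;             /* address of a local variable in the worker */
};

static void *probe_worker(void *arg)
{
    struct probe *p = arg;
    int worker_local = 0;             /* lives on the worker thread's own stack */

    p->stack_addr = (uintptr_t)&worker_local;
    *p->heap_cell = 42;               /* write to the shared heap */
    shared_counter = 7;               /* write to the shared global */
    return NULL;
}

/* Returns 1 if the worker's heap and global writes are visible to the caller
 * and the worker's local variable lived at a different address than the
 * caller's local variable; returns 0 otherwise. */
int threads_share_heap_not_stack(void)
{
    int caller_local = 0;             /* on the calling thread's stack */
    struct probe p = { malloc(sizeof(int)), 0 };
    pthread_t tid;
    int ok = 0;

    if (p.heap_cell == NULL)
        return 0;
    *p.heap_cell = 0;
    if (pthread_create(&tid, NULL, probe_worker, &p) == 0 &&
        pthread_join(tid, NULL) == 0) {
        ok = *p.heap_cell == 42 &&
             shared_counter == 7 &&
             p.stack_addr != (uintptr_t)&caller_local;
    }
    free(p.heap_cell);
    return ok;
}
```

Calling `threads_share_heap_not_stack()` from `main` and compiling with `-pthread` should yield 1: the two local variables are alive at the same time, so they cannot occupy the same address, while the heap cell and the global are a single shared copy.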

Multithreaded concurrency

Serial execution:

• All our programs so far have had a single thread of execution: the main thread.

• Program exits when the main thread exits.

Multithreaded:

• Program is organized as multiple concurrent threads of execution.

• The main thread spawns multiple threads.

• The threads may communicate with one another.

• Advantages:

• Improves performance

• Improves responsiveness

• Improves utilization

• Less overhead compared to multiple processes

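The biggest benefit of a multithreaded program is that threads of different kinds can be interleaved. By exploiting the huge speed gap between humans and machines, this gives the user the illusion that everything is running at once. It has been most successful in graphical user interfaces and in data loading, which is really the M and the V of MVC. Come to think of it, many things in the world are connected in surprising ways, like mathematics and machines, or art and mathematics.

Using several processes to cooperate on one task gives stronger memory protection (each has its own virtual address space), but it also has drawbacks: 1. switching between processes is expensive; 2. the number of processes the operating system can manage effectively is limited; 3. synchronizing data shared between processes is costly.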

Multithreaded programming

Even in C, multithreaded programming may be accomplished in several ways:

• Pthreads: POSIX C library.

• OpenMP

• Intel Threading Building Blocks

• Cilk (from CSAIL!)

• Grand Central Dispatch

• CUDA (GPU)

• OpenCL (GPU/CPU)

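The learning curve seems to get steeper with each item on the list. I plan to learn OpenMP, which I have heard is fairly easy to pick up and gives a decent speedup.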

Not all code can be made parallel

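[Figure 1 (image not preserved): example loops; the loop on the right cannot be parallelized]

The loop on the right cannot be parallelized because of a data dependence, which I call "data recursion". Recursive programs with strong dependences (that is, what we normally mean by a recursive program) are very hard to parallelize.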
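Since the figure is not preserved here, the following is a stand-in sketch of the same idea, not the lecture's own code. `scale` and `prefix_sum` are hypothetical functions chosen for illustration: the first has independent iterations, the second has a loop-carried dependence.

```c
#include <stddef.h>

/* Hypothetical example. Independent iterations: out[i] depends only on
 * in[i], so the iterations could be divided among threads in any order. */
void scale(const double *in, double *out, size_t n, double k)
{
    for (size_t i = 0; i < n; i++)
        out[i] = k * in[i];
}

/* Hypothetical example. Loop-carried dependence: a[i] needs the value of
 * a[i - 1] computed by the previous iteration, so the iterations cannot
 * simply be run at the same time. */
void prefix_sum(double *a, size_t n)
{
    for (size_t i = 1; i < n; i++)
        a[i] += a[i - 1];
}
```

With `{1, 2, 3, 4}` as input, `scale` with `k = 2` produces `{2, 4, 6, 8}` whatever order the iterations run in, whereas `prefix_sum` produces `{1, 3, 6, 10}` only if the iterations run in order.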

Pthread

API:

Thread management: creating, joining, attributes

pthread_

Mutexes: create, destroy mutexes

pthread_mutex_

Condition variables: create, destroy, wait, signal

pthread_cond_

Synchronization: read/write locks and barriers

pthread_rwlock_, pthread_barrier_

API:

#include <pthread.h>

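gcc -Wall -O0 -o <output> file.c -pthread (no -l prefix)

Pthreads controls concurrency mainly through locks, somewhat as a database does. The main primitives are mutexes, condition variables and so on.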

Creating threads

int pthread_create(pthread_t *thread, const pthread_attr_t *attr, void *(*start_routine)(void *), void *arg);

• creates a new thread with the attributes specified by attr.

• Default attributes are used if attr is NULL.

• On success, stores the thread ID into thread.

• calls function start_routine(arg) on a separate thread of execution.

• returns zero on success, non-zero on error.

void pthread_exit(void *value_ptr);

• called implicitly when thread function exits.

• analogous to exit().

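Why are the thread function's parameter and return value both of type void *? The parameter is easy to understand: void * is the one pointer type that converts to and from any other object pointer type, so any argument can be passed through it. The return value is also a void *, not void: whatever the thread returns (or passes to pthread_exit) can be collected by pthread_join. In practice, though, threads are relatively independent and mostly communicate through shared memory (the heap or globals) while running on their own stacks, so the return value is often just NULL. A thread terminates when: 1. it calls pthread_exit; 2. start_routine returns; 3. it is cancelled with pthread_cancel.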
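A minimal sketch of pthread_create and pthread_exit used together, assuming each thread receives a pointer to its own work item. `struct task`, `square_id`, `run_tasks` and `MAX_TASKS` are hypothetical names used only for this example.

```c
#include <pthread.h>

#define MAX_TASKS 16                  /* hypothetical limit for this sketch */

/* Hypothetical work item: each thread gets a pointer to its own struct. */
struct task {
    int id;
    int result;
};

static void *square_id(void *arg)
{
    struct task *t = arg;             /* void * converts back without a cast in C */

    t->result = t->id * t->id;
    pthread_exit(NULL);               /* here this has the same effect as `return NULL;` */
}

/* Starts one thread per task and waits for all of them.
 * Returns 0 on success, -1 if n is out of range or a thread could not start. */
int run_tasks(struct task *tasks, int n)
{
    pthread_t tids[MAX_TASKS];
    int started = 0;

    if (n < 0 || n > MAX_TASKS)
        return -1;
    for (; started < n; started++) {
        if (pthread_create(&tids[started], NULL, square_id, &tasks[started]) != 0)
            break;                    /* non-zero return means the thread was not created */
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);  /* wait, so that the results are complete */
    return started == n ? 0 : -1;
}
```

Each thread gets its own `struct task`, so no two threads write to the same memory and no locking is needed in this sketch.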

Synchronization: joining

[Figure 2 (image not preserved)]

int pthread_join(pthread_t thread, void **value_ptr);

• pthread_join() blocks the calling thread until the specified thread terminates.

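• If value_ptr is not NULL, it receives the return status of the target thread.

Other ways to synchronize: mutex, condition variables

If thread T1 joins thread T2, T1 blocks until T2 has finished and only then continues. A thread is created first and joined afterwards, so it must be joinable. POSIX threads are joinable by default; the detach state can also be set explicitly to PTHREAD_CREATE_JOINABLE through attr. This attribute does not delay the start of the thread: a new thread may begin running as soon as pthread_create is called.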
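To illustrate how value_ptr carries a result back through pthread_join, here is a small sketch. `sum_to_n` and `threaded_sum` are hypothetical names; the result is allocated on the heap because the thread's stack is gone once it has terminated.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical worker: sums 1..n and returns the total through a heap
 * allocation, since a pointer to a local variable would dangle. */
static void *sum_to_n(void *arg)
{
    int n = *(int *)arg;
    long *total = malloc(sizeof *total);

    if (total == NULL)
        return NULL;
    *total = 0;
    for (int i = 1; i <= n; i++)
        *total += i;
    return total;                     /* becomes the thread's return status */
}

/* Returns the sum 1..n computed on another thread, or -1 on failure. */
long threaded_sum(int n)
{
    pthread_t tid;
    void *status = NULL;
    long result;

    if (pthread_create(&tid, NULL, sum_to_n, &n) != 0)
        return -1;
    /* Blocks until the worker terminates; status receives its return value. */
    if (pthread_join(tid, &status) != 0 || status == NULL)
        return -1;
    result = *(long *)status;
    free(status);
    return result;
}
```

Passing `&n` is safe here only because `threaded_sum` does not return until the join has completed, so `n` outlives the worker.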

Mutex

• Mutex (mutual exclusion) acts as a "lock" protecting access to the shared resource.

• Only one thread can "own" the mutex at a time. Threads must take turns to lock the mutex.

int pthread_mutex_destroy(pthread_mutex_t *mutex);

int pthread_mutex_init(pthread_mutex_t *mutex, const pthread_mutexattr_t *attr);

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

• pthread_mutex_init() initializes a mutex. If attributes are NULL, default attributes are used.

• The macro PTHREAD_MUTEX_INITIALIZER can be used to initialize static mutexes.

• pthread_mutex_destroy() destroys the mutex.

• Both functions return 0 on success, non-zero on error.

int pthread_mutex_lock(pthread_mutex_t *mutex);

int pthread_mutex_trylock(pthread_mutex_t *mutex);

int pthread_mutex_unlock(pthread_mutex_t *mutex);

• pthread_mutex_lock() locks the given mutex. If the mutex is already locked, the calling thread blocks until it becomes available.

• pthread_mutex_trylock() is the non-blocking version. If the mutex is currently locked, the call returns immediately with EBUSY.

• pthread_mutex_unlock() unlocks the mutex.

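There does not seem to be much to say about mutexes; the underlying principles are covered at length in database systems courses. I am not sure why, but I am not very fond of this approach. It always feels complicated to me.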
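As a sketch of the lock/unlock calls above, the following protects a shared counter with a statically initialized mutex. `counter`, `counter_lock`, `increment_many`, `run_counter` and the two constants are hypothetical names and values chosen for illustration.

```c
#include <pthread.h>

#define NUM_THREADS 4                 /* hypothetical values for this sketch */
#define INCREMENTS 100000

static long counter = 0;              /* shared resource */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

static void *increment_many(void *arg)
{
    (void)arg;                        /* unused */
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&counter_lock);    /* only one thread past this point */
        counter++;                            /* read-modify-write, now exclusive */
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Runs NUM_THREADS incrementing threads and returns the final count. */
long run_counter(void)
{
    pthread_t tids[NUM_THREADS];
    int started = 0;

    counter = 0;
    for (; started < NUM_THREADS; started++) {
        if (pthread_create(&tids[started], NULL, increment_many, NULL) != 0)
            break;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    return counter;                   /* safe to read: all workers have finished */
}
```

If all four threads start, the result is NUM_THREADS * INCREMENTS. Without the lock and unlock calls the increments could interleave and the total could come out lower.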

Condition variables

Sometimes locking or unlocking is based on a run-time condition (examples?). Without condition variables, the program would have to poll the variable/condition continuously.

Consumer:

(a) lock mutex on global item variable

(b) wait for (item>0) signal from producer (mutex unlocked automatically).

(c) wake up when signalled (mutex locked again automatically), unlock mutex and proceed.

Producer:

(1) produce something

(2) lock mutex on global item variable, update item

(3) signal waiting thread(s)

(4) unlock mutex

int pthread_cond_wait(pthread_cond_t *cond, pthread_mutex_t *mutex);

• blocks on a condition variable.

• must be called with the mutex already locked; otherwise the behavior is undefined.

• atomically releases the mutex while the thread waits.

• upon successful return, the mutex will be automatically locked again.

int pthread_cond_broadcast(pthread_cond_t *cond);

int pthread_cond_signal(pthread_cond_t *cond);

• unblocks threads waiting on a condition variable.

• pthread_cond_broadcast() unblocks all threads that are waiting.

• pthread_cond_signal() unblocks one of the threads that are waiting.

• both return 0 on success, non-zero otherwise.

This is very similar to the document/view architecture in MFC, and it is a common situation in I/O as well. It feels like a usage pattern built on top of the mutex.
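The consumer and producer steps listed earlier can be sketched roughly as follows. This is an illustration under assumptions, not the lecture's code: `item` is taken to be a count of available items, and `consumed`, `item_lock`, `item_ready`, `consumer`, `producer` and `produce_and_consume` are hypothetical names. The wait sits in a `while` loop because POSIX allows pthread_cond_wait to wake up spuriously.

```c
#include <pthread.h>

/* Hypothetical shared state for this sketch. */
static int item = 0;                  /* number of items currently available */
static int consumed = 0;              /* number of items taken by the consumer */
static pthread_mutex_t item_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t item_ready = PTHREAD_COND_INITIALIZER;

static void *consumer(void *arg)
{
    int want = *(int *)arg;

    for (int i = 0; i < want; i++) {
        pthread_mutex_lock(&item_lock);               /* (a) lock mutex */
        while (item == 0)                             /* (b) wait for item > 0 */
            pthread_cond_wait(&item_ready, &item_lock);
        item--;                                       /* (c) mutex is held again here */
        consumed++;
        pthread_mutex_unlock(&item_lock);
    }
    return NULL;
}

static void *producer(void *arg)
{
    int count = *(int *)arg;

    for (int i = 0; i < count; i++) {
        pthread_mutex_lock(&item_lock);               /* (2) lock, update item */
        item++;
        pthread_cond_signal(&item_ready);             /* (3) signal a waiting thread */
        pthread_mutex_unlock(&item_lock);             /* (4) unlock mutex */
    }
    return NULL;
}

/* Produces and consumes `count` items; returns the number consumed, or -1. */
int produce_and_consume(int count)
{
    pthread_t prod, cons;

    item = 0;
    consumed = 0;
    /* The producer never blocks, so it is safe to start it first. */
    if (pthread_create(&prod, NULL, producer, &count) != 0)
        return -1;
    if (pthread_create(&cons, NULL, consumer, &count) != 0) {
        pthread_join(prod, NULL);
        return -1;
    }
    pthread_join(prod, NULL);
    pthread_join(cons, NULL);
    return consumed;
}
```

Because the consumer re-checks `item == 0` under the mutex before each wait, it does not matter whether the producer signals before or after the consumer starts waiting: a signal sent with no waiter is simply not needed, since the consumer sees `item > 0` and skips the wait.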


