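Types of Work-Sharing Constructs:
NOTE: The Fortran workshare construct is discussed later.
DO / for - shares the iterations of a loop across the team of threads. Represents a type of "data parallelism".
SECTIONS - breaks work into separate, independent sections. Each section is executed by one thread. Can be used to implement a type of "functional parallelism".
SINGLE - serializes a section of code, so that only one thread in the team executes it.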
Restrictions:
DO / for Directive
Purpose:
Format:
Fortran
!$OMP DO [clause ...]
         SCHEDULE (type [,chunk])
         ORDERED
         PRIVATE (list)
         FIRSTPRIVATE (list)
         LASTPRIVATE (list)
         SHARED (list)
         REDUCTION (operator | intrinsic : list)
         COLLAPSE (n)
   do_loop
!$OMP END DO [ NOWAIT ]
C/C++
#pragma omp for [clause ...] newline
                schedule (type [,chunk])
                ordered
                private (list)
                firstprivate (list)
                lastprivate (list)
                shared (list)
                reduction (operator: list)
                collapse (n)
                nowait
   for_loop
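Two of the clauses in the format above, reduction and collapse, are easier to follow with a small example. The following is a minimal sketch, not part of the original tutorial; the names grid_sum, ROWS and COLS are hypothetical and chosen only for illustration.

```c
#define ROWS 4   /* hypothetical grid size, for illustration only */
#define COLS 5

/* Sums every element of a ROWS x COLS grid.
 * collapse(2) merges the two nested loops into a single iteration space
 * of ROWS*COLS iterations that is shared among the team.
 * reduction(+:total) gives each thread a private partial sum and adds
 * the partial sums into total when the loop ends. */
long grid_sum(int grid[ROWS][COLS])
{
    long total = 0;

    #pragma omp parallel shared(grid)
    {
        #pragma omp for collapse(2) reduction(+:total)
        for (int i = 0; i < ROWS; i++)
            for (int j = 0; j < COLS; j++)
                total += grid[i][j];
    }
    return total;
}
```

Call grid_sum from main and compile with the OpenMP option of your compiler (for example -fopenmp with GCC). Without that option the pragmas are ignored and the function still returns the same sum, computed serially.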
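Clauses:
SCHEDULE (type [,chunk]) accepts the following types:
STATIC
Loop iterations are divided into pieces of size chunk and then statically assigned to threads. If chunk is not specified, the iterations are divided as evenly as possible among the threads.
DYNAMIC
Loop iterations are divided into pieces of size chunk and dynamically scheduled among the threads. When a thread finishes one chunk, it is assigned the next one. The default chunk size is 1.
GUIDED
For a chunk size of 1, the size of each chunk is the number of remaining iterations divided by the number of threads, decreasing to 1. For a chunk size of K (greater than 1), chunk sizes are determined in the same way, except that no chunk contains fewer than K iterations. The default chunk size is 1.
RUNTIME
The scheduling decision is deferred until run time and taken from the environment variable OMP_SCHEDULE. It is illegal to specify a chunk size for this clause.
AUTO
The scheduling decision is delegated to the compiler and/or the runtime system.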
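The effect of SCHEDULE(STATIC, chunk) can be observed directly by recording which thread runs each iteration. The sketch below is an illustration only; record_static_owners, N and CHUNK are hypothetical names, and the serial fallbacks exist just so the code also builds without OpenMP.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks so the sketch still builds without OpenMP. */
static int omp_get_thread_num(void)  { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

#define N     16  /* hypothetical loop length */
#define CHUNK 2   /* hypothetical chunk size */

/* Stores in owner[i] the number of the thread that executed iteration i
 * under schedule(static, CHUNK), and returns the size of the team. */
int record_static_owners(int owner[N])
{
    int nthreads = 1;

    #pragma omp parallel
    {
        /* One thread records the team size; the others wait at the
         * implicit barrier that ends the single construct. */
        #pragma omp single
        nthreads = omp_get_num_threads();

        #pragma omp for schedule(static, CHUNK)
        for (int i = 0; i < N; i++)
            owner[i] = omp_get_thread_num();
    }
    return nthreads;
}
```

With a static schedule and an explicit chunk size, chunks are handed out round-robin in thread-number order, so iteration i should land on thread (i / CHUNK) % nthreads. Changing the clause to schedule(dynamic, CHUNK) makes the mapping depend on timing, so it may differ from run to run.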
Restrictions:
Fortran - DO Directive Example
      PROGRAM VEC_ADD_DO

      INTEGER N, CHUNKSIZE, CHUNK, I
      PARAMETER (N=1000)
      PARAMETER (CHUNKSIZE=100)
      REAL A(N), B(N), C(N)

!     Some initializations
      DO I = 1, N
        A(I) = I * 1.0
        B(I) = A(I)
      ENDDO
      CHUNK = CHUNKSIZE

!$OMP PARALLEL SHARED(A,B,C,CHUNK) PRIVATE(I)

!$OMP DO SCHEDULE(DYNAMIC,CHUNK)
      DO I = 1, N
        C(I) = A(I) + B(I)
      ENDDO
!$OMP END DO NOWAIT

!$OMP END PARALLEL

      END

C / C++ - for Directive Example
#include <omp.h>
#define CHUNKSIZE 100
#define N 1000

int main(void)
{
    int i, chunk;
    float a[N], b[N], c[N];

    /* Some initializations */
    for (i = 0; i < N; i++)
        a[i] = b[i] = i * 1.0;
    chunk = CHUNKSIZE;

    #pragma omp parallel shared(a,b,c,chunk) private(i)
    {
        /* Iterations are handed out dynamically in pieces of size chunk */
        #pragma omp for schedule(dynamic,chunk) nowait
        for (i = 0; i < N; i++)
            c[i] = a[i] + b[i];
    }   /* end of parallel region */

    return 0;
}
SECTIONS Directive
Purpose:
Format:
Fortran
!$OMP SECTIONS [clause ...]
               PRIVATE (list)
               FIRSTPRIVATE (list)
               LASTPRIVATE (list)
               REDUCTION (operator | intrinsic : list)
!$OMP SECTION
   block
!$OMP SECTION
   block
!$OMP END SECTIONS [ NOWAIT ]
C/C++
#pragma omp sections [clause ...] newline
                     private (list)
                     firstprivate (list)
                     lastprivate (list)
                     reduction (operator: list)
                     nowait
  {
  #pragma omp section newline
     structured_block
  #pragma omp section newline
     structured_block
  }
Clauses:
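Questions:
What happens if the number of threads and the number of SECTIONs are different? More threads than SECTIONs? Fewer threads than SECTIONs?
Which thread executes which SECTION?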
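One way to explore these questions is to record which thread runs each section. The sketch below is an illustration under assumptions, not part of the original tutorial; run_sections, ran_on and NSECTIONS are hypothetical names. The OpenMP specification leaves the assignment of sections to threads up to the implementation, so the thread numbers recorded here may change between runs, compilers and team sizes.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallback so the sketch still builds without OpenMP. */
static int omp_get_thread_num(void) { return 0; }
#endif

#define NSECTIONS 3  /* hypothetical number of sections */

/* Runs NSECTIONS sections, stores in ran_on[k] the number of the thread
 * that executed section k, and returns how many sections were executed. */
int run_sections(int ran_on[NSECTIONS])
{
    int executed = 0;

    #pragma omp parallel
    {
        /* Each section is executed exactly once by some thread in the
         * team, regardless of how many threads there are. */
        #pragma omp sections reduction(+:executed)
        {
            #pragma omp section
            { ran_on[0] = omp_get_thread_num(); executed++; }

            #pragma omp section
            { ran_on[1] = omp_get_thread_num(); executed++; }

            #pragma omp section
            { ran_on[2] = omp_get_thread_num(); executed++; }
        }
    }
    return executed;
}
```

Running this with OMP_NUM_THREADS set to 2 and then to 8 is a quick experiment: the function should return 3 in both cases. With more threads than sections, some threads simply execute no section; with fewer, at least one thread executes more than one.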
Restrictions:
Fortran - SECTIONS Directive Example
      PROGRAM VEC_ADD_SECTIONS

      INTEGER N, I
      PARAMETER (N=1000)
      REAL A(N), B(N), C(N), D(N)

!     Some initializations
      DO I = 1, N
        A(I) = I * 1.5
        B(I) = I + 22.35
      ENDDO

!$OMP PARALLEL SHARED(A,B,C,D), PRIVATE(I)

!$OMP SECTIONS

!$OMP SECTION
      DO I = 1, N
        C(I) = A(I) + B(I)
      ENDDO

!$OMP SECTION
      DO I = 1, N
        D(I) = A(I) * B(I)
      ENDDO

!$OMP END SECTIONS NOWAIT

!$OMP END PARALLEL

      END

C / C++ - sections Directive Example
#include <omp.h>
#define N 1000

int main(void)
{
    int i;
    float a[N], b[N], c[N], d[N];

    /* Some initializations */
    for (i = 0; i < N; i++) {
        a[i] = i * 1.5;
        b[i] = i + 22.35;
    }

    #pragma omp parallel shared(a,b,c,d) private(i)
    {
        #pragma omp sections nowait
        {
            /* First section: element-wise sum */
            #pragma omp section
            for (i = 0; i < N; i++)
                c[i] = a[i] + b[i];

            /* Second section: element-wise product */
            #pragma omp section
            for (i = 0; i < N; i++)
                d[i] = a[i] * b[i];
        }   /* end of sections */
    }   /* end of parallel region */

    return 0;
}
WORKSHARE Directive
Purpose:
Format:
Fortran
!$OMP WORKSHARE
   structured block
!$OMP END WORKSHARE [ NOWAIT ]
Restrictions:
Fortran - WORKSHARE Directive Example
      PROGRAM WORKSHARE

      INTEGER N, I, J
      PARAMETER (N=100)
      REAL AA(N,N), BB(N,N), CC(N,N), DD(N,N), FIRST, LAST

!     Some initializations
      DO I = 1, N
        DO J = 1, N
          AA(J,I) = I * 1.0
          BB(J,I) = J + 1.0
        ENDDO
      ENDDO

!$OMP PARALLEL SHARED(AA,BB,CC,DD,FIRST,LAST)

!$OMP WORKSHARE
      CC = AA * BB
      DD = AA + BB
      FIRST = CC(1,1) + DD(1,1)
      LAST = CC(N,N) + DD(N,N)
!$OMP END WORKSHARE NOWAIT

!$OMP END PARALLEL

      END
SINGLE Directive
Purpose:
Format:
Fortran
!$OMP SINGLE [clause ...]
             PRIVATE (list)
             FIRSTPRIVATE (list)
   block
!$OMP END SINGLE [ NOWAIT ]
C/C++
#pragma omp single [clause ...] newline
                   private (list)
                   firstprivate (list)
                   nowait
   structured_block
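A common use of SINGLE is one-time initialization inside a parallel region. The following is a minimal sketch under assumptions, not an example from the original tutorial; init_once, config and bad_reads are hypothetical names, and 42 is an arbitrary value.

```c
/* One thread initializes config inside a single construct; every thread
 * then reads it. Returns how many times the single block ran and writes
 * to *bad_reads the number of threads that saw an uninitialized value. */
int init_once(int *bad_reads)
{
    int config = 0;      /* shared: written by one thread, read by all */
    int init_count = 0;  /* shared: only touched inside the single block */
    int bad = 0;

    #pragma omp parallel reduction(+:bad)
    {
        #pragma omp single
        {
            config = 42;
            init_count++;
        }
        /* No nowait was given, so all threads wait here at the implicit
         * barrier and see the value written by the single block. */
        if (config != 42)
            bad++;
    }

    *bad_reads = bad;
    return init_count;
}
```

The function should return 1 and report 0 bad reads whatever the team size. If nowait were added to the single construct, the other threads could reach the check before config is written, so the implicit barrier matters in this sketch.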
Clauses:
Restrictions:
PARALLEL DO / parallel for Directive
Fortran - PARALLEL DO Directive Example
      PROGRAM VECTOR_ADD

      INTEGER N, I, CHUNKSIZE, CHUNK
      PARAMETER (N=1000)
      PARAMETER (CHUNKSIZE=100)
      REAL A(N), B(N), C(N)

!     Some initializations
      DO I = 1, N
        A(I) = I * 1.0
        B(I) = A(I)
      ENDDO
      CHUNK = CHUNKSIZE

!$OMP PARALLEL DO
!$OMP& SHARED(A,B,C,CHUNK) PRIVATE(I)
!$OMP& SCHEDULE(STATIC,CHUNK)

      DO I = 1, N
        C(I) = A(I) + B(I)
      ENDDO

!$OMP END PARALLEL DO

      END

C / C++ - parallel for Directive Example
#include <omp.h>
#define N 1000
#define CHUNKSIZE 100

int main(void)
{
    int i, chunk;
    float a[N], b[N], c[N];

    /* Some initializations */
    for (i = 0; i < N; i++)
        a[i] = b[i] = i * 1.0;
    chunk = CHUNKSIZE;

    /* A trailing backslash continues the directive on the next line */
    #pragma omp parallel for \
        shared(a,b,c,chunk) private(i) \
        schedule(static,chunk)
    for (i = 0; i < N; i++)
        c[i] = a[i] + b[i];

    return 0;
}
TASK Directive
Purpose:
Format:
Fortran
!$OMP TASK [clause ...]
           IF (scalar expression)
           UNTIED
           DEFAULT (PRIVATE | FIRSTPRIVATE | SHARED | NONE)
           PRIVATE (list)
           FIRSTPRIVATE (list)
           SHARED (list)
   block
!$OMP END TASK
C/C++
#pragma omp task [clause ...] newline
                 if (scalar expression)
                 untied
                 default (shared | none)
                 private (list)
                 firstprivate (list)
                 shared (list)
   structured_block
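To show how the task format above might be used, here is a small sketch that computes Fibonacci numbers with one task per recursive call. It is an illustration only: fib and fib_task are hypothetical names, and the sketch relies on the taskwait directive, which this section does not describe, to wait for child tasks.

```c
/* Recursive helper: spawns one task for each of the two recursive calls. */
long fib_task(int n)
{
    long x, y;

    if (n < 2)
        return n;

    /* x and y must be shared so the child tasks write to the parent's
     * variables; n is copied into each task with firstprivate. */
    #pragma omp task shared(x) firstprivate(n)
    x = fib_task(n - 1);

    #pragma omp task shared(y) firstprivate(n)
    y = fib_task(n - 2);

    /* Wait for both child tasks before reading x and y. */
    #pragma omp taskwait
    return x + y;
}

/* Entry point: one thread creates the root of the task tree while the
 * rest of the team is available to execute the generated tasks. */
long fib(int n)
{
    long result = 0;

    #pragma omp parallel shared(result)
    {
        #pragma omp single
        result = fib_task(n);
    }
    return result;
}
```

Creating a task for every call is far too fine-grained for real workloads; the point here is only the data-sharing clauses. A practical version would probably stop spawning tasks below some threshold, for example with an if clause on the task directive.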
Clauses and Restrictions: