OpenMP for Fortran




  • OpenMP Directive

     

  • Syntax of OpenMP compiler directive for Fortran:
     !$OMP  DirectiveName Optional_CLAUSES...
    
       ...
    
       ... Program statements between the !$OMP lines
    
       ... are executed in parallel by all threads      
    
       ...
    
     !$OMP  END DirectiveName 
    
    

     

  • Program statements between the two !$OMP lines are executed by multiple threads



     

     

  • Setting the level of parallelism in OpenMP programs

     

  • The number of threads that will be created to execute parallel sections in an OpenMP program is controlled by the environment variable OMP_NUM_THREADS

     

  • To set this environment variable use:
      export OMP_NUM_THREADS=...            
    
    
    
    Example:
    
    
    
      export OMP_NUM_THREADS=8
    
    



     

     

  • Compiling OpenMP programs
    • Fortran
      • Compile:
          f90 -O -c -xopenmp -stackvar Prog.f90    
        
        

         

         

      • Link:
          f90 -O -o Executable \
        
             -xopenmp -stackvar \
        
             Prog1.o Prog2.o ....
        
        


     

     

  • Introductory Example
    • Parallel "Hello World" OpenMP program:
         PROGRAM  Main
      
      
      
         !$OMP PARALLEL
      
      
      
         print *, "Hello World !"                 
      
      
      
         !$OMP END PARALLEL
      
      
      
         END
      
      

       

       

    • Example Program: (Demo above code)                                                

       

       

    • Compile with:
          f90 -O -xopenmp -stackvar openMP01.f90

       

       

    • Run with:
      • export OMP_NUM_THREADS=8
      • a.out

      Make sure you do it on compute.

      You will see "Hello World !" printed EIGHT times. (Remove the !$OMP PARALLEL and !$OMP END PARALLEL lines and you get ONE line.)

       



     

     

  • Defining shared and private (non-shared) variables in parallel section

     

  • Recall:
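    • Fortran has no block-level scopes (apart from the BLOCK construct added in Fortran 2008), so a variable cannot simply be declared inside a parallel section to make it private

    OpenMP for Fortran instead uses clauses (PRIVATE and SHARED) on the directive to define private (non-shared) and shared variables.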


     

     

  • Defining shared and private variables in a PARALLEL section
    • A variable is by default shared among all threads

       

    • A private variable in a PARALLEL section must be specified using the PRIVATE clause

     

     

  • Fortran example of SHARED variable:
       PROGRAM  Main
    
       IMPLICIT NONE
    
    
    
       integer :: N         ! Shared
    
    
    
       N = 1001
    
       print *, "Before parallel section: N = ", N            
    
    
    
       !$OMP PARALLEL
    
       N = N + 1
    
       print *, "Inside parallel section: N = ", N
    
       !$OMP END PARALLEL
    
    
    
       print *, "After parallel section: N = ", N
    
       END
    
    

     

  • Example Program: (Demo above code)                        
    • Prog file: (Shared variable in OpenMP) --- click here

     

     

  • Compile with:
        f90 -O -xopenmp -stackvar openMP02a.f90

     

     

  • Run a few times with:
    • export OMP_NUM_THREADS=8
    • a.out

    You should see that the value of N at the end is not always 1009; it can be less. This is evidence of a race condition: the threads update the shared variable N without synchronization, so some updates are lost.




     

     

  • Fortran example of NON-SHARED (private) variable:
       PROGRAM  Main
    
       IMPLICIT NONE
    
    
    
       integer :: N         ! Program variable; each thread gets a private copy below
    
    
    
       N = 1001
    
       print *, "Before parallel section: N = ", N
    
    
    
       !$OMP PARALLEL PRIVATE(N)
    
       N = N + 1
    
       print *, "Inside parallel section: N = ", N
    
       !$OMP END PARALLEL
    
    
    
       print *, "After parallel section: N = ", N
    
       END
    
    

     

  • Example Program: (Demo above code)                        
    • Prog file: (Private variable in OpenMP) --- click here

     

     

  • Compile with:
        f90 -O -xopenmp -stackvar openMP02b.f90

     

     

  • Run a few times with:
    • export OMP_NUM_THREADS=8
    • a.out

     

     

  • Output:
        Before parallel section: N =  1001            
    
        Inside parallel section: N =  1
    
        Inside parallel section: N =  1
    
        Inside parallel section: N =  1
    
        Inside parallel section: N =  1
    
        Inside parallel section: N =  1
    
        Inside parallel section: N =  1
    
        Inside parallel section: N =  1
    
        Inside parallel section: N =  1
    
        After parallel section: N =  1001
    
    

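    Each thread has its own variable N

    This private N is different from the "program" variable N defined in the main program: it is not initialized from the program variable (the output above shows 1 only because each private copy happened to start at 0), and the program variable still holds 1001 after the parallel section.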
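
  • A minimal sketch of how a private copy can start from the program variable's value. It uses the standard OpenMP clause FIRSTPRIVATE, which the example above does not use; the program name FirstPrivateDemo is made up for illustration.

```fortran
PROGRAM FirstPrivateDemo
   IMPLICIT NONE

   integer :: N

   N = 1001
   print *, "Before parallel section: N = ", N

   ! FIRSTPRIVATE gives every thread its own copy of N,
   ! initialized to the value N had before the region (1001).
   !$OMP PARALLEL FIRSTPRIVATE(N)
   N = N + 1
   print *, "Inside parallel section: N = ", N    ! every thread prints 1002
   !$OMP END PARALLEL

   ! The private copies are discarded; the program variable
   ! is expected to still hold 1001 here.
   print *, "After parallel section: N = ", N
END PROGRAM FirstPrivateDemo
```

    It can be compiled and run the same way as openMP02b.f90 above.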




     

     

  • OpenMP Support function

     

  • Most useful support functions in OpenMP:
    Function Name                                 Effect
    SUBROUTINE omp_set_num_threads(nthreads)      Set the size of the thread team (nthreads is an INTEGER)
    INTEGER omp_get_num_threads()                 Return the size of the current thread team
    INTEGER omp_get_max_threads()                 Return the maximum size of the thread team (OMP_NUM_THREADS if set, otherwise typically the number of processors)
    INTEGER omp_get_thread_num()                  Return the thread ID of the thread that calls this function
    INTEGER omp_get_num_procs()                   Return the number of processors
    LOGICAL omp_in_parallel()                     Return .TRUE. if called inside an active PARALLEL section
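
  • A short sketch that calls the support functions listed above. It assumes the compiler provides the standard omp_lib module (used here instead of EXTERNAL declarations); the program name SupportDemo and the team size 4 are arbitrary choices for illustration.

```fortran
PROGRAM SupportDemo
   USE omp_lib          ! standard module that declares the OpenMP routines
   IMPLICIT NONE

   print *, "Processors available : ", omp_get_num_procs()
   print *, "Max team size        : ", omp_get_max_threads()
   print *, "In parallel section? : ", omp_in_parallel()    ! F: not in a region yet

   CALL omp_set_num_threads(4)   ! request 4 threads for the following regions

   !$OMP PARALLEL
   ! Let only thread 0 report, as in the Hello program
   if (omp_get_thread_num() == 0) then
      print *, "Team size            : ", omp_get_num_threads()
      print *, "In parallel section? : ", omp_in_parallel()
   end if
   !$OMP END PARALLEL
END PROGRAM SupportDemo
```

    The call to omp_set_num_threads takes precedence over the OMP_NUM_THREADS environment variable for the parallel sections that follow it.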

     

     

  • Here is a simple OMP program in Fortran:
       PROGRAM  Main
    
       IMPLICIT NONE
    
    
    
       INTEGER :: nthreads, myid
    
       INTEGER, EXTERNAL :: OMP_GET_THREAD_NUM, OMP_GET_NUM_THREADS
    
    
    
    
    
       !$OMP PARALLEL private(nthreads, myid)
    
    
    
    
    
       myid = OMP_GET_THREAD_NUM()
    
    
    
       print *, "Hello I am thread ", myid
    
    
    
       if (myid == 0) then
    
          nthreads = OMP_GET_NUM_THREADS()
    
          print *, "Number of threads = ", nthreads
    
       end if
    
    
    
       !$OMP END PARALLEL
    
    
    
       END
    
    

     

     

  • Example Program: (OpenMP Fortran program) --- click here        

     

  • Compile using the following command:
        f90 -O -xopenmp -stackvar hello.f90

     

     

  • Run with:
    • export OMP_NUM_THREADS=8
    • a.out

     

     

  • Output:
      Hello I am thread  7
    
      Hello I am thread  5
    
      Hello I am thread  1
    
      Hello I am thread  0
    
      Hello I am thread  2
    
      Number of threads =  8
    
      Hello I am thread  4
    
      Hello I am thread  3
    
      Hello I am thread  6
    
    



     

     

  • Caveat with Fortran
    • Recall:
      • Array indices in Fortran by default start with 1 (ONE)

         

       

       

    • Observed from "Hello" program:
      • Thread IDs start with 0 (ZERO)

         

       

       

    • Caveat:
      • Use ThreadID+1 as index to an array in Fortran !!!
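
    • An alternative sketch: Fortran lets you choose the lower bound of an array, so an array declared with lower bound 0 can be indexed by the thread ID directly. The names ZeroBasedDemo, MAX_THREADS and slot are hypothetical, and the limit of 64 threads is an arbitrary assumption.

```fortran
PROGRAM ZeroBasedDemo
   USE omp_lib
   IMPLICIT NONE

   INTEGER, PARAMETER :: MAX_THREADS = 64          ! hypothetical upper limit
   INTEGER, DIMENSION(0:MAX_THREADS-1) :: slot     ! lower bound is 0, not 1
   INTEGER :: id, nthreads

   slot = -1
   nthreads = 1

   !$OMP PARALLEL PRIVATE(id)
   id = omp_get_thread_num()
   if (id < MAX_THREADS) slot(id) = id * id        ! index directly by thread ID
   if (id == 0) nthreads = omp_get_num_threads()   ! only thread 0 writes nthreads
   !$OMP END PARALLEL

   do id = 0, MIN(nthreads, MAX_THREADS) - 1
      print *, "slot(", id, ") = ", slot(id)
   end do
END PROGRAM ZeroBasedDemo
```

      With the default lower bound of 1, the same program would have to use slot(id+1) everywhere, as the caveat says.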

         



     

     

  • Example OpenMP Program: Find minimum in an array
    • A sequential program in C++ can be found here: ( click here )

       

    • We will write this program using OpenMP in Fortran

       

       

    • Parallel Find Min program in Fortran:
        PROGRAM Min
      
         IMPLICIT NONE
      
      
      
         INTEGER, PARAMETER :: MAX = 10000000
      
      
      
         DOUBLE PRECISION, DIMENSION(MAX) :: x
      
         DOUBLE PRECISION, DIMENSION(10)  :: my_min     ! One slot per thread: at most 10 threads
      
         DOUBLE PRECISION :: rmin
      
      
      
         INTEGER :: num_threads
      
         INTEGER :: i, n
      
         INTEGER :: id, start, stop
      
      
      
         ! ===========================================================
      
         ! Declare the OpenMP functions
      
         ! ===========================================================     
      
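         INTEGER, EXTERNAL :: OMP_GET_THREAD_NUM, OMP_GET_NUM_THREADS

         ! Fill x with test data (assumption: the listing does not show
         ! how x is initialized; random values in [0,1) are used here)
         CALL RANDOM_NUMBER(x)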
      
      
      
      
      
        ! ===================================
      
        ! Parallel section: Find local minima
      
        ! ===================================
      
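      !$OMP  PARALLEL  PRIVATE(i, id, start, stop, n)

         ! num_threads is SHARED so that it is still defined after the
         ! parallel section. One thread stores it; the implicit barrier
         ! at END SINGLE makes the value visible to all threads.
      !$OMP SINGLE
         num_threads = omp_get_num_threads()
      !$OMP END SINGLE

         n = MAX/num_threads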
      
      
      
         id = omp_get_thread_num()
      
      
      
         ! ----------------------------------
      
         ! Find my own starting index
      
         ! ----------------------------------
      
         start = id * n + 1          !! Array start at 1
      
      
      
         ! ----------------------------------
      
         ! Find my own stopping index
      
         ! ----------------------------------
      
         if ( id /= (num_threads-1) ) then
      
            stop = start + n - 1
      
         else
      
            stop = MAX
      
         end if
      
      
      
         ! ----------------------------------
      
         ! Find my own min
      
         ! ----------------------------------
      
         my_min(id+1) = x(start)
      
      
      
         DO i = start+1, stop
      
            IF ( x(i) < my_min(id+1) ) THEN
      
               my_min(id+1) = x(i)
      
            END IF
      
         END DO
      
      
      
      !$OMP END PARALLEL
      
      
      
      
      
        ! ===================================
      
        ! Find min over the local minima
      
        ! ===================================
      
         rmin = my_min(1)
      
      
      
         DO i = 2, num_threads
      
            IF ( my_min(i) < rmin ) THEN
      
               rmin = my_min(i)
      
            END IF
      
         END DO
      
      
      
         print *, "min = ", rmin
      
         END PROGRAM
      
      

       

       

    • Example Program: (Demo above code)                                                

    • Compile with:
          f90 -O -xopenmp -stackvar min-mt1.f90

       

       

    • Run with:
      • export OMP_NUM_THREADS=8
      • a.out


     

     

  • Mutual exclusion synchronization Primitives

     

  • Mutual exclusion (at most one thread at a time executes a given block of statements) is achieved in OpenMP for Fortran using the following directive:
    
    
       !$OMP CRITICAL
    
    
    
           ... statements are guaranteed to be executed
    
           ... by ONE thread at any one time
    
    
    
    
    
       !$OMP END CRITICAL
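
  • As a sketch, the shared-variable example from earlier (N = N + 1 inside a PARALLEL section) can be made reliable by putting the update inside a CRITICAL section. The program name CriticalDemo and the extra variable nthreads are introduced here for illustration only.

```fortran
PROGRAM CriticalDemo
   USE omp_lib
   IMPLICIT NONE

   integer :: N, nthreads

   N = 1001
   nthreads = 1

   !$OMP PARALLEL
   !$OMP CRITICAL
   N = N + 1                            ! only one thread at a time updates N
   nthreads = omp_get_num_threads()     ! every thread stores the same value
   !$OMP END CRITICAL
   !$OMP END PARALLEL

   ! No update is lost: N is always 1001 + number of threads
   print *, "Threads = ", nthreads, "  N = ", N
END PROGRAM CriticalDemo
```

    With OMP_NUM_THREADS=8 this prints N = 1009 on every run, in contrast to the unsynchronized version.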
    
    



     

     

  • Example OpenMP program with synchronization: compute Pi

     

  • Example:
      PROGRAM Compute_PI
    
       IMPLICIT NONE
    
    
    
    
    
       INTEGER, EXTERNAL :: OMP_GET_THREAD_NUM, OMP_GET_NUM_THREADS     
    
    
    
       INTEGER           N, i
    
       INTEGER           id, num_threads
    
       DOUBLE PRECISION  w, x, sum
    
       DOUBLE PRECISION  pi, mypi
    
    
    
    
    
       N = 50000000         !! Number of intervals
    
       w = 1.0d0/N          !! width of each interval
    
    
    
       pi = 0.0d0           !! Shared total; must start at zero
    
    
    
    !$OMP    PARALLEL PRIVATE(i, id, num_threads, x, mypi)
    
    
    
       num_threads = omp_get_num_threads()
    
       id = omp_get_thread_num()
    
    
    
       mypi = 0.0d0
    
    
    
       DO i = id,   N-1,   num_threads
    
         x = w * (i + 0.5d0)
    
         mypi = mypi + w*f(x)
    
       END DO
    
    
    
    
    
    !$OMP CRITICAL
    
       pi = pi + mypi
    
    !$OMP END CRITICAL
    
    
    
    
    
    !$OMP    END PARALLEL
    
    
    
       PRINT *, "Pi = ", pi
    
    
    
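    CONTAINS
    
       ! Integrand (assumption: the listing does not show f;
       ! 4/(1+x*x) is used here because it integrates to pi over [0,1])
       DOUBLE PRECISION FUNCTION f(a)
          DOUBLE PRECISION, INTENT(IN) :: a
          f = 4.0d0 / (1.0d0 + a*a)
       END FUNCTION f
    
       END PROGRAM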
    
    
    
    

     

  • Example Program: (OpenMP compute Pi) --- click here        

     

     

  • Compile with:
        f90 -O -xopenmp -stackvar openMP_compute_pi2.f90

     

     

  • Run a few times with:
    • export OMP_NUM_THREADS=8
    • a.out




     

     

  • Parallel For Loop in OpenMP

    The division of labor (splitting the iterations of a DO loop, the Fortran counterpart of a for-loop, among the threads) can be done in OpenMP through a special Parallel LOOP construct.

     

  • A Parallel Loop construct MUST appear within a Parallel region of the program !

     

  • The syntax of a Parallel LOOP construct in Fortran is:
    
    
       !$OMP    DO
    
    
    
          DO  index = ....
    
              ....            ! Division of labor is taken care of
    
                              ! by the Fortran compiler
    
          END DO
    
    
    
       !$OMP    END DO
    
    

     

     

  • The meaning of this Parallel LOOP construct is to distribute the iterations of the do-loop (for-loop) among the threads.

    Each iteration of the loop is executed exactly once, by exactly one of the threads.

    The loop variable used in the Parallel LOOP construct is by default PRIVATE (other variables are still by default SHARED)
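
  • A small sketch that makes the distribution of iterations visible: every iteration records which thread executed it and how often it was executed. The names ParallelDoDemo, owner and visits are made up for this illustration.

```fortran
PROGRAM ParallelDoDemo
   USE omp_lib
   IMPLICIT NONE

   INTEGER, PARAMETER :: N = 16
   INTEGER, DIMENSION(N) :: owner, visits
   INTEGER :: i

   visits = 0

   !$OMP PARALLEL
   !$OMP DO
   DO i = 1, N
      owner(i)  = omp_get_thread_num()   ! which thread ran iteration i
      visits(i) = visits(i) + 1          ! safe: only one thread touches element i
   END DO
   !$OMP END DO
   !$OMP END PARALLEL

   ! Sequential check: every iteration was executed exactly once
   DO i = 1, N
      print *, "iteration ", i, " ran on thread ", owner(i), &
               " visits = ", visits(i)
   END DO
END PROGRAM ParallelDoDemo
```

    Every line of the output shows visits = 1; which thread ran which iteration depends on the number of threads and on the schedule chosen by the OpenMP implementation.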


     

     

  • Example: compute Pi with parallel DO loop
      PROGRAM Compute_PI
    
       IMPLICIT NONE
    
    
    
       INTEGER           N, i, num_threads
    
       DOUBLE PRECISION  w, x, sum
    
       DOUBLE PRECISION  pi, mypi
    
    
    
    
    
       N = 50000000         !! Number of intervals
    
       w = 1.0d0/N          !! width of each interval
    
    
    
       pi = 0.0d0           !! Shared total; must start at zero
    
    
    
    !$OMP    PARALLEL PRIVATE(x, mypi)
    
    
    
       mypi = 0.0d0
    
    
    
    !$OMP    DO
    
       DO i = 0, N-1                 !! Parallel Loop
    
         x = w * (i + 0.5d0)
    
         mypi = mypi + w*f(x)
    
       END DO
    
    !$OMP    END DO
    
    
    
    
    
    !$OMP CRITICAL
    
       pi = pi + mypi
    
    !$OMP END CRITICAL
    
    
    
    
    
    !$OMP    END PARALLEL
    
    
    
       PRINT *, "Pi = ", pi
    
    
    
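    CONTAINS
    
       ! Integrand (assumption: the listing does not show f;
       ! 4/(1+x*x) is used here because it integrates to pi over [0,1])
       DOUBLE PRECISION FUNCTION f(a)
          DOUBLE PRECISION, INTENT(IN) :: a
          f = 4.0d0 / (1.0d0 + a*a)
       END FUNCTION f
    
       END PROGRAM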
    
    
    
    

     

  • Example Program: (OpenMP compute Pi) --- click here        

     

     

  • Compile with:
        f90 -O -xopenmp -stackvar openMP_compute_pi3.f90

     

     

  • Run with:
    • export OMP_NUM_THREADS=8
    • a.out
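
  • For comparison, a sketch of the same computation using the standard OpenMP REDUCTION clause and the combined PARALLEL DO directive, neither of which is covered above. Each thread accumulates into a private copy of pi, and the copies are added together at the end of the loop, so no CRITICAL section is needed. The integrand f is an assumption (4/(1+x*x), whose integral over [0,1] is pi), and the program name is hypothetical.

```fortran
PROGRAM Compute_PI_Reduction
   IMPLICIT NONE

   INTEGER           N, i
   DOUBLE PRECISION  w, x, pi

   N = 50000000         !! Number of intervals
   w = 1.0d0/N          !! width of each interval
   pi = 0.0d0

   ! Each thread sums into its own private copy of pi;
   ! the copies are combined with + when the loop ends.
   !$OMP PARALLEL DO PRIVATE(x) REDUCTION(+:pi)
   DO i = 0, N-1
      x = w * (i + 0.5d0)
      pi = pi + w*f(x)
   END DO
   !$OMP END PARALLEL DO

   PRINT *, "Pi = ", pi

CONTAINS

   ! Assumed integrand: 4/(1+a*a) integrates to pi over [0,1]
   DOUBLE PRECISION FUNCTION f(a)
      DOUBLE PRECISION, INTENT(IN) :: a
      f = 4.0d0 / (1.0d0 + a*a)
   END FUNCTION f

END PROGRAM Compute_PI_Reduction
```

    The result should agree with the CRITICAL version up to floating-point rounding, since the partial sums may be added in a different order.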



     

     

  • Final Notes

     

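  • The stack size of each thread can be controlled by setting another environment variable (STACKSIZE is specific to the Sun compiler used here and is interpreted in kilobytes by default; the standard OpenMP variable is OMP_STACKSIZE):
      export STACKSIZE=nKBytes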
    
    

     

     

  • For more information on OpenMP, see: http://www.openmp.org




