http://wiki.seas.harvard.edu/geos-chem/index.php/Intel_Fortran_Compiler
This page contains information about the Intel Fortran Compiler (aka "IFORT" compiler). The IFORT compiler is the preferred compiler for use with GEOS-Chem.
NOTE: Recent Intel compiler versions (e.g. IFORT 12, IFORT 13) are now packaged and sold under the name Intel Fortran Composer XE (or something similar). Many GEOS-Chem users still use older versions (i.e. IFORT 11, IFORT 10).
You can find more information about the Intel Fortran Compiler here:
Also, when you install the Intel Fortran compilers, you will normally also install the C and C++ compilers. These compilers are not needed for GEOS-Chem, but they will be needed if you install libraries (e.g. netCDF or HDF5) on your system.
--Bob Y. 11:53, 26 June 2012 (EDT)
The table below shows the wallclock time and mean OH for several GEOS-Chem simulations that were done in order to compare IFORT 10.1.013 vs. IFORT 11.1.069. The simulations had all these things in common:
Property | Value |
---|---|
Machine Type | x86_64 |
Operating System | Linux |
Operating System Release | 2.6.18-128.1.6.el5_lustre.1.8.0 |
CPU Count | 8 CPUs |
CPU Speed | 2659 MHz |
Memory Total | 12008076.000 KB |
Swap Space Total | 18415600.000 KB |
Run | IFORT version | # CPUs | Wall clock (hh:mm:ss) | Parallel % | Mean OH (1e5 molec/cm3) |
---|---|---|---|---|---|
1 | 10.1.013 | 4 | 02:10:51 | 384.7% | 12.5205894678448 |
2 | 11.1.069 | 4 | 02:09:14 | 382.7% | 12.5217430768752 |
3 | 10.1.013 | 8 | 01:17:13 | 757.1% | 12.5205894678448 |
4 | 11.1.069 | 8 | 01:18:44 | 753.4% | 12.5213489686705 |
Here are some plots of surface ozone from the benchmark simulations:
NOTES:
--Bob Y. 16:08, 25 March 2010 (EDT)
The table shows the wallclock time and mean OH for several GEOS-Chem simulations that were done in order to compare Intel Fortran Compiler (IFORT) v9.1 vs. v10.1.013. The simulations had all these things in common:
Run | IFORT version | # CPUs | Optimization options | Wall clock (mm:ss) | Speedup from IFORT 9.1 to IFORT 10.1 | Speedup from 4 to 8 CPUs w/ the same compiler | Mean OH (1e5 molec/cm3) |
---|---|---|---|---|---|---|---|
1 | 9.1 | 4 | -O2 | 36:16 | | | 11.2913755849576 |
2 | 10.1 | 4 | -O2 | 33:55 | 6.48% | | 11.2913755842197 |
3 | 9.1 | 4 | -O3 | 37:26 | | | 11.2913755849576 |
4 | 10.1 | 4 | -O3 | 33:36 | 10.24% | | 11.2913755838124 |
5 | 9.1 | 8 | -O2 | 24:15 | | 33.13% | 11.2913755849576 |
6 | 10.1 | 8 | -O2 | 22:46 | 6.12% | 32.88% | 11.2913755842197 |
7 | 9.1 | 8 | -O3 | 23:36 | | 36.95% | 11.2913755849576 |
8 | 10.1 | 8 | -O3 | 22:31 | 4.59% | 32.99% | 11.2913755838124 |
9 | 9.1 | 8 | -O3 -ipo -no-prec-div -static | 23:03 | | | 11.2913764967223 |
10 | 10.1 | 8 | -O3 -ipo -no-prec-div -static | 21:56 | 4.84% | | 11.0809209646817 |
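The speedup percentages in the table above are consistent with the relative reduction in wall-clock time, (old - new) / old. The sketch below reproduces two of them from the mm:ss values. The helper names to_seconds and speedup_pct are hypothetical and were made up for this illustration; they are not part of GEOS-Chem.

```shell
# Convert a wall-clock time given as mm:ss into seconds.
# awk reads both fields as decimal numbers, so "08" is 8, not octal.
to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 60 + $2 }'
}

# Percent reduction in wall-clock time going from $1 (old) to $2 (new).
speedup_pct() {
    awk -v old="$(to_seconds "$1")" -v new="$(to_seconds "$2")" \
        'BEGIN { printf "%.2f\n", 100 * (old - new) / old }'
}

speedup_pct 36:16 33:55   # run 1 vs. run 2: IFORT 9.1 -> 10.1 (4 CPUs, -O2)
speedup_pct 36:16 24:15   # run 1 vs. run 5: 4 -> 8 CPUs (IFORT 9.1, -O2)
```

The two calls print 6.48 and 33.13, matching the entries for runs 2 and 5.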
NOTES about the table:
PLOTS:
TAKE-HOME MESSAGE:
OUR RECOMMENDATIONS:
--Bob Y. 16:46, 16 April 2008 (EDT)
Upgrading to IFORT 10.1 does not seem to fix the stacksize problem listed below. You still need to manually reset the stacksize limit to a large positive number for both Linux and Altix platforms.
--Bob Y. 12:41, 25 April 2008 (EDT)
In this section we present information about the various optimization options available in the Intel Fortran Compiler.
Here is a quick reference table of IFORT's optimization options (taken from the online Intel Fortran Compiler User and Reference Guides).
Option | Description |
---|---|
-O0 | Turns off all optimizations. Math expressions will be evaluated in the same order in which they are written, which is necessary for debugging. If you are using a debugger (such as Totalview), compile with -g -O0. |
-O1 | Enables optimizations for speed and disables some optimizations that increase code size and affect speed. The -O1 option may improve performance for applications with very large code size, many branches, and execution time not dominated by code within loops. Setting -O1 automatically sets the following options:
|
-O2 (aka -O) | Enables optimizations for speed. This is the generally recommended optimization level. This option also enables:
On Linux and Mac OS X systems, if -g is specified, -O2 is turned off and -O0 is the default unless -O2 (or -O1 or -O3) is explicitly specified in the command line together with -g. |
-O3 | Enables -O2 optimizations plus more aggressive optimizations, such as prefetching, scalar replacement, and loop and memory access transformations. Enables optimizations for maximum speed, such as:
On Linux and Mac OS X systems, the -O3 option sets option -fomit-frame-pointer. The -O3 optimizations may not cause higher performance unless loop and memory access transformations take place. The optimizations may slow down code in some cases compared to -O2 optimizations. The -O3 option is recommended for applications that have loops that heavily use floating-point calculations and process large data sets. |
--Bob Y. 16:28, 29 February 2012 (EST)
In this section, we present information about the compilation and optimization options that are invoked when you compile a GEOS-Chem simulation.
Here are the IFORT compilation options currently used by GEOS-Chem:
Option | Description |
---|---|
Normal compiler settings | |
-cpp | Turns on the C-preprocessor, to evaluate #if and #define statements in the source code. |
-w | Suppresses all compiler warnings. This is mainly a convenience to prevent excessive output to the screen or log file. NOTE: Most compiler warnings are harmless. Execution does not stop when a warning is displayed, unlike an error message, which causes program execution to halt at the point where the error occurred. |
-O2 | Optimizes the source code for speed, without taking too many liberties with numerical precision. For more information, please see the optimization options section above. |
-auto | This option places local variables (scalars and arrays of all types), except those declared as SAVE, on the run-time stack. It is as if the variables were declared with the AUTOMATIC attribute. It does not affect variables that have the SAVE attribute or ALLOCATABLE attribute, or variables that appear in an EQUIVALENCE statement or in a common block. |
-noalign | Prevents the compiler from padding bytes anywhere in common blocks and structures. Padding can affect numerical precision. |
-convert big_endian | Specifies that the format will be big endian for integer data and big endian IEEE floating-point for real and complex data. This only affects file I/O to/from binary files (such as binary punch files) but not ASCII, netCDF, or other file formats. |
-vec-report0 | Tells the compiler to suppress printing "LOOP HAS BEEN VECTORIZED" messages. This reduces the amount of output that is sent to the screen and/or GEOS-Chem log file. |
-fp-model source | Rounds intermediate results to source-defined precision and enables value-safe optimizations. Basically, this tells the compiler not to take too many liberties with how numerical expressions are evaluated. For more information about this option, please see our precision-safe optimization section below. This option can be disabled by compiling GEOS-Chem with the PRECISE=no Makefile option. |
Special compiler settings | |
-r8 | This option tells the compiler to treat variables that are declared as REAL as REAL*8 (as opposed to REAL*4). NOTE: This option is not used globally, but is only applied to certain individual files (mostly from third-party codes like ISORROPIA). Current GEOS-Chem programming practice is to use either REAL*4 or REAL*8 instead of REAL, which avoids confusion. |
-mcmodel=medium | This option is used to tell IFORT to use more than 2GB of static memory. This avoids a specific type of memory error that can occur if you compile GEOS-Chem for use with an extremely high-resolution grid (e.g. 0.25° x 0.3125° nested grid). In GEOS-Chem v9-01-03 and higher versions, -mcmodel=medium is set by default. |
-i-dynamic (aka -shared-intel) | This option needs to be used in conjunction with -mcmodel=medium. It causes Intel-provided libraries to be linked in dynamically instead of statically (which is the default). In GEOS-Chem v9-01-03 and higher versions, -i-dynamic is set by default. |
-ipo | This option enables interprocedural optimization between files. This is also called multifile interprocedural optimization (multifile IPO) or Whole Program Optimization (WPO). When you specify this option, the compiler performs inline function expansion for calls to functions defined in separate files. NOTE: Yuxuan Wang found that this option was useful for certain nested-grid simulations. See this wiki post below for more information. |
-static | This option prevents linking with shared libraries. It causes the executable to link all libraries statically. NOTE: Yuxuan Wang found that this option was useful for certain nested-grid simulations. See this wiki post below for more information. |
Settings only used for debugging | |
-g | Tells the compiler to generate full debugging information in the object file. This will cause a debugger (like Totalview) to display the actual lines of source code, instead of hexadecimal addresses (which is gibberish to anyone except hardware engineers). |
-O0 | Turns off all optimization. Source code instructions (e.g. DO loops, IF blocks) and numerical expressions are evaluated in precisely the order in which they are listed, without being internally rewritten by the optimizer. This is necessary for using a debugger (like Totalview). |
-CB | Check for array-out-of-bounds errors. This is invoked when you compile GEOS-Chem with the BOUNDS=yes Makefile option. NOTE: Only use -CB for debugging, as this option will cause GEOS-Chem to execute more slowly! |
-traceback | This option tells the compiler to generate extra information in the object file to provide source file traceback information when a severe error occurs at run time. When the severe error occurs, source file, routine name, and line number correlation information is displayed along with call stack hexadecimal addresses (program counter trace). This option increases the size of the executable program, but has no impact on run-time execution speeds. It functions independently of the debug option. |
--Bob Y. 17:34, 29 February 2012 (EST)
The normal GEOS-Chem build uses the following IFORT compiler flags:
-cpp -w -O2 -auto -noalign -convert big_endian -vec-report0 -fp-model source -openmp
whereas a debugging run (meant to execute in a debugger such as TotalView) will typically use these flags:
-cpp -w -O0 -auto -noalign -convert big_endian -g -CB -traceback
NOTE: In order to avoid running out of memory if you are compiling GEOS-Chem at extremely high resolution (e.g. the 0.25° x 0.3125° nested grids), we recommend adding the following flags:
-mcmodel=medium -i-dynamic
These are automatically set when you compile with the NETCDF=yes or HDF=yes Makefile options (in GEOS-Chem v9-01-03 and higher).
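As a rough illustration of how the two flag sets above relate to a DEBUG-style switch, the sketch below prints one set or the other. The function ifort_flags is a hypothetical helper written for this page; it is not how the GEOS-Chem Makefiles are implemented, and it simply echoes the flag lists quoted above.

```shell
# Hypothetical helper (not part of GEOS-Chem): print the IFORT flags quoted
# on this page for a normal build, or for a debugging build when $1 is "yes".
ifort_flags() {
    if [ "$1" = "yes" ]; then
        # Debugging build: no optimization, debug symbols, bounds checking
        echo "-cpp -w -O0 -auto -noalign -convert big_endian -g -CB -traceback"
    else
        # Normal build: -O2 with value-safe floating point and OpenMP
        echo "-cpp -w -O2 -auto -noalign -convert big_endian -vec-report0 -fp-model source -openmp"
    fi
}

ifort_flags yes   # flags for a debugger run
ifort_flags no    # flags for a normal run
```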
--Bob Y. 17:34, 29 February 2012 (EST)
You can use the following Intel Fortran Compiler options to select how aggressively you would like to optimize floating-point operations.
-fp-model fast
Example source code:
REAL T0, T1, T2; ... T0 = 4.0E + 0.1E + T1 + T2;
When this option is specified, the compiler applies the following semantics:
Using these semantics, the following shows some possible ways the compiler may interpret the original code:
REAL T0, T1, T2; ... T0 = (T1 + T2) + 4.1E;
or
REAL T0, T1, T2; ... T0 = (T1 + 4.1E) + T2;
-fp-model source (aka -fp-model precise)
Example source code:
REAL T0, T1, T2; ... T0 = 4.0E + 0.1E + T1 + T2;
When this option is specified, the compiler applies the following semantics:
Using these semantics, the following shows a possible way the compiler may interpret the original code:
REAL T0, T1, T2; ... T0 = ((4.1E + T1) + T2);
If you do not select any -fp-model option, the Intel Fortran Compiler will default to -fp-model fast. As you can see from the examples above, this may not optimize the code in the same way each time. This can lead to minor numerical noise in the output, as was seen in ISORROPIA II.
To avoid this situation, we recommend compiling all source code files with -fp-model source. This will be the new default in GEOS-Chem v9-01-02.
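The reason that reordering a sum can change the answer is that floating-point addition is not associative. The sketch below shows this with awk, which does its arithmetic in double precision. It illustrates the arithmetic only and says nothing about the code that IFORT actually generates; the function name fp_assoc_demo and the values 0.1, 0.2 and 0.3 are chosen purely for illustration.

```shell
# Show that (a + b) + c and a + (b + c) can differ in floating point.
fp_assoc_demo() {
    awk 'BEGIN {
        a = 0.1; b = 0.2; c = 0.3
        left  = (a + b) + c      # evaluated strictly left to right
        right = a + (b + c)      # a reassociated form of the same sum
        printf "%.17g\n%.17g\n", left, right
        result = (left == right) ? "same" : "differ"
        print result
    }'
}

fp_assoc_demo
```

The last line of output is "differ": the two groupings round differently in the final digits, which is the kind of numerical noise that value-safe evaluation is meant to avoid.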
Reference: Intel® Fortran Floating-point Operations; Document Number: 315892-003US
--Bob Y. 17:01, 25 August 2011 (EDT)
Yuxuan Wang told us about the optimization options -ipo and -static, and said these options would speed up the simulations. I've tested these options on our system at Harvard. The run with the new options shows very tiny differences (much less than 1% over 1 month) compared to a run with the old options only. For a full-chemistry run (43 tracers) at 4x5 resolution on 4 processors, the run time is about 10% shorter than previously.
These options are especially effective for transport. So in simulations with faster chemistry (like tagged-tracer simulations), we expect to see a larger gain in time. For example, the time for a methane run is shortened by about 30%.
To use these options, compile GEOS-Chem with the IPO=yes Makefile option, e.g.
make -j4 IPO=yes
--Ccarouge 15:54, 8 September 2009 (EDT)
--Bob Y. 17:50, 29 February 2012 (EST)
If you would like to run your code in a debugger, such as Totalview, you must use the following compiler switches:
-g -O0
Using -O0 will ensure that the source code gets executed in the same order in which it is written (i.e. this disables all compiler optimizations). The -g switch will tell the debugger to display lines of source code instead of hexadecimal memory addresses (which are more or less gibberish unless you are a hardware engineer).
GEOS-Chem will add these switches automatically for you if you compile with the DEBUG=yes option.
--Bob Y. 15:28, 22 February 2012 (EST)
Overall machine memory limits are set with the Unix limit command. If you use csh or tcsh, you can set the following commands in your ~/.cshrc file:
# Max out machine limits
limit cputime      unlimited   # NOTE: "Unlimited" is not truly unlimited. It
limit filesize     unlimited   # will set the given limit to the maximum value
limit datasize     unlimited   # determined by your hardware configuration
limit stacksize    unlimited
limit coredumpsize unlimited
limit memoryuse    unlimited
limit vmemoryuse   unlimited
limit descriptors  unlimited
limit memorylocked unlimited
limit maxproc      unlimited
Or if you use bash, you can add these commands to your ~/.bashrc file:
# Max out machine limits
ulimit -t unlimited   # cputime
ulimit -f unlimited   # filesize
ulimit -d unlimited   # datasize
ulimit -s unlimited   # stacksize
ulimit -c unlimited   # coredumpsize
ulimit -m unlimited   # memoryuse
ulimit -v unlimited   # vmemoryuse
ulimit -n unlimited   # descriptors
ulimit -l unlimited   # memorylocked
ulimit -u unlimited   # maxproc
NOTE: depending on your particular OS build (Linux, CentOS, Fedora, Ubuntu), not all of these limits will be used.
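To confirm that the stacksize setting took effect in a new shell, you can inspect the value reported by ulimit -s. The sketch below is one way to do that. The function describe_stack_limit is a hypothetical helper, and it assumes the limit is reported either as the word "unlimited" or as a plain number (typically in kilobytes).

```shell
# Hypothetical helper: describe a stack size limit as reported by "ulimit -s".
describe_stack_limit() {
    case "$1" in
        unlimited)   echo "stack size is unlimited" ;;
        ''|*[!0-9]*) echo "unrecognized limit: $1" ;;
        *)           echo "stack size is capped at $1" ;;
    esac
}

# Report the limit of the current shell
describe_stack_limit "$(ulimit -s)"
```

If this reports a cap even though your startup file requests unlimited, the system administrator may have imposed a hard limit that your shell cannot raise.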
It is important to set the stacksize memory to the maximum value, because this will determine the amount of memory available for temporary variables, which are:
However, one quirk is that the stacksize memory for child processes (i.e. processes spawned by CPUs within !$OMP PARALLEL DO loops) is not set by the stacksize limit, but instead by the KMP_STACKSIZE environment variable. If KMP_STACKSIZE is not set to a high enough value, then your GEOS-Chem simulation may think it doesn't have enough memory to proceed, and may die with a segmentation fault error.
The fix for this situation is to make sure that you set KMP_STACKSIZE to a high value. It's OK if the value you give to KMP_STACKSIZE is larger than the largest amount of memory on your system. As long as it's set to a high positive number it will work.
If you use csh or tcsh, you can add this command to your ~/.cshrc file.
# Reset the child stack size to a large positive number
# (It's OK if this is larger than the max value, it's just a kludge)
setenv KMP_STACKSIZE 500000000
Or if you use bash, add this command to your ~/.bashrc file:
# Reset the child stack size to a large positive number
# (It's OK if this is larger than the max value, it's just a kludge)
KMP_STACKSIZE=500000000
export KMP_STACKSIZE
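Before starting a long run, it can be worth checking that KMP_STACKSIZE is actually set in the environment of the job. The sketch below is a minimal check, assuming the value is a plain positive integer like the one used above. The function check_kmp_stacksize is a hypothetical helper, not part of GEOS-Chem or of the Intel runtime.

```shell
# Hypothetical helper: print "ok" if the argument is a positive integer,
# otherwise print "invalid" (covers unset, empty, and non-numeric values).
check_kmp_stacksize() {
    case "$1" in
        ''|*[!0-9]*) echo "invalid"; return ;;
    esac
    if [ "$1" -gt 0 ]; then
        echo "ok"
    else
        echo "invalid"
    fi
}

# Check the value in the current environment (empty string if unset)
check_kmp_stacksize "${KMP_STACKSIZE:-}"
```

A run script could call this check and stop with a message when it prints "invalid", rather than failing later with a segmentation fault.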
Resetting the KMP_STACKSIZE environment variable in this manner usually will correct the following errors:
--Bob Y. 11:18, 26 June 2012 (EDT)
Debra Weisenstein wrote:
ifort -cpp -w -O2 -auto -noalign -convert big_endian -vec-report0 -mcmodel=large -i-dynamic -fp-model source -openmp -Dmultitask -I../Headers -module ../mod -I/home/dkweis/include -c diag49_mod.F
Fatal compilation error: Out of memory asking for 36864.

ifort -cpp -w -O2 -auto -noalign -convert big_endian -vec-report0 -fp-model source -openmp -Dmultitask -DAPM -I../Headers -module ../mod -c dao_mod.F
Fatal compilation error: Out of memory asking for 20480.
Bob Yantosca wrote:
limit
[54 bmy]% limit
datasize unlimited
stacksize unlimited
... etc ...
--Bob Y. 14:58, 25 July 2013 (EDT)
Prasad Kasibhatla wrote:
ifort -cpp -w -O2 -auto -noalign -convert big_endian -vec-report0 -mcmodel=medium -i-dynamic -fp-model source -openmp -Dmultitask -I../Headers -module ../mod -I/opt/geos-netcdf-4/include -c -free strat_chem_mod.F90
: catastrophic error: **Internal compiler error: segmentation violation signal raised** Please report this error along with the circumstances in which it occurred in a Software Problem Report. Note: File and line given may not be explicit cause of this error.
compilation aborted for strat_chem_mod.F90 (code 1)
make[3]: *** [strat_chem_mod.o] Error 1
make[3]: Leaving directory `/nfs/fire/psk9/Code.v9-01-03/GeosCore'
make[2]: *** [lib] Error 2
make[2]: Leaving directory `/nfs/fire/psk9/Code.v9-01-03/GeosCore'
make[1]: *** [all] Error 2
make[1]: Leaving directory `/nfs/fire/psk9/Code.v9-01-03/GeosCore'
make: *** [all] Error 2
Bob Yantosca wrote:
[67 bmy Code.v9-02]% ifort -V
Intel(R) Fortran Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 12.1.3.293 Build 20120212
Copyright (C) 1985-2012 Intel Corporation. All rights reserved.
FOR NON-COMMERCIAL USE ONLY
! Determine mean tropopause level for the period
!$OMP PARALLEL DO &
!$OMP DEFAULT( SHARED ) &
!$OMP PRIVATE( I, J )
DO J = 1,JJPAR
DO I = 1,IIPAR
   LTP(I,J) = NINT( TPauseL(I,J) / TPauseL_Cnt )
ENDDO
ENDDO
!$OMP END PARALLEL DO
#if defined( NESTED_NA ) || defined( NESTED_CH ) || defined( NESTED_EU )
! This method only works for a global domain.
! It could be modified for nested domains if the total mass flux across the
! boundaries during the period is taken into account.
RETURN
#endif
#if defined( NESTED_NA ) || defined( NESTED_CH ) || defined( NESTED_EU )
! This method only works for a global domain.
! It could be modified for nested domains if the total mass flux across the
! boundaries during the period is taken into account.
RETURN
!------------------------------------------------------------------------------
! Prior to 10/5/12:
! Since the rest of this code isn't needed for the nested grid, wrap it
! in an #else statement. This might help the code to compile for IFORT 12.
!#endif
!------------------------------------------------------------------------------
#else
! Determine mean tropopause level for the period
!$OMP PARALLEL DO &
!$OMP DEFAULT( SHARED ) &
!$OMP PRIVATE( I, J )
DO J = 1,JJPAR
DO I = 1,IIPAR
   LTP(I,J) = NINT( TPauseL(I,J) / TPauseL_Cnt )
ENDDO
ENDDO
!$OMP END PARALLEL DO
... etc ...
#endif
END SUBROUTINE Calc_STE
--Bob Y. 13:52, 5 October 2012 (EDT)
David Lary wrote:
ld: library not found for -lcrt1.10.6.o
make[2]: *** [exe] Error 1
make[1]: *** [all] Error 2
make: *** [all] Error 2
Bob Yantosca replied:
/Developer/SDKs/MacOSX10.4u.sdk/usr/lib. (i.e. $(SDKROOT)/usr/lib)
--Bob Y. 10:43, 26 June 2012 (EDT)
Geert Vinken wrote:
biofuel_mod.F(363): error #5082: Syntax error, found IDENTIFIER 'X1' when expecting one of: ( % [ : . = =>
GENERIC8_1x1 = GENERIC_1x1
----------------------------^
biofuel_mod.F(363): error #6404: This name does not have a type, and must have an explicit type. [GENERIC8_1]
GENERIC8_1x1 = GENERIC_1x1
------------------^
biofuel_mod.F(363): error #6460: This is not a field name that is defined in the encompassing structure. [X1]
GENERIC8_1x1 = GENERIC_1x1
----------------------------^
biofuel_mod.F(363): error #6366: The shapes of the array expressions do not conform. [X1]
GENERIC8_1x1 = GENERIC_1x1
----------------------------^
compilation aborted for biofuel_mod.F (code 1)
Bob Yantosca replied:
REAL*4, PARAMETER :: PI = 3.14159e0
REAL*8, PARAMETER :: PI = 3.141592658979323d0
REAL*4, PARAMETER :: PI = 3.14159e0_4
REAL*8, PARAMETER :: PI = 3.141592658979323e0_8
GENERIC8_1x1
--Bob Y. 18:13, 23 May 2012 (EDT)
If your code uses many large arrays, or if you are compiling an ultra-fine resolution version of GEOS-Chem (e.g. a 0.25° x 0.3125° GEOS-5.7.2 nested grid), then you may see this type of error:
Relocation truncated to fit: R_X86_64_32S against `.bss' Error"
The wording you get may differ slightly from the example shown above.
Long story short: IFORT is telling you that your program is trying to use more than 2GB of statically-allocated data (i.e. data space that is not declared with an ALLOCATABLE statement) at compile time. The default setting in IFORT is to expect to use less than 2GB of memory, so you are hitting the upper limit.
The solution is simple: recompile your code with the following compiler flags:
-mcmodel=medium -i-dynamic
The -mcmodel=medium flag will tell IFORT that you expect to use more than 2GB of statically-allocated memory in your program. However, this also requires that you link the Intel-provided libraries dynamically instead of statically (which is the default). Using the -i-dynamic flag will turn on the dynamic library linking. (Starting with GEOS-Chem v9-01-03, these compiler flags will be applied to the build sequence automatically.)
IMPORTANT NOTE! If your code links to any libraries such as HDF or netCDF, then you MUST rebuild each library, making sure that the Fortran and C compilers use the -mcmodel=medium option. Please see our Installing libraries for GEOS-Chem page for examples.
GEOS-Chem v9-01-03 and higher will automatically set these flags for you.
For more information, please see the following links:
--Bob Y. 10:45, 26 June 2012 (EDT)
You should use GEOS-Chem with IFORT 11.1.058 or higher versions. Please see the discussion below about problems in the earlier versions of IFORT 11.0.xxx:
Tzung-May Fu wrote:
===============================
GEOS-CHEM ERROR: STOP 30000
STOP at partition.f
===============================
Philippe Le Sager wrote:
Bob Yantosca wrote:
Bob Yantosca wrote:
--Bob Y. 16:50, 7 October 2009 (EDT)
Eric Sofen wrote:
--Eric Sofen 13:32, 22 October 2009
Yuxuan Wang wrote:
-cpp -w -static -fno-alias -O2 -safe_cray_ptr -no-prec-sqrt -no-prec-div -auto -noalign -convert big_endian
--Bob Y. 14:59, 4 November 2009 (EST)
Nicolas Bousserez wrote:
"OMP abort: Initializing libguide.so,
Linux node9 2.6.9-89.0.23.ELsmp #1 SMP Wed Mar 17 06:49:21 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
--Bob Y. 09:43, 8 April 2011 (EDT)
If you are using the Intel Fortran Compiler version 11, you may encounter some incompatibilities with your operating system, which might require an OS upgrade.
Nicolas Bousserez wrote:
"OMP abort: Initializing libguide.so, but we but found libguide.so already initialized".
Linux node9 2.6.9-89.0.23.ELsmp #1 SMP Wed Mar 17 06:49:21 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
Nicolas Bousserez wrote:
Linux terra-01..as.harvard.edu 2.6.18-194.3.1.el5_lustre.1.8.4 #1 SMP Fri Jul 9 21:55:24 MDT 2010 x86_64 x86_64 x86_64 GNU/Linux
Linux kvm-12.s.as.harvard.edu 2.6.18-194.32.1.el5 #1 SMP Wed Jan 5 17:52:25 EST 2011 x86_64 x86_64 x86_64 GNU/Linux
--Bob Y. 10:25, 13 April 2011 (EDT)
Prasad Kasibhatla reported an error in routine partition.f which caused a GEOS-Chem v9-01-01 execution to halt. Upon further investigation, we found that this error only occurs when compiling GEOS-Chem with the Intel Fortran Compiler version 10 (aka IFORT 10) when selecting -O2 optimization and -openmp parallelization.
We recommend that all GEOS-Chem users upgrade to Intel Fortran Compiler version 11 (aka IFORT 11). If you must use IFORT 10, then we recommend that you compile the entire code with the -O1 optimization option. For more information, please see this wiki post.
--Bob Y. 09:17, 29 June 2011 (EDT)
Please see this wiki post about problems compiling the KPP solver with IFORT 9.1.
Hyperthreading is when a job uses more threads than there are actual CPU cores. I've noticed that using 16 threads ($OMP_NUM_THREADS = 16) on an 8-core system (2 x quad core Intel Nehalem X5570's) leads to a 15% speedup over using 8 threads. These tests were with GEOS-Chem v8-02-03, full chemistry, 2x2.5, ifort 10.1.021, and
FFLAGS = -cpp -w -O3 -auto -noalign -convert big_endian -g -traceback -CB -vec-report0.
This does not have a positive impact when using earlier generations of Intel chips (Harpertown or Clovertown).
--Daven Henze 1:42, 16 December 2009 (MDT)
Special care has to be taken when passing pointer arrays or sub-fields of derived type objects to subroutines. If this is done incorrectly, it can cause a huge performance slowdown. Please see the discussion on our Passing array arguments efficiently in GEOS-Chem wiki page for full details.
--Bob Y. 10:49, 10 June 2013 (EDT)