A: MPICH2 is a freely available, portable implementation of MPI, the Standard for message-passing libraries. It implements MPI-1, MPI-2 and MPI-2.1.
A: MPI stands for Message Passing Interface. The CH comes from Chameleon, the portability layer used in the original MPICH to provide portability to the existing message-passing systems.
A: There are two common ways to use MPI with multicore processors or multiprocessor nodes:
A: MPD is the default process manager for MPICH2 on Unix platforms. It is written in Python. SMPD is the primary process manager for MPICH2 on Windows. It is also used for running on a combination of Windows and Linux machines. It is written in C.
A: No, in many cases you can build MPICH2 using one set of compilers and then use the libraries (and compilation scripts) with other compilers. However, this depends on the compilers producing compatible object files. Specifically, the compilers must
The above may seem like a stringent set of requirements, but in practice, many systems and compiler sets meet these needs, if for no other reason than that any software built with multiple libraries will have requirements similar to those of MPICH2 for compatibility.
If your compilers are completely compatible, down to the runtime libraries, you may use the compilation scripts (mpicc etc.) by either specifying the compiler on the command line, e.g.
mpicc -cc=icc -c foo.c
or with the environment variables MPICH_CC etc. (this example assumes C-shell syntax):
setenv MPICH_CC icc
mpicc -c foo.c
If the compiler is compatible except for the runtime libraries, then this same format works as long as a configuration file that describes the necessary runtime libraries is created and placed into the appropriate directory (the "sysconfdir" directory in configure terms). See the installation manual for more details.
In some cases, MPICH2 is able to build the Fortran interfaces in a way that supports multiple mappings of names from the Fortran source code to the object file. This is done by using the "multiple weak symbol" support in some environments. For example, when using gcc under Linux, this is the default.
A: You have several options. One is to use the Fortran 90 compiler for both F77 and F90. Another (if you do not need Fortran 90) is to use --disable-f90 when configuring. The options with which we test MPICH2 and the Absoft compilers are the following:
setenv FFLAGS "-f -B108"
setenv F90FLAGS "-YALL_NAMES=LCS -B108"
setenv F77 f77
setenv F90 f90
A: FD_ZERO is part of the support for the select calls (see "man select" or "man 2 select" on Linux and many other Unix systems). What this means is that your system (probably a Mac) has a broken version of the select call and related data types. This is an OS bug; the only repair is to update the OS to get past this bug. This test was added specifically to detect this error; if there was an easy way to work around it, we would have included it (we don't just implement FD_ZERO ourselves because we don't know what else is broken in this implementation of select).
If this configure works with gcc but not with xlc, then the problem is with the include files that xlc is using; since this is an OS call (even if emulated), all compilers should be using consistent if not identical include files. In this case, you may need to update xlc.
A: The g95 compiler incorrectly defines the default Fortran integer as a 64-bit integer while defining Fortran reals as 32-bit values (the Fortran standard requires that INTEGER and REAL be the same size). This was apparently done to allow a Fortran INTEGER to hold the value of a pointer, rather than requiring the programmer to select an INTEGER of a suitable KIND. To force the g95 compiler to correctly implement the Fortran standard, use the -i4 flag. For example, set the environment variable F90FLAGS before configuring MPICH2:
setenv F90FLAGS "-i4"
G95 users should note that (at this writing) there are two distributions of g95 for 64-bit Linux platforms. One uses 32-bit integers and reals (and conforms to the Fortran standard) and one uses 64-bit integers and 32-bit reals. We recommend using the one that conforms to the standard. Note that the standard specifies the ratio of sizes, not the absolute sizes, so a Fortran 95 compiler that used 64 bits for both INTEGER and REAL would also conform to the Fortran standard. However, such a compiler would need to use 128 bits for DOUBLE PRECISION quantities.
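The size rule above can be written down as a small check. This is only a sketch in C of the ratio described in this answer; the function name is made up for illustration and is not part of MPICH2 or g95.

```c
#include <stdbool.h>

/* Hypothetical helper: returns true when the sizes follow the ratio described
 * above, i.e. default INTEGER and default REAL occupy the same number of bits
 * and DOUBLE PRECISION occupies exactly twice that many. */
bool fortran_sizes_conform(int integer_bits, int real_bits, int double_bits)
{
    return integer_bits == real_bits && double_bits == 2 * real_bits;
}
```

With this check, 32/32/64 and 64/64/128 both pass, while a 64-bit INTEGER paired with a 32-bit REAL does not.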
sock.c:8:24: mpidu_sock.h: No such file or directory
In file included from sock.c:9:
../../../../include/mpiimpl.h:91:21: mpidpre.h: No such file or directory
In file included from sock.c:9:
../../../../include/mpiimpl.h:1150: error: syntax error before "MPID_VCRT"
../../../../include/mpiimpl.h:1150: warning: no semicolon at end of struct or union
A: Check whether you have set the environment variable CPPFLAGS. If so, unset it and use CXXFLAGS instead. Then rerun configure and make.
mpidu_process_locks.h:234:2: error: #error *** No atomic memory operation specified to implement busy locks ***
A: The ssm channel does not work on all platforms because it uses special interprocess locks (often written in assembly) that may not work with some compilers or machine architectures. It works on Linux with gcc, Intel, and Pathscale compilers on various Intel architectures. It also works in Windows and Solaris environments.
A: Check the output of the configure step. If configure claims that ifort is a cross compiler, the likely problem is that programs compiled and linked with ifort cannot be run because of a missing shared library. Try to compile and run the following program (named conftest.f90):
program conftest
integer, dimension(10) :: n
end
If this program fails to run, then either your installation of ifort has an error or you need to add additional values to your environment variables (such as LD_LIBRARY_PATH). Check your installation documentation for the ifort compiler. See http://softwareforums.intel.com/ISN/Community/en-US/search/SearchResults.aspx?q=libimf.so for an example of problems of this kind that users are having with version 9 of ifort.
If you do not need Fortran 90, you can configure with --disable-f90.
A: Parallel make (often invoked with make -j4) will cause several job steps in the build process to update the same library file (libmpich.a) concurrently. Unfortunately, neither the ar nor the ranlib programs correctly handle this case, and the result is a corrupted library. For now, the solution is to not use a parallel make when building MPICH2.
A: This is really a problem in the MPI-2 standard. And good or bad, the MPICH2 implementation has to adhere to it. The root cause of this error is that both stdio.h and the MPI C++ interface use SEEK_SET, SEEK_CUR, and SEEK_END. You can try adding:
#undef SEEK_SET
#undef SEEK_END
#undef SEEK_CUR
before mpi.h is included, or add the definition
-DMPICH_IGNORE_CXX_SEEK
to the command line (this will cause the MPI versions of SEEK_SET etc. to be skipped).
A: This is caused by buggy C++ compilers not implementing part of the C++ standard. To work around this problem, add the definition:
-DHAVE_NO_VARIABLE_RETURN_TYPE_SUPPORT
to the CXXFLAGS variable, or add:
#define HAVE_NO_VARIABLE_RETURN_TYPE_SUPPORT 1
before including mpi.h.
A: The specific method depends on the process manager and version of mpiexec that you are using. See the appropriate specific section.
A: By default, all the environment variables in the shell where mpiexec is run are passed to all processes of the application program. (The one exception is LD_LIBRARY_PATH when using MPD and the mpd's are being run as root.) This default can be overridden in many ways, and individual environment variables can be passed to specific processes using arguments to mpiexec. A synopsis of the possible arguments can be listed by typing:
mpiexec -help
and further details are available in the Users Guide here: http://www.mcs.anl.gov/research/projects/mpich2/documentation/index.php?s=docs.
A: Where processes run, whether by default or by specifying them yourself, depends on the process manager being used.
If you are using the gforker process manager, then all MPI processes run on the same host where you are running mpiexec.
If you are using the mpd process manager, which is the default, then many options are available. Before you run mpiexec, you will have started, or will have had started for you, a ring of processes called mpd's (multi-purpose daemons). It is likely, but not necessary, that each mpd will be running on a separate host. You can find out what this ring of hosts consists of by running the program mpdtrace. One of the mpd's will be running on the "local" machine, the one where you will run mpiexec. The default placement of MPI processes, if one runs
mpiexec -n 10 a.out
is to start the first MPI process (rank 0) on the local machine and then to distribute the rest around the mpd ring one at a time. If there are more processes than mpd's, then wraparound occurs. If there are more mpd's than MPI processes, then some mpd's will not run MPI processes. Thus any number of processes can be run on a ring of any size. While one is doing development, it is handy to run only one mpd, on the local machine. Then all the MPI processes will run locally as well.
The first modification to this default behavior is the -1 option to mpiexec (not a great argument name). If -1 is specified, as in
mpiexec -1 -n 10 a.out
then the first application process will be started by the first mpd in the ring after the local host. (If there is only one mpd in the ring, then this will be on the local host.) This option is for use when a cluster of compute nodes has a "head node" where commands like mpiexec are run but not application processes.
If an mpd is started with the --ncpus option, then when it is its turn to start a process, it will start several application processes rather than just one before handing off the task of starting more processes to the next mpd in the ring. For example, if the mpd is started with
mpd --ncpus=4
then it will start as many as four application processes, with consecutive ranks, when it is its turn to start processes. This option is for use in clusters of SMP's, when the user would like consecutive ranks to appear on the same machine. (In the default case, the same number of processes might well run on the machine, but their ranks would be different.)
(A feature of the --ncpus=[n] argument is that it has the above effect only until all of the mpd's have started n processes at a time once; afterwards each mpd starts one process at a time. This is in order to balance the number of processes per machine to the extent possible.)
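The placement rules above (default round-robin, the -1 option, and --ncpus) can be modelled in a few lines of C. This is a sketch of the behavior as described here, not code taken from MPD. The function name is hypothetical. It assumes that every mpd in the ring was started with the same --ncpus value, and that after the first pass the ring continues in the same order.

```c
/* Hypothetical model of mpd placement; not part of MPICH2.
 * Returns the position in the ring (0 = the local mpd) of the mpd that
 * starts the given rank.
 *   nmpd       number of mpd's in the ring (must be >= 1)
 *   ncpus      value passed to every mpd as --ncpus (1 models the default)
 *   skip_local nonzero models the -1 option to mpiexec */
int mpd_index_for_rank(int rank, int nmpd, int ncpus, int skip_local)
{
    int first = skip_local ? 1 : 0;   /* where rank 0 is started */
    int first_pass = nmpd * ncpus;    /* ranks placed ncpus at a time */
    int turn;

    if (rank < first_pass)
        turn = rank / ncpus;          /* consecutive ranks share one mpd */
    else
        turn = rank - first_pass;     /* afterwards, one rank per turn */

    return (first + turn) % nmpd;     /* wrap around the ring */
}
```

For a ring of two mpd's started with --ncpus=4, this model puts ranks 0 to 3 on the local mpd, ranks 4 to 7 on the next one, and then alternates between the two for ranks 8, 9, and so on.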
Other ways to control the placement of processes are by direct use of arguments to mpiexec. See the Users Guide here: http://www.mcs.anl.gov/research/projects/mpich2/documentation/index.php?s=docs
A: Output to stdout and stderr may not be written from your process immediately after a printf or fprintf (or PRINT in Fortran) because, under Unix, such output is buffered unless the program believes that the output is to a terminal. When the program is run by mpiexec, the C standard I/O library (and normally the Fortran runtime library) will buffer the output. C programmers can either call fflush(stdout) to force the output to be written or turn off buffering by calling:
#include <stdio.h>
setvbuf( stdout, NULL, _IONBF, 0 );
on each stream (stdout in this example) whose output you want sent immediately to your terminal or file.
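The two C options above can be wrapped in small helpers. This is a minimal sketch using only standard C library calls; the helper names are made up for illustration.

```c
#include <stdio.h>

/* Switch a stream to unbuffered mode. Call this before any other
 * I/O on the stream. Returns 0 on success. */
int make_unbuffered(FILE *stream)
{
    return setvbuf(stream, NULL, _IONBF, 0);
}

/* Write a message and flush it at once, for programs that prefer to keep
 * the default buffering. Returns 0 on success, -1 on error. */
int report(FILE *stream, const char *msg)
{
    if (fputs(msg, stream) == EOF)
        return -1;
    return fflush(stream) == 0 ? 0 : -1;
}
```

A program would typically call make_unbuffered(stdout) once at startup, or use report(stdout, "step done\n") for the few messages that must appear promptly.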
There is no standard way to either change the buffering mode or to flush the output in Fortran. However, many Fortrans include an extension to provide this function. For example, in g77,
call flush()
can be used. The xlf compiler supports
call flush_(6)
where the argument is the Fortran logical unit number (here 6, which is often the unit number associated with PRINT). With the G95 Fortran 95 compiler, set the environment variable G95_UNBUFFERED_6 to cause output to unit 6 to be unbuffered.
A: By default, g95 does not flush output to stdout. This also appears to cause problems for standard input. If you are using the Fortran logical units 5 and 6 (or the * unit) for standard input and output, set the environment variable G95_UNBUFFERED_6 to yes.
A: To run MPI programs in the background when using MPD, you need to redirect stdin from /dev/null. For example:
mpiexec -n 4 a.out < /dev/null &
A: To use MPICH2 with slurm, you have to configure MPICH2 with the following options:
./configure --with-pmi=slurm --with-pm=no
In addition, if your slurm installation is not in the default location, you will need to pass the actual installation location using:
./configure --with-pmi=slurm --with-pm=no --with-slurm=[path_to_slurm_install]
A: This problem occurs when there is a mismatch between the process manager (PM) used and the process management interface (PMI) with which the MPI application is compiled.
MPI applications use process managers to launch them as well as get information such as their rank, the size of the job, etc. MPICH2 specified an interface called the process management interface (PMI) that is a set of functions that MPICH2 internals (or the internals of other parallel programming models) can use to get such information from the process manager. However, this specification did not include a wire protocol, i.e., how the client-side part of the PMI would talk to the process manager. Thus, many groups implemented their own PMI library in ways that were not compatible with each other with respect to the wire protocol (the interface is still common and as specified). Some examples of PMI library implementations are: (a) simple PMI (MPICH2's default PMI library), (b) smpd PMI (for linux/windows compatibility; will be deprecated soon) and (c) slurm PMI (implemented by the slurm guys).
MPD, Gforker, Remshell, Hydra, OSC mpiexec, OSU mpirun and probably many other process managers use the simple PMI wire protocol. So, as long as the MPI application is linked with the simple PMI library, you can use any of these process managers interchangeably. Simple PMI library is what you are linked to by default when you build MPICH2 using the default options.
srun uses slurm PMI. When you configure MPICH2 using --with-pmi=slurm, it links with the slurm PMI library. Only srun is compatible with this slurm PMI library, so only that can be used. The slurm folks came out with their own "mpiexec" executable, which essentially wraps around srun, so that uses the slurm PMI as well.
So, in some sense, mpiexec or srun is just a user interface for you to talk in the appropriate PMI wire protocol. If you have a mismatch, the MPI processes will not be able to detect their rank, the job size, etc., so all processes think they are rank 0.
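The "everyone is rank 0" symptom can be pictured with a small sketch. It is an illustration only, not MPICH2's PMI code: the real exchange happens over the PMI wire protocol, and the environment variable names PMI_RANK and PMI_SIZE mentioned below are assumptions about what a simple-PMI launcher provides. The helper name is hypothetical.

```c
#include <stdlib.h>

/* Hypothetical helper: interpret a value supplied by the launcher.
 * Returns the fallback when the value is missing or malformed, which is
 * what a process sees when it cannot talk to its process manager. */
int int_from_launcher(const char *value, int fallback)
{
    char *end;
    long n;

    if (value == NULL || *value == '\0')
        return fallback;
    n = strtol(value, &end, 10);
    if (*end != '\0' || n < 0)
        return fallback;
    return (int)n;
}
```

A process would use it as int_from_launcher(getenv("PMI_RANK"), 0) and int_from_launcher(getenv("PMI_SIZE"), 1). When the launcher speaks a different protocol, neither value arrives, so every process falls back to rank 0 in a job of size 1.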
A: The MPICH_PORT_RANGE environment variable allows you to specify the range of TCP ports to be used by the process manager and the MPICH2 library. Set this variable before starting your application with mpiexec. The format of this variable is <low>:<high>. For example, to allow the job launcher and MPICH2 to use ports only between 10000 and 10100, if you're using the bash shell, you would use:
export MPICH_PORT_RANGE=10000:10100
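To make the <low>:<high> format concrete, here is a sketch of a validator for such a value. It is not MPICH2's own parser; the function name and the exact validity rules (ports between 1 and 65535, low not greater than high) are assumptions chosen for illustration.

```c
#include <stdio.h>

/* Hypothetical validator for a "<low>:<high>" port range.
 * Returns 0 and stores the bounds on success; returns -1 if the text is
 * missing, malformed, outside 1..65535, or has low > high. */
int parse_port_range(const char *text, int *low, int *high)
{
    int lo, hi;
    char extra;

    if (text == NULL)
        return -1;
    /* Exactly two integers separated by ':' and nothing after them. */
    if (sscanf(text, "%d:%d%c", &lo, &hi, &extra) != 2)
        return -1;
    if (lo < 1 || hi > 65535 || lo > hi)
        return -1;
    *low = lo;
    *high = hi;
    return 0;
}
```

A wrapper script could run such a check on getenv("MPICH_PORT_RANGE") before calling mpiexec, to catch a reversed or truncated range early.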
A: The default channel in MPICH2 (starting with the 1.1 series) is ch3:nemesis. This channel uses busy polling in order to improve intranode shared-memory communication performance. The downside to this is that performance will generally take a dramatic hit if you oversubscribe your nodes. Oversubscription is the case where you run more processes on a node than there are cores on the node. In this scenario, you have a few choices:
Configure MPICH2 with --with-device=ch3:sock. This will use the older ch3:sock channel that does not busy poll. This channel will be slower for intra-node communication, but it will perform much better in the oversubscription scenario.
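Whether a node is oversubscribed in the sense defined above is a simple comparison. The sketch below is not part of MPICH2, and the function names are hypothetical. It uses sysconf(_SC_NPROCESSORS_ONLN), which is available on Linux and many other Unix systems but is an extension rather than strict POSIX.

```c
#include <unistd.h>

/* Number of cores currently online on this node, or -1 if unknown. */
long online_cores(void)
{
    return sysconf(_SC_NPROCESSORS_ONLN);
}

/* Nonzero when more processes are placed on a node than it has cores.
 * An unknown core count (<= 0) is treated as "cannot tell", not as
 * oversubscribed. */
int is_oversubscribed(long procs_on_node, long cores)
{
    return cores > 0 && procs_on_node > cores;
}
```

For example, is_oversubscribed(8, online_cores()) reports whether starting eight processes on the current node would exceed its core count.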