Open MPI Usage Guide

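Checking Your Open MPI Installation
The "ompi_info" command can be used to check the status of your Open MPI installation. It prints a summary of information about the installation.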

Note that the ompi_info command is extremely helpful in determining which components are installed as well as listing all the run-time settable parameters that are available in each component (as well as their default values).

The following options may be helpful:

--all       Show a *lot* of information about your Open MPI installation.

--parsable Display all the information in an easily grep/cut/awk/sed-able format.

--param <framework> <component>

            A <framework> of "all" and a <component> of "all" will show all parameters to all components. Otherwise, the parameters of all the components in a specific framework, or just the parameters of a specific component, can be displayed by using an appropriate <framework> and/or <component> name.

Changing the values of these parameters is explained in the "The Modular Component Architecture (MCA)" section, below.

Compiling Open MPI Applications
Open MPI provides "wrapper" compilers that can be used to compile MPI applications:

C:          mpicc

C++:        mpiCC (or mpic++ if your filesystem is case-insensitive)

Fortran 77: mpif77

Fortran 90: mpif90

For example:

shell$ mpicc hello_world_mpi.c -o hello_world_mpi -g

All the wrapper compilers do is add a variety of compiler and linker flags to the command line and then invoke a back-end compiler. To be specific: the wrapper compilers do not parse source code at all; they are solely command-line manipulators, and have nothing to do with the actual compilation or linking of programs. The end result is an MPI executable that is properly linked to all the relevant libraries.
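To make the "command-line manipulator" idea concrete, here is a minimal Python sketch of what such a wrapper does. It is an illustration only: the function name build_backend_command, the back-end compiler gcc, the install prefix /opt/openmpi, and the specific flags are all assumptions, not what any particular installation uses. The real flags depend on how Open MPI was configured; running a wrapper with --showme (for example "mpicc --showme") prints the actual command line.

```python
def build_backend_command(user_args, backend="gcc", prefix="/opt/openmpi"):
    """Return the back-end compiler command for one wrapper invocation.

    All defaults are hypothetical; a real wrapper takes them from the
    Open MPI build configuration.
    """
    compile_flags = ["-I" + prefix + "/include"]
    link_flags = ["-L" + prefix + "/lib", "-lmpi"]
    # The user's arguments are passed through untouched. The source
    # files named in them are never opened or parsed here.
    return [backend] + compile_flags + list(user_args) + link_flags


if __name__ == "__main__":
    cmd = build_backend_command(
        ["hello_world_mpi.c", "-o", "hello_world_mpi", "-g"]
    )
    print(" ".join(cmd))
```

The sketch only rearranges strings, which is the point: everything after that is done by the back-end compiler and linker.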

Running Open MPI Applications

Open MPI provides both the mpirun and mpiexec commands (they are identical). For example:

shell$ mpirun -np 2 hello_world_mpi

shell$ mpiexec -np 1 hello_world_mpi : -np 1 hello_world_mpi

The two commands above are equivalent. Some of mpiexec's switches (such as -host and -arch) are not yet functional, although they will not cause an error if you use them.

The rsh launcher accepts a -hostfile parameter (the option "-machinefile" is equivalent). Use it to specify a standard mpirun-style hostfile (one hostname per line):

shell$ mpirun -hostfile my_hostfile -np 2 hello_world_mpi

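If you want to run multiple processes on a node, the hostfile can use the "slots" attribute. If "slots" is not specified, a count of 1 is assumed. For example, using the following hostfile: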

---------------------------------------------------------------------------

node1.example.com

node2.example.com

node3.example.com slots=2

node4.example.com slots=4

---------------------------------------------------------------------------

shell$ mpirun -hostfile my_hostfile -np 8 hello_world_mpi

will launch MPI_COMM_WORLD rank 0 on node1, rank 1 on node2, ranks 2 and 3 on node3, and ranks 4 through 7 on node4.
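The placement described above can be reproduced with a small Python sketch. It is a simplified model of slot-by-slot filling, not Open MPI's actual mapping code. The function names parse_hostfile and place_ranks are made up for this example. Raising an error when -np exceeds the total slot count is a simplification of the sketch, not a statement about what mpirun does in that case.

```python
def parse_hostfile(text):
    """Parse hostfile text into a list of (hostname, slots) pairs."""
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        slots = 1  # default when no "slots" attribute is given
        for field in fields[1:]:
            key, _, value = field.partition("=")
            if key == "slots":
                slots = int(value)
        hosts.append((fields[0], slots))
    return hosts


def place_ranks(hosts, np):
    """Assign ranks 0..np-1 to hosts, filling each host's slots in order."""
    placement = {}
    rank = 0
    for name, slots in hosts:
        for _ in range(slots):
            if rank == np:
                return placement
            placement[rank] = name
            rank += 1
    if rank < np:
        # Simplification for this sketch only.
        raise ValueError("more processes requested than slots available")
    return placement


if __name__ == "__main__":
    hostfile = """\
node1.example.com
node2.example.com
node3.example.com slots=2
node4.example.com slots=4
"""
    for rank, host in place_ranks(parse_hostfile(hostfile), 8).items():
        print(rank, host)
```

With the hostfile shown earlier and -np 8, the sketch puts rank 0 on node1, rank 1 on node2, ranks 2 and 3 on node3, and ranks 4 through 7 on node4, matching the description.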

Other starters, such as the batch scheduling environments, do not require hostfiles (and will ignore the hostfile if it is supplied). They will also launch as many processes as slots have been allocated by the scheduler if no "-np" argument has been provided. For example, running an interactive SLURM job with 8 processors:

shell$ srun -n 8 -A

shell$ mpirun a.out

The above command will launch 8 copies of a.out in a single MPI_COMM_WORLD on the processors that were allocated by SLURM.

Note that the values of component parameters can be changed on the mpirun / mpiexec command line. This is explained in the section below, "The Modular Component Architecture (MCA)".

The Modular Component Architecture (MCA)
The MCA is the backbone of Open MPI -- most services and functionality are implemented through MCA components. Here is a list of all the component frameworks in Open MPI:

---------------------------------------------------------------------------

MPI component frameworks:

-------------------------

allocator - Memory allocator

bml       - BTL management layer

btl       - MPI point-to-point byte transfer layer, used for MPI point-to-point messages on some types of networks

coll      - MPI collective algorithms

io        - MPI-2 I/O

mpool     - Memory pooling

mtl       - Matching transport layer, used for MPI point-to-point messages on some types of networks

osc       - MPI-2 one-sided communications

pml       - MPI point-to-point management layer

rcache    - Memory registration cache

topo      - MPI topology routines

Back-end run-time environment component frameworks:

---------------------------------------------------

errmgr    - RTE error manager

gpr       - General purpose registry

iof       - I/O forwarding

ns        - Name server

odls      - OpenRTE daemon local launch subsystem

oob       - Out of band messaging

pls       - Process launch system

ras       - Resource allocation system

rds       - Resource discovery system

rmaps     - Resource mapping system

rmgr      - Resource manager

rml       - RTE message layer

schema    - Name schemas

sds       - Startup / discovery service

smr       - State-of-health monitoring subsystem

Miscellaneous frameworks:

-------------------------

backtrace - Debugging call stack backtrace support

maffinity - Memory affinity

memory    - Memory subsystem hooks

memcpy    - Memory copy support

memory    - Memory management hooks

paffinity - Processor affinity

timer     - High-resolution timers

---------------------------------------------------------------------------

Each framework typically has one or more components that are used at run-time. For example, the btl framework is used by MPI to send bytes across underlying networks. The tcp btl, for example, sends messages across TCP-based networks; the gm btl sends messages across GM Myrinet-based networks.

Each component typically has some tunable parameters that can be changed at run-time. Use the ompi_info command to check a component to see what its tunable parameters are. For example:

shell$ ompi_info --param btl tcp

shows all the parameters (and default values) for the tcp btl component.

These values can be overridden at run-time in several ways. At run-time, the following locations are examined (in order) for new values of parameters:

1. /etc/openmpi-mca-params.conf

This file is intended to set any system-wide default MCA parameter values -- it will apply, by default, to all users who use this Open MPI installation. The default file that is installed contains many comments explaining its format.

2. $HOME/.openmpi/mca-params.conf

If this file exists, it should be in the same format as /etc/openmpi-mca-params.conf. It is intended to provide per-user default parameter values.

3. environment variables of the form OMPI_MCA_<name> set equal to a <value>

Where <name> is the name of the parameter. For example, set the variable named OMPI_MCA_btl_tcp_frag_size to the value 65536 (Bourne-style shells):

   shell$ OMPI_MCA_btl_tcp_frag_size=65536

   shell$ export OMPI_MCA_btl_tcp_frag_size

4. the mpirun command line: --mca <name> <value>

   Where <name> is the name of the parameter. For example:

   shell$ mpirun --mca btl_tcp_frag_size 65536 -np 2 hello_world_mpi

These locations are checked in order. For example, a parameter value passed on the mpirun command line will override an environment variable; an environment variable will override the system-wide defaults.
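The override order can be modeled in a few lines of Python. This is a sketch under stated assumptions, not Open MPI's implementation: the function names parse_params_file and resolve_param are invented here, and the parser assumes the parameter files hold simple "name = value" lines with "#" comments. A result of (None, None) stands for "no override found, so the component's built-in default applies".

```python
def parse_params_file(text):
    """Parse "name = value" lines; '#' starts a comment (assumed format)."""
    params = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        name, _, value = line.partition("=")
        params[name.strip()] = value.strip()
    return params


def resolve_param(name, system_params, user_params, environ, cli_params):
    """Return (value, source) for an MCA parameter, or (None, None)."""
    sources = [
        ("system file", system_params.get(name)),
        ("user file", user_params.get(name)),
        ("environment", environ.get("OMPI_MCA_" + name)),
        ("command line", cli_params.get(name)),
    ]
    value, source = None, None
    # The list is in the order the locations are examined, so a later
    # source overrides an earlier one.
    for label, candidate in sources:
        if candidate is not None:
            value, source = candidate, label
    return value, source


if __name__ == "__main__":
    system = parse_params_file("# site default\nbtl_tcp_frag_size = 32768\n")
    env = {"OMPI_MCA_btl_tcp_frag_size": "65536"}
    print(resolve_param("btl_tcp_frag_size", system, {}, env, {}))
```

In the demo the environment variable wins over the system-wide file; passing a non-empty cli_params dictionary would override both.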

 

---------------------------------------------------------------------------

1. Running mpirun produces the following error:
duanple@node1:~/project/test$ mpirun -hostfile hosts -np 2 ./a.out
bash: orted: command not found
--------------------------------------------------------------------------
A daemon (pid 27974) died unexpectedly with status 127 while attempting
to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
--------------------------------------------------------------------------
mpirun: clean termination accomplished

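Solution: PATH and LD_LIBRARY_PATH were already set in /etc/profile, but the job only ran correctly after they were also set in ~/.bashrc. The likely reason is that mpirun starts orted on the remote nodes through a non-interactive ssh session, and such a shell does not read /etc/profile.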
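A quick way to confirm this kind of problem is to test whether the Open MPI executables can be found in a given PATH string. The Python sketch below does that with the standard library's shutil.which. The function name missing_commands is made up for this example, and the sketch only inspects the PATH string it is given; it does not contact any remote node.

```python
import os
import shutil


def missing_commands(commands, path):
    """Return the commands that cannot be found in the given PATH string."""
    return [cmd for cmd in commands if shutil.which(cmd, path=path) is None]


if __name__ == "__main__":
    # Check the PATH of the current process. To check a remote node,
    # pass in the PATH string reported by that node instead.
    current_path = os.environ.get("PATH", "")
    print(missing_commands(["orted", "mpirun"], current_path))
```

To see the PATH that a non-interactive remote shell actually gets, one option is to run something like ssh node2 'echo $PATH' and pass that string to the function. If orted is reported missing there, the "orted: command not found" failure above is expected.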

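2. Of the three nodes running Open MPI, two are 32-bit x86 nodes and one is an x64 node, and the run fails with an error.
The message says: reconfigure Open MPI with --enable-heterogeneous.
In other words, a default Open MPI build does not support heterogeneous nodes; this option must be enabled if such support is needed. For the configure options, see the README file in the openmpi-1.4.1 directory.

Open MPI therefore has to be reconfigured and reinstalled, using the following commands: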

cd /home/duanple/mpi/openmpi && rm -rf *
cd /home/duanple/project/install/openmpi-1.4.1 && ./configure --prefix=/home/duanple/mpi/openmpi --enable-heterogeneous --enable-sparse-groups --enable-static
make all install

scp mpi.c client:/home/duanple/project/test;scp mpi.c node2:/home/duanple/project/test
cd /home/duanple/project/test
mpirun -hostfile hosts -np 3 ./a.out

3. Running service sshd restart or ifconfig always gives the message: command not found.
The cause is that /sbin is not in PATH. Add it in /etc/profile or ~/.bashrc, then source the file.
