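First, prepare two machines, for example test1 and test2, and set them up so that each can log in to the other over SSH without a password (see "Linux SSH passwordless login").
Edit the /etc/hosts file on both machines and add an entry for each machine, for example: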
172.17.0.2 test1
172.17.0.3 test2
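As a quick sanity check, both names should be present in the hosts file on each machine. The following is a minimal sketch, assuming the names test1 and test2 from the entries above; check_hosts_entries is a hypothetical helper written for this article, not part of any tool.

```shell
#!/bin/sh
# check_hosts_entries FILE NAME...
# Hypothetical helper: report whether each NAME appears as a hostname
# (any field after the address) in a hosts-format FILE.
check_hosts_entries() {
    file=$1
    shift
    missing=0
    for name in "$@"; do
        if awk -v n="$name" '
            /^[ \t]*#/ { next }              # skip comment lines
            {
                for (i = 2; i <= NF; i++) {
                    if ($i ~ /^#/) break     # stop at a trailing comment
                    if ($i == n) found = 1
                }
            }
            END { exit !found }' "$file"; then
            echo "$name: ok"
        else
            echo "$name: missing from $file"
            missing=1
        fi
    done
    return "$missing"
}
```

Usage on each machine would be: check_hosts_entries /etc/hosts test1 test2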
$ wget -c https://www.open-mpi.org/software/ompi/v1.10/downloads/openmpi-1.10.3.tar.gz
$ tar zxvf openmpi-1.10.3.tar.gz
$ cd openmpi-1.10.3
$ ./configure --prefix=/opt/openmpi
$ make
$ sudo make install
Run the following commands by hand:
$ export PATH=$PATH:/opt/openmpi/bin
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/openmpi/lib/
Also add the same settings to ~/.bashrc, so that the environment is set up automatically when mpiexec starts processes on the remote machine:
PATH=$PATH:/opt/openmpi/bin
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/openmpi/lib/
export PATH LD_LIBRARY_PATH
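The ~/.bashrc change can also be scripted so that it is safe to run more than once. The following is a minimal sketch: add_openmpi_env is a hypothetical helper, and /opt/openmpi is the prefix used in the configure step above. On some distributions the stock ~/.bashrc returns early for non-interactive shells, so this sketch puts the lines at the top of the file rather than at the end.

```shell
#!/bin/sh
# add_openmpi_env RCFILE PREFIX
# Hypothetical helper: put the Open MPI export lines at the top of RCFILE,
# unless a line mentioning PREFIX/bin is already present.
add_openmpi_env() {
    rcfile=$1
    prefix=$2
    if [ -f "$rcfile" ] && grep -qF "$prefix/bin" "$rcfile"; then
        echo "already configured: $rcfile"
        return 0
    fi
    tmp=$(mktemp) || return 1
    {
        echo "export PATH=\$PATH:$prefix/bin"
        echo "export LD_LIBRARY_PATH=\$LD_LIBRARY_PATH:$prefix/lib"
        if [ -f "$rcfile" ]; then cat "$rcfile"; fi
    } > "$tmp"
    # Copy back with cat rather than mv so the rc file keeps its permissions.
    cat "$tmp" > "$rcfile"
    rm -f "$tmp"
    echo "updated: $rcfile"
}
```

Usage on each machine would be: add_openmpi_env ~/.bashrc /opt/openmpi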
$ cd examples    # inside the openmpi-1.10.3 source directory
$ make
$ mpirun -np 10 hello_c
$ mpirun -np 10 ring_c
$ mpirun -np 3 printenv
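First create a list of the cluster machines and specify the number of slots for each one; for example, a file named hosts with the following content: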
test1 slots=2
test2 slots=2
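Before picking a value for -np, it can help to know how many slots the hostfile offers in total. The following is a minimal sketch: total_slots is a hypothetical helper, and counting a host without a slots= field as 1 is an assumption of this sketch, since the actual default may differ between Open MPI versions.

```shell
#!/bin/sh
# total_slots HOSTFILE
# Hypothetical helper: sum the slots=N values in an Open MPI hostfile.
# Assumption: a host listed without slots= counts as 1.
total_slots() {
    awk '
        /^[ \t]*#/ || NF == 0 { next }   # skip comments and blank lines
        {
            n = 1
            for (i = 2; i <= NF; i++)
                if ($i ~ /^slots=[0-9]+$/) { split($i, kv, "="); n = kv[2] }
            total += n
        }
        END { print total + 0 }' "$1"
}
```

For the two-line file above, total_slots hosts prints 4, which matches the -np 4 used in the job below. Asking for more processes than there are slots may be refused or may oversubscribe the nodes, depending on the Open MPI version.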
Run the MPI job:
$ mpiexec --hostfile hosts -np 4 hello_c
Hello, world, I am 1 of 4, (Open MPI v1.10.3, package: Open MPI jhadmin@test1 Distribution, ident: 1.10.3, repo rev: v1.10.2-251-g9acf492, Jun 14, 2016, 124)
Hello, world, I am 0 of 4, (Open MPI v1.10.3, package: Open MPI jhadmin@test1 Distribution, ident: 1.10.3, repo rev: v1.10.2-251-g9acf492, Jun 14, 2016, 124)
Hello, world, I am 2 of 4, (Open MPI v1.10.3, package: Open MPI jhadmin@test2 Distribution, ident: 1.10.3, repo rev: v1.10.2-251-g9acf492, Jun 14, 2016, 124)
Hello, world, I am 3 of 4, (Open MPI v1.10.3, package: Open MPI jhadmin@test2 Distribution, ident: 1.10.3, repo rev: v1.10.2-251-g9acf492, Jun 14, 2016, 124)
Running "mpiexec --hostfile hosts -np 4 hello_c" may fail with the following error:
bash: orted: command not found
--------------------------------------------------------------------------
ORTE was unable to reliably start one or more daemons.
This usually is caused by:
* not finding the required libraries and/or binaries on
one or more nodes. Please check your PATH and LD_LIBRARY_PATH
settings, or configure OMPI with --enable-orterun-prefix-by-default
* lack of authority to execute on one or more specified nodes.
Please verify your allocation and authorities.
* the inability to write startup files into /tmp (--tmpdir/orte_tmpdir_base).
Please check with your sys admin to determine the correct location to use.
* compilation of the orted with dynamic libraries when static are required
(e.g., on Cray). Please check your configure cmd line and consider using
one of the contrib/platform definitions for your system type.
* an inability to create a connection back to mpirun due to a
lack of common network interfaces and/or no route found between
them. Please check network connectivity (including firewalls
and network routing requirements).
--------------------------------------------------------------------------
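The following command shows the environment that the processes see on each node, which helps confirm whether the environment variables are set correctly. In my case, the error above was caused by wrong PATH and LD_LIBRARY_PATH settings.
$ mpiexec --hostfile hosts -np 4 printenv
Check ~/.bashrc and set the correct paths there.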
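If mpiexec cannot start the remote daemons at all, the printenv run may fail in the same way, so it can be useful to look for orted directly over SSH, which goes through the same non-interactive shell startup. The following is a minimal sketch: check_remote_orted is a hypothetical helper, and it assumes the passwordless SSH login set up at the beginning.

```shell
#!/bin/sh
# check_remote_orted HOST...
# Hypothetical helper: ask each HOST, over a non-interactive SSH session,
# where orted is. An empty answer suggests that the remote PATH does not
# include the Open MPI bin directory.
check_remote_orted() {
    failed=0
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        location=$(ssh -o BatchMode=yes "$host" 'command -v orted' 2>/dev/null)
        if [ -n "$location" ]; then
            echo "$host: orted at $location"
        else
            echo "$host: orted not found (check PATH in ~/.bashrc)"
            failed=1
        fi
    done
    return "$failed"
}
```

Usage would be: check_remote_orted test1 test2
Open MPI's mpirun also accepts a --prefix option naming the installation directory on the remote nodes (for example --prefix /opt/openmpi), which may be an alternative to relying on the shell startup files.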
When reposting, please credit this article with a link.
Article link: http://blog.csdn.net/kongxx/article/details/52227572