Problems Encountered When Setting Up MPI Parallel Computing, and Their Solutions


 

1,[root@localhost ~]# mpdtrace
configuration file /etc/mpd.conf is accessible by others
change permissions to allow read and write access only by you

Solution:

[root@localhost ~]# chmod 600 /etc/mpd.conf
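
The permission fix can also be wrapped in a small check that runs before mpd is started. This is a minimal sketch, assuming GNU coreutils `stat` (the `-c '%a'` format); `ensure_private` is a hypothetical helper name, not part of MPICH.

```shell
# ensure_private FILE: set FILE to mode 600 unless it already has that mode,
# then print the resulting mode.
# Assumes GNU stat; ensure_private is a hypothetical helper name.
ensure_private() {
    file=$1
    mode=$(stat -c '%a' "$file") || return 1
    if [ "$mode" != "600" ]; then
        chmod 600 "$file" || return 1
    fi
    stat -c '%a' "$file"
}
```

For the case above it would be called as `ensure_private /etc/mpd.conf`.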

 

 

2,[root@localhost ~]# mpdboot -n 1 -f mpd.hosts
mpdboot_localhost.localdomain (handle_mpd_output 414): from mpd on localhost.localdomain, invalid port info:
no_port

Solution:

This is caused by incorrect permissions on mpd.conf and related files. They must be set to mode 600, as in item 1.

 

3,[root@localhost ~]# mpdtrace
mpdroot: perror msg: No such file or directory
mpdroot: cannot connect to local mpd at: /tmp/mpd2.console_root
    probable cause:  no mpd daemon on this machine
    possible cause:  unix socket /tmp/mpd2.console_root has been removed
mpdtrace (__init__ 1204): forked process failed; status=255

Solution:

The mpd daemon has not been started. Start it with: mpdboot -n 1 -f mpd.hosts
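
The error message names the console socket that mpdtrace tries to reach, so a quick test for that socket can tell whether mpdboot needs to be run first. This is only a sketch: `mpd_console_present` is a hypothetical helper, and a socket file that exists does not prove the daemon behind it is still alive.

```shell
# mpd_console_present PATH: succeed only if PATH exists and is a Unix socket.
# Hypothetical helper; the socket path comes from the error message above.
mpd_console_present() {
    [ -S "$1" ]
}
```

A possible use is `mpd_console_present /tmp/mpd2.console_root || mpdboot -n 1 -f mpd.hosts`.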

 

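4,During testing, an mpd process often fails to connect to, or communicate with, a particular node. When this happens, first check whether mpd can be started on that node by itself. If it can, the problem usually lies in the firewall configuration.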
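
One way to test the firewall hypothesis is to probe the node's mpd port from another node. The sketch below assumes bash (for its `/dev/tcp` pseudo-device) and the coreutils `timeout` command are available; `port_open` is a hypothetical helper, and the host and port must be replaced with the node and the port that mpd is actually listening on.

```shell
# port_open HOST PORT: succeed if a TCP connection to HOST:PORT can be
# opened within 3 seconds. Hypothetical helper; assumes bash and timeout.
port_open() {
    # Inside bash -c, $0 is HOST and $1 is PORT.
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}
```

If mpd starts correctly on the node by itself but `port_open` fails from the other nodes, a firewall rule between them is the likely culprit.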

 

5,[root@localhost examples]# mpiexec -n 5 ./cpi
mpiexec_localhost.localdomain (mpiexec 392): no msg recvd from mpd when expecting ack of request
[root@localhost examples]# mpiexec -n 5 ./cpi
Process 3 of 5 is on localhost.localdomain
Process 4 of 5 is on localhost.localdomain
Process 0 of 5 is on localhost.localdomain
Process 1 of 5 is on localhost.localdomain
Process 2 of 5 is on localhost.localdomain
pi is approximately 3.1415926544231230, Error is 0.0000000008333298
wall clock time = 0.005338
[root@localhost examples]# 

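Solution: The cause is unclear, possibly the machine being busy. The same command fails intermittently and succeeds when rerun, as the second invocation above shows.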
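
Since rerunning the command was enough here, the retry can be scripted. This is a minimal sketch with a hypothetical `retry` helper. It assumes mpiexec exits with a nonzero status when this error occurs, which the transcript above does not confirm.

```shell
# retry MAX DELAY CMD...: run CMD up to MAX times, sleeping DELAY seconds
# after each failed attempt. Returns 0 on the first success, 1 otherwise.
# Hypothetical helper, not part of MPICH.
retry() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while :; do
        "$@" && return 0
        [ "$attempt" -ge "$max" ] && return 1
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}
```

For example, `retry 3 2 mpiexec -n 5 ./cpi` would try the run up to three times with a two-second pause between attempts.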

