Open MPI error: Node names must be composed of a combination of ascii letters, digits, dots, and hyphen char

While reproducing a model's results today, I ran into a problem where Open MPI would not run because the node name contained an underscore ("_"). The error message was:

------------------------------------------------------------------------
While trying to create a regular expression of the node names
used in this application, the regex parser has detected the
presence of an illegal character in the following node name:
node: abcd_12
Node names must be composed of a combination of ascii letters,
digits, dots, and the hyphen ('-') character. See the following
for an explanation:
https://en.wikipedia.org/wiki/Hostname
Please correct the error and try again.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
An internal error has occurred in ORTE:
[[30525,0],0] FORCE-TERMINATE AT (null):1 - error base/plm_base_launch_support.c(555)
This is something that should be reported to the developers.
--------------------------------------------------------------------------

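According to the message, Open MPI refuses to run because the node name contains an underscore ("_"). The fixes I found online all involve changing the hostname, which did not seem practical in my case.
I finally found a workaround in issue 9321 on the official GitHub repository: set the environment variable OMPI_MCA_regx to naive: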

$ export OMPI_MCA_regx=naive

With that, the problem was solved!
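If the same launch script runs on several machines, the workaround can be applied only where it is needed. The sketch below is an illustration, not part of the original fix: the function names `has_illegal_char` and `set_regx_if_needed` are hypothetical, and the check simply mirrors the character set listed in the error message (ASCII letters, digits, dots, hyphens).

```shell
#!/usr/bin/env bash
# Return 0 if the given node name contains any character outside
# ASCII letters, digits, dots, and hyphens (the set named in the error).
has_illegal_char() {
    case "$1" in
        *[!A-Za-z0-9.-]*) return 0 ;;
        *) return 1 ;;
    esac
}

# Export the workaround only when the node name needs it.
# With no argument, the name reported by `hostname` is checked.
set_regx_if_needed() {
    local node="${1:-$(hostname)}"
    if has_illegal_char "$node"; then
        export OMPI_MCA_regx=naive
    fi
}
```

Calling `set_regx_if_needed` before `mpirun` leaves the environment untouched on hosts whose names are already valid. Since `OMPI_MCA_*` variables are the environment form of MCA parameters, passing `--mca regx naive` to `mpirun` should have the same effect for a single run.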

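In addition, if you want this setting to always apply inside a particular virtual (conda) environment, you can set the variable for that environment in advance: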

1) In the etc/conda/activate.d folder under the environment's path, create activate.sh:

cd /root/miniconda3/envs//etc/conda/activate.d
vim activate.sh
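The hook folders do not necessarily exist in a fresh environment, in which case the `cd` above fails. A minimal sketch for creating them, assuming the environment prefix is either passed explicitly or taken from `CONDA_PREFIX` (which conda sets while an environment is active); the function name `make_hook_dirs` is hypothetical:

```shell
# Create the activate.d and deactivate.d hook folders under an
# environment prefix. Falls back to $CONDA_PREFIX if no prefix is given.
make_hook_dirs() {
    prefix="${1:-$CONDA_PREFIX}"
    if [ -z "$prefix" ]; then
        echo "no environment prefix given and CONDA_PREFIX is unset" >&2
        return 1
    fi
    # -p creates missing parent folders and is a no-op if they exist.
    mkdir -p "$prefix/etc/conda/activate.d" "$prefix/etc/conda/deactivate.d"
}
```

Running `make_hook_dirs` with the target environment activated prepares both folders used in steps 1 and 2.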

Write the environment variable into it:

ORIGINAL_OMPI_MCA_regx=$OMPI_MCA_regx
export OMPI_MCA_regx=naive

2) In the etc/conda/deactivate.d folder under the environment's path, create deactivate.sh:

cd /root/miniconda3/envs//etc/conda/deactivate.d
vim deactivate.sh

Reset the environment variable in it:

export OMPI_MCA_regx=$ORIGINAL_OMPI_MCA_regx
unset ORIGINAL_OMPI_MCA_regx
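One caveat with the two scripts above: if OMPI_MCA_regx was not set before activation, deactivation leaves it exported with an empty value rather than removing it. A slightly more defensive variant might look like the sketch below. The function names `regx_activate` and `regx_deactivate` are hypothetical wrappers used only for illustration; their bodies are what would go into activate.sh and deactivate.sh respectively.

```shell
# Body for activate.sh: remember the old value only if one was set.
regx_activate() {
    if [ -n "${OMPI_MCA_regx+x}" ]; then
        ORIGINAL_OMPI_MCA_regx="$OMPI_MCA_regx"
    fi
    export OMPI_MCA_regx=naive
}

# Body for deactivate.sh: restore the old value, or unset the
# variable entirely if it had no value before activation.
regx_deactivate() {
    if [ -n "${ORIGINAL_OMPI_MCA_regx+x}" ]; then
        export OMPI_MCA_regx="$ORIGINAL_OMPI_MCA_regx"
    else
        unset OMPI_MCA_regx
    fi
    unset ORIGINAL_OMPI_MCA_regx
}
```

The `${VAR+x}` expansion distinguishes "unset" from "set but empty", so a pre-existing value (even an empty one) survives an activate/deactivate round trip.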
