WIEN2k Installation Manual and Benchmark v1.0

 

I. Background

Overview:

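WIEN2k computes the electronic structure of solids using density functional theory (DFT). It is based on the full-potential (linearized) augmented plane-wave ((L)APW) + local orbitals (lo) method, one of the most accurate schemes for band-structure calculations. Within DFT, either the local (spin) density approximation (LDA) or the generalized gradient approximation (GGA) can be used. WIEN2k is an all-electron scheme and includes relativistic effects.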

Features:

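Calculates properties of solids: energy bands and density of states; electron densities and spin densities; X-ray structure factors; Bader's "atoms in molecules" concept; total energy, forces, equilibrium geometries, structure optimization, and molecular dynamics; electric field gradients, isomer shifts, and hyperfine fields; spin polarization (ferromagnetic and antiferromagnetic structures) and spin-orbit coupling; X-ray emission and absorption spectra and electron energy loss spectra; optical properties of solids; Fermi surfaces. Supported functionals and corrections: LDA, GGA, meta-GGA, LDA+U, and orbital polarization. Centrosymmetric and non-centrosymmetric lattices are handled, with all 230 space groups built in. A graphical user interface and a user's guide are provided. The user-friendly environment w2web (WIEN to WEB) makes it easy to generate and modify input files and helps users carry out various tasks (electron density, density of states, and so on).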

Platform:

Unix/Linux

 

II. Software Installation and Setup

1. Hardware environment

Shanghai/SUSE 10u2

 

2. Software version

Ver. WIEN2k_09

 

3. Install the Intel compilers

ifort/icc

Ver. 11.0.083

 

4. Install Intel MKL

Ver. 10.1.2.024

 

5. Install MPICH v1.2.7

./configure -c++=icpc -cc=icc -f77=ifort -f90=ifort --prefix=/home/soft/mpi/mpich-1.2.7-intel

make

make install

 

6. Set environment variables

vi ~/.bashrc

Add the following:

##############MPICH###########

export PATH=/home/soft/mpi/mpich-1.2.7-intel/bin:$PATH

################intel compiler###################

. /home/soft/intel/Compiler/11.0/083/bin/intel64/ifortvars_intel64.sh

. /home/soft/intel/Compiler/11.0/083/bin/intel64/iccvars_intel64.sh

###############intel mkl###################

export LD_LIBRARY_PATH=/home/soft/intel/mkl/10.1.2.024/lib/em64t/:$LD_LIBRARY_PATH
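
After reloading ~/.bashrc, it may be worth confirming that the compilers and MPI wrappers are actually found on the PATH before going further. The following is a minimal sketch; check_tools is a hypothetical helper name and not part of any of the packages above.

```shell
# check_tools (hypothetical helper): report whether each named command
# can be found on the current PATH. Returns 1 if any command is missing.
check_tools() {
    rc=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found   $tool"
        else
            echo "missing $tool"
            rc=1
        fi
    done
    return "$rc"
}
```

A typical call, assuming the tool names used in this manual, would be: check_tools ifort icc mpif90 mpirun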

 

7. Install FFTW

tar zxf fftw-2.1.5.tar.gz

cd fftw-2.1.5/

export F77=ifort

export CC=icc

./configure --prefix=/home/soft/mathlib/fftwv215-mpich --enable-mpi

make

make install

 

8. Create the build directory

Switch to the installation user's account:

su - mjhe

mkdir ~/WIEN2k_09

cp WIEN2k_09.tar ~/WIEN2k_09

 

9. Unpack the archive

cd ~/WIEN2k_09

tar xf WIEN2k_09.tar

./expand_lapw

 

10. Compile

./siteconfig_lapw

Several build parameters need to be changed (the settings below can be used as a reference):

specify a system

K    Linux (Intel ifort 10.1 compiler + mkl 10.0 )

specify compiler

Current selection:   ifort

Current selection:   icc

specify compiler options, BLAS and LAPACK

Current settings:

 O   Compiler options:        -FR -mp1 -w -prec_div -pc80 -pad -align -DINTEL_VML -traceback

 L   Linker Flags:            $(FOPT) -L/home/soft/intel/mkl/10.1.2.024/lib/em64t/ -pthread -i-static

 P   Preprocessor flags          '-DParallel'

Link the MKL libraries statically:

R   R_LIB (LAPACK+BLAS):     /home/soft/intel/mkl/10.1.2.024/lib/em64t/libmkl_lapack.a /home/soft/intel/mkl/10.1.2.024/lib/em64t/libguide.a /home/soft/intel/mkl/10.1.2.024/lib/em64t/libmkl_core.a /home/soft/intel/mkl/10.1.2.024/lib/em64t/libmkl_em64t.a

configure Parallel execution

Shared Memory Architecture? (y/n):n

Remote shell (default is ssh) = ssh

Do you have MPI and Scalapack installed and intend to run

finegrained parallel? (This is usefull only for BIG cases)!

   (y/n) n

Current selection: mpiifort

Current settings:

Use static libraries:

RP  RP_LIB(SCALAPACK+PBLAS): -lmkl_intel_lp64 /home/soft/intel/mkl/10.1.2.024/lib/em64t/libmkl_scalapack_lp64.a /home/soft/intel/mkl/10.1.2.024/lib/em64t/libmkl_sequential.a /home/soft/intel/mkl/10.1.2.024/lib/em64t/libmkl_blacs_lp64.a /home/soft/mathlib/fftwv215-mpich/lib/libfftw_mpi.a /home/soft/mathlib/fftwv215-mpich/lib/libfftw.a -lmkl /home/soft/intel/mkl/10.1.2.024/lib/em64t/libguide.a

or:

RP  RP_LIB(SCALAPACK+PBLAS): -lmkl_intel_lp64 /home/soft/intel/mkl/10.1.2.024/lib/em64t/libmkl_scalapack_lp64.a /home/soft/intel/mkl/10.1.2.024/lib/em64t/libmkl_sequential.a /home/soft/intel/mkl/10.1.2.024/lib/em64t/libmkl_blacs_lp64.a -L/data1/soft/lib/lib/ -lfftw_mpi -lfftw -lmkl /data1/soft/intel/mkl/10.0.3.020/lib/em64t/libguide.a

FP  FPOPT(par.comp.options): $(FOPT)

MP  MPIRUN commando        : mpirun -np _NP_ -machinefile _HOSTS_ _EXEC_

Dimension Parameters

The default values can be kept here, or the following values can be used (for machines with more than 4 GB of memory):

PARAMETER          (NMATMAX=   30000)

PARAMETER          (NUME=   1000)

Proceed to the compile step:

Compile/Recompile

A   Compile all programs (suggested)

Build errors occur mainly in the five MPI-parallel executables, so after compiling, check that the following files exist:

./SRC_lapw0/lapw0_mpi

./SRC_lapw1/lapw1_mpi

./SRC_lapw1/lapw1c_mpi

./SRC_lapw2/lapw2_mpi

./SRC_lapw2/lapw2c_mpi
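
The check above can be scripted. The following is a minimal sketch, assuming the five relative paths listed above; check_mpi_binaries is a hypothetical helper name and not a WIEN2k command.

```shell
# check_mpi_binaries (hypothetical helper): verify that the five MPI
# executables exist and are executable under a WIEN2k build directory.
# Argument: build directory (defaults to the current directory).
# Return value: the number of missing executables (0 means all present).
check_mpi_binaries() {
    root="${1:-.}"
    missing=0
    for exe in SRC_lapw0/lapw0_mpi SRC_lapw1/lapw1_mpi SRC_lapw1/lapw1c_mpi \
               SRC_lapw2/lapw2_mpi SRC_lapw2/lapw2c_mpi; do
        if [ -x "$root/$exe" ]; then
            echo "ok       $exe"
        else
            echo "MISSING  $exe"
            missing=$((missing + 1))
        fi
    done
    return "$missing"
}
```

For the build directory used in this manual the call would be: check_mpi_binaries ~/WIEN2k_09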

 

 

11. Post-installation setup

./userconfig_lapw

editor shall be: vi

Press Enter for all remaining prompts.

Edit .bashrc and comment out the following line:

#ulimit -s unlimited

Edit parallel_options:

setenv WIEN_MPIRUN "mpirun -machinefile _HOSTS_ -np _NP_ _EXEC_"

 

12. Configure the web interface

As root, start the Apache service:

service apache2 start

Then, as a regular user, run:

w2web

This opens port 7890 for the WIEN2k web interface (w2web).

 

13. Test cases

Serial calculation:

Using the bundled TiC example:

mkdir TiC

cd TiC

cp ../TiC.struct .

Generate the atomic input:

instgen_lapw

Initialize the case:

init_lapw -b

Run the calculation:

run_lapw

Program output is written to the *.output files; if an error occurs, check TiC.dayfile.

Parallel calculation:

Check that the parallel environment is set up:

testpara_lapw

Check the status of the running case:

testpara1_lapw

testpara2_lapw

The contents of the .machines file determine whether k-point or MPI parallelization is used:

k-point:

granularity:1

1:node31:1

1:node31:1

1:node32:1

1:node32:1

lapw0:node31:2 node32:2

extrafine:1

MPI:

granularity:1

1:node31:2

1:node32:2

lapw0:node31:2 node32:2

extrafine:1
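
Both layouts above follow the same pattern, so they can be generated rather than typed by hand. The following is a minimal sketch that reproduces the two examples; write_machines is a hypothetical helper name, and its arguments (mode, slots per host, host names) are assumptions made for illustration.

```shell
# write_machines (hypothetical helper): print a .machines file to stdout.
# Usage: write_machines MODE SLOTS HOST...
#   MODE  = "kpoint" (one line per slot) or "mpi" (one line per host)
#   SLOTS = number of processes to use on each host
write_machines() {
    mode="$1"
    slots="$2"
    shift 2
    echo "granularity:1"
    for host in "$@"; do
        if [ "$mode" = "mpi" ]; then
            # MPI: a single job spanning SLOTS processes on this host
            echo "1:$host:$slots"
        else
            # k-point: SLOTS independent single-process jobs on this host
            i=0
            while [ "$i" -lt "$slots" ]; do
                echo "1:$host:1"
                i=$((i + 1))
            done
        fi
    done
    # lapw0 runs MPI-parallel across all hosts in both modes
    printf 'lapw0:'
    sep=""
    for host in "$@"; do
        printf '%s%s:%s' "$sep" "$host" "$slots"
        sep=" "
    done
    printf '\n'
    echo "extrafine:1"
}
```

For the two nodes used above, the k-point file would be produced with: write_machines kpoint 2 node31 node32 > .machines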

 

Run the calculation:

run_lapw -p

 

14. Submitting jobs through the job scheduler

cat wien2k.pbs

###########################################################################

#                                                                         #

# Script for submitting parallel wien2k_09 jobs to Dawning cluster.                            #

#                                                                         #

###########################################################################

###########################################################################

# Lines that begin with #PBS are PBS directives (not comments).

# True comments begin with "# " (i.e., # followed by a space).

###########################################################################

#PBS -S /bin/bash

#PBS -N TiO2

#PBS -j oe

#PBS -l nodes=1:ppn=8

#PBS -V

#############################################################################

#  -S: shell the job will run under

#  -o: name of the queue error filename

#  -j: merges stdout and stderr to the same file

#  -l: resources required by the job: number of nodes and processors per node

#  -l: resources required by the job: maximum job time length

#############################################################################

 

#########parallel mode is mpi/kpoint############

PARALLEL=mpi           # parallel mode: mpi (MPI parallel) or kpoint (k-point parallel)

echo $PARALLEL

################################################

 

NP=`cat ${PBS_NODEFILE} | wc -l`

NODE_NUM=`cat $PBS_NODEFILE|uniq  |wc -l`

NP_PER_NODE=`expr $NP / $NODE_NUM`

username=`whoami`

 

export WIENROOT=/home/users/mjhe/wien2k_09/

export PATH=$PATH:$WIENROOT:.

WIEN2K_RUNDIR=/scratch/${username}.${PBS_JOBID}

export SCRATCH=${WIEN2K_RUNDIR}

 

# create scratch dir

 

if [ ! -d "$WIEN2K_RUNDIR" ]; then

   echo "Scratch directory $WIEN2K_RUNDIR created."

   mkdir -p $WIEN2K_RUNDIR

fi

 

cd $PBS_O_WORKDIR

 

###############creating .machines################

 

case $PARALLEL in

 

mpi)

 

        echo "granularity:1" >.machines

 

        for i in `cat $PBS_NODEFILE |uniq `

        do

              echo "1:"$i":"$NP_PER_NODE >> .machines

        done

 

        printf "lapw0:">> .machines

#####lapw0 MPI parallel#############

        for i in `cat ${PBS_NODEFILE}|uniq`

        do

              printf $i:$NP_PER_NODE" " >>.machines

        done

#################################

####for cases where MPI-parallel lapw0 fails (mpi error lapw0), use the following instead########

#        printf `cat ${PBS_NODEFILE}|uniq|head -1`:1>>.machines

#############end#################

 

        printf "\n" >>.machines

 

        echo "extrafine:1">>.machines

 

        ;;

kpoint)

 

        echo "granularity:1" >.machines

 

        for i in `cat $PBS_NODEFILE`

        do

              echo "1:"$i":"1 >> .machines

        done

 

        printf "lapw0:">> .machines

 

#####lapw0 MPI parallel#############

        for i in `cat ${PBS_NODEFILE}|uniq`

        do

              printf $i:$NP_PER_NODE" " >>.machines

        done

#################################

####for cases where MPI-parallel lapw0 fails (mpi error lapw0), use the following instead########

#        printf `cat ${PBS_NODEFILE}|uniq|head -1`:1>>.machines

#############end#################

 

        printf "\n" >>.machines

 

        echo "extrafine:1">>.machines

 

        ;;

esac

 

#################end creating####################

 

####### Run the parallel executable "WIEN2K"#########

 

instgen_lapw

init_lapw -b

 

clean_lapw -s

echo "##################start time is `date`########################"

 

run_lapw -p

 

echo "###################end time is `date`########################"

 

rm -rf $WIEN2K_RUNDIR

 

########################END########################

 

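The places that usually need to be changed are marked in red.

The script also initializes the case, so a *.struct file must already exist in the working directory.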
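
The process counts in the script are derived from the PBS node file, which lists one host name per allocated processor. The following is a minimal sketch of that arithmetic that can be tried outside PBS on any text file; node_layout is a hypothetical helper name, and it uses sort -u instead of the script's uniq so that the host count is also correct when identical host names are not adjacent.

```shell
# node_layout (hypothetical helper): print the values the job script
# computes from a PBS-style node file (one host name per processor slot).
node_layout() {
    nodefile="$1"
    np=$(wc -l < "$nodefile")                  # total processor slots
    node_num=$(sort -u "$nodefile" | wc -l)    # number of distinct hosts
    # $(( )) also strips any padding that wc may add to its output
    echo "NP=$((np)) NODE_NUM=$((node_num)) NP_PER_NODE=$((np / node_num))"
}
```

With nodes=1:ppn=8 as requested in the script, the node file would contain the same host eight times, giving NP=8, NODE_NUM=1, NP_PER_NODE=8. The job script itself is submitted with: qsub wien2k.pbs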

 

 

15. Performance benchmark

CB65

Shanghai 2382, 16GB, 147GB SAS

1000Gb, mpich v1.2.7

TiO2 case (NMATMAX=30000):

Processes  Mode     lapw0  lapw1/lapw2  Time
2          k-point  MPI    k-point      4m44s
4          k-point  MPI    k-point      4m30s
8          k-point  MPI    k-point      6m29s
2          MPI      MPI    MPI          7m53s
4          MPI      MPI    MPI          6m56s
8          MPI      MPI    MPI          9m5s
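
To compare the TiO2 timings above, it helps to convert them to seconds first. The following is a minimal sketch; to_seconds and ratio are hypothetical helper names, and the input format "<minutes>m<seconds>s" is assumed from the timings listed above.

```shell
# to_seconds (hypothetical helper): convert "<min>m<sec>s" to seconds.
# awk splits the string on "m" and "s", so "4m44s" yields fields 4 and 44.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ print $1 * 60 + $2 }'
}

# ratio (hypothetical helper): first time divided by second time,
# printed with two decimals.
ratio() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.2f\n", a / b }'
}
```

For example, ratio 7m53s 4m44s prints 1.67: with 2 processes, the fully MPI-parallel run took about 1.67 times as long as the k-point run.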

 

Standard test cases:

Test cases provided by the WIEN2k developers:

Serial:

test_case

export OMP_NUM_THREADS=1

time x lapw1 -c

SUM OF WALL CLOCK TIMES:    135.0 (INIT =      1.0 + K-POINTS =    133.9)

export OMP_NUM_THREADS=4

time x lapw1 -c

SUM OF WALL CLOCK TIMES:     62.0 (INIT =      1.0 + K-POINTS =     61.0)

export OMP_NUM_THREADS=8

time x lapw1 -c

SUM OF WALL CLOCK TIMES:     56.2 (INIT =      1.0 + K-POINTS =     55.2)

 

Parallel:

time x lapw1 -p

test_case

2 kpoint

test_case.output1: SUM OF WALL CLOCK TIMES:   62.0 (INIT =      1.0 + K-POINTS =     61.0)

test_case.output1_1: SUM OF WALL CLOCK TIMES:       138.5 (INIT =      1.0 + K-POINTS =    137.5)

4 kpoint

test_case.output1: SUM OF WALL CLOCK TIMES:         62.0 (INIT =      1.0 + K-POINTS =     61.0)

test_case.output1_1: SUM OF WALL CLOCK TIMES:       134.9 (INIT =      1.0 + K-POINTS =    133.9)

 

mpi-benchmark

2 processes

mpi-benchmark.output1_1:     TIME HAMILT (CPU)  =   134.1, HNS = 116.4, HORB =0.0, DIAG=697.5

mpi-benchmark.output1_1:     TOTAL CPU   TIME:    950.0 (INIT =      1.9 + K-POINTS =    948.1)

mpi-benchmark.output1_1:    SUM OF WALL CLOCK TIMES: 1138.9 (INIT =2.2 + K-POINTS =1136.7)

4 processes

mpi-benchmark.output1_1:     TIME HAMILT (CPU)  =     67.8, HNS =   70.5, HORB =       0.0, DIAG =   420.6

mpi-benchmark.output1_1:    TOTAL CPU   TIME:    560.7 (INIT =      1.8 + K-POINTS =    558.9)

mpi-benchmark.output1_1:  SUM OF WALL CLOCK TIMES:   643.2 (INIT = 2.2 + K-POINTS =    640.9)

8 processes

mpi-benchmark.output1_1:     TIME HAMILT (CPU)  =   40.4, HNS = 44.9, HORB =       0.0, DIAG =   422.0

mpi-benchmark.output1_1:     TOTAL CPU   TIME:    509.3 (INIT =      1.9 + K-POINTS =    507.4)

mpi-benchmark.output1_1:    SUM OF WALL CLOCK TIMES: 614.3 (INIT = 2.2 + K-POINTS =    612.0)

16 processes

mpi-benchmark.output1_1:     TIME HAMILT (CPU)  =   22.6, HNS =   32.5, HORB =       0.0, DIAG =   140.5

mpi-benchmark.output1_1:     TOTAL CPU       TIME:  197.5 (INIT = 1.9 + K-POINTS =    195.7)

mpi-benchmark.output1_1:     SUM OF WALL CLOCK TIMES: 1190.0 (INIT =2.8 + K-POINTS =1187.2)

 

The timings can be displayed with: grep TIME *output1*
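
When only the wall-clock totals are of interest, the grep output can be reduced to one number per file. The following is a minimal sketch, assuming the "SUM OF WALL CLOCK TIMES:" line format shown in the results above; wall_times is a hypothetical helper name.

```shell
# wall_times (hypothetical helper): for each given output file, print
# "<file> <seconds>" taken from its "SUM OF WALL CLOCK TIMES:" line.
# Assumes the file names themselves contain no ":" character.
wall_times() {
    grep -H "SUM OF WALL CLOCK TIMES" "$@" |
        sed 's/^\([^:]*\):.*SUM OF WALL CLOCK TIMES: *\([0-9.]*\).*/\1 \2/'
}
```

It would be called in the case directory as: wall_times *output1*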

 

16. Other

 

III. Troubleshooting

1. A local scratch directory, /scratch, must be created on every compute node:

mkdir /scratch

chmod 777 /scratch
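
On a cluster, the two commands above have to be repeated on every node. The following is a minimal sketch that only prints the ssh commands for review rather than running them; make_scratch_cmds is a hypothetical helper name, and it assumes passwordless ssh as a user who is allowed to create /scratch on each node.

```shell
# make_scratch_cmds (hypothetical helper): print, for each host given,
# the ssh command that would create /scratch with the permissions above.
# Nothing is executed; pipe the output to sh once it looks right.
make_scratch_cmds() {
    for host in "$@"; do
        printf "ssh %s 'mkdir -p /scratch && chmod 777 /scratch'\n" "$host"
    done
}
```

For the two nodes used earlier in this manual: make_scratch_cmds node31 node32 | sh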

 

2. Before each calculation, clean the case and redo the initialization.

 

3. Other

 

IV. Other

1. Commands, code, and hyperlinks in this document are set in italic, size-five type.

2. References

2.1 WIEN2k User's Guide, February 5, 2009

2.2 http://www.wien2k.at/reg_user/benchmark/

 

 
