Installing PBS on Red Hat 9.0

Pre-install Configuration

Configure RSH
Rsh needs to be configured to allow passwordless communication between the head node and the compute nodes. This is done by creating /etc/hosts.equiv on the head and compute nodes.
Head node /etc/hosts.equiv

# This file contains a list of all of the compute nodes' hostnames
node01
node02
-
-
nodeXX

Compute node /etc/hosts.equiv
# This file contains the head node's hostname
node00

Enable rsh, rlogin, and rexec
Change the "disable=yes" to "disable=no" in each of their respective xinetd scripts located in /etc/xinetd.d
# /sbin/service xinetd reload
Security note: Make sure you are on a private network, or that your firewall is properly configured so as to deny access from all untrusted IP addresses.
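
The edit can also be scripted; a minimal sketch, assuming the stock Red Hat service file names rsh, rlogin, and rexec:

# for svc in rsh rlogin rexec; do
>     sed -i 's/disable[[:space:]]*=[[:space:]]*yes/disable = no/' /etc/xinetd.d/$svc
> done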


Test rsh
From the head node, use rsh to log in to a compute node as a non-root user:
$ rsh nodeXX
From the compute node, use rsh to log in to the head node as a non-root user:
$ rsh node00
If a password prompt appears in either case, rsh is not configured correctly.

Download OpenPBS
OpenPBS can be obtained for free from www.openpbs.org; registration is required. Download OpenPBS_2_3_16.tar.gz into a directory such as /home/download on the head node. RPMs are also available, but they are not compatible with the gcc 3.0+ that ships with recent versions of Linux.
Installation
If you are using the pre-compiled RPMs, you may simply install them and skip this section. Note, however, that the RPMs are incompatible with gcc 3.0+ and may complain of a binary incompatibility; in that case, you must compile OpenPBS from source as described below.
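
Installing from the RPMs is a single command per package; a sketch (the filename shown is illustrative, not the exact name from the download page):

# rpm -ivh OpenPBS-2.3.16-1.i386.rpm
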
Untar PBS
# cd /home/download
# tar -xvzf OpenPBS_2_3_16.tar.gz
# cd OpenPBS_2_3_16/

Patch PBS files
Download the patch file pbs.patch into the source directory, then apply it:
# cd /home/download/OpenPBS_2_3_16
# patch -p1 -b < pbs.patch

Compile PBS

Head Node
# mkdir /home/download/OpenPBS_2_3_16/head
# cd /home/download/OpenPBS_2_3_16/head
# ../configure --disable-gui --set-server-home=/var/spool/PBS --set-default-server=node00
# make
# make install
This disables the GUI, sets the PBS working directory to /var/spool/PBS instead of the default /usr/spool/PBS, and sets the default PBS server to node00, which should be the hostname of the head node. The source is then compiled and the binaries installed.


Compute Nodes
# mkdir /home/download/OpenPBS_2_3_16/compute
# cd /home/download/OpenPBS_2_3_16/compute
# ../configure --disable-gui --set-server-home=/var/spool/PBS --disable-server --set-default-server=node00 --set-sched=no
# make
# rsh nodeXX 'cd /home/download/OpenPBS_2_3_16/compute; make install'
After the head node installation succeeds, the configure script is run again in a separate build directory for the compute nodes. The options are the same as for the head node, except that the server and scheduler daemons are also disabled, since only pbs_mom is needed on the compute nodes. The source is then compiled on the head node and installed on each compute node over rsh.
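
Repeating the install for every node can be scripted; a minimal sketch, assuming nodes named node01 through node16 and a build tree visible on every node (e.g., over NFS):

# for i in $(seq -w 1 16); do
>     rsh node$i 'cd /home/download/OpenPBS_2_3_16/compute; make install'
> done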


Install documentation
# cd /home/download/OpenPBS_2_3_16/head/doc
# make install

Configure PBS
Create PBS node description file
On the head node, create the file /var/spool/PBS/server_priv/nodes.
# This file contains a list of all of the compute nodes' hostnames
node01 np=1
node02 np=1
-
-
nodeXX np=1

where nodeXX is the hostname of a compute node and np is the number of processors on that node.

Configure PBS mom
On the head node and on each compute node, create the file /var/spool/PBS/mom_priv/config.
#/var/spool/PBS/mom_priv/config
$logevent 0x0ff
$clienthost node00
$restricted node00

This causes all messages except for debugging messages to be logged and sets the primary server to node00. It also allows node00 to monitor OpenPBS.
On the head node and on each compute node, start PBS mom with
# /usr/local/sbin/pbs_mom


Configure PBS server
On the head node,
# /usr/local/sbin/pbs_server -t create
# qmgr

>c q workq
>s q workq queue_type=execution
>s q workq enabled=true
>s q workq started=true
>s s default_queue=workq
>s s scheduling=true
>s s query_other_jobs=true
>s s node_pack=false
>s s log_events=511
>s s scheduler_iteration=600
>s s resources_default.neednodes=1
>s s resources_default.nodect=1
>s s resources_default.nodes=1
>quit

This creates an execution queue called workq that is enabled and started. It is then declared the default queue for the server. Logging and scheduling are enabled on the server and node_pack is set to false. The default number of nodes is set to 1.
It is useful to backup the PBS server configuration file with
# qmgr -c "print server" > /var/spool/PBS/qmgr.conf
so that the PBS server configuration can later be restored with
# qmgr < /var/spool/PBS/qmgr.conf


Start PBS scheduler
On the head node, the PBS scheduler must be started after the server configuration is complete:
# /usr/local/sbin/pbs_sched

Enable PBS on startup
On the head node and on each compute node,
Create the script /etc/init.d/pbs.
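
The contents of this script are not reproduced above; a minimal sketch, assuming the daemons were installed in the default /usr/local/sbin, follows. The chkconfig header comments are what allow the chkconfig command in the next step to register the script. On compute nodes, comment out the pbs_server and pbs_sched lines, since only pbs_mom runs there.

#!/bin/sh
#
# pbs        Start and stop the OpenPBS daemons
#
# chkconfig: 345 95 5
# description: OpenPBS batch queueing system

case "$1" in
  start)
        # Start MOM before the server so it can answer the server's polling
        /usr/local/sbin/pbs_mom
        /usr/local/sbin/pbs_server
        /usr/local/sbin/pbs_sched
        ;;
  stop)
        killall pbs_sched pbs_server pbs_mom
        ;;
  restart)
        $0 stop
        $0 start
        ;;
  *)
        echo "Usage: pbs {start|stop|restart}"
        exit 1
        ;;
esac
exit 0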

Set PBS to start automatically when the computer boots:
# chkconfig pbs on

Restart PBS
On the head node and on each compute node, manually restart OpenPBS so that all the configuration changes take effect:
# /sbin/service pbs stop
# /sbin/service pbs start

Testing PBS
After installing and configuring OpenPBS, test that everything is working properly. To do this, create a simple test script

#!/bin/sh
#testpbs
echo This is a test
echo Today is `date`
echo This is `hostname`
echo The current working directory is `pwd`
ls -alF /home
uptime

and then submit it to PBS with the qsub command as a non-root user.
$ qsub testpbs

After the job executes, its output is stored in the directory from which the job was submitted. If errors occur or no output appears, check the log files for messages about the job, paying particular attention to the exact hostname PBS uses for the head node in the server logs:


# more /var/spool/PBS/*_logs/*


qstat -a                   show the status of all jobs currently submitted to PBS

qdel ID                    delete the job whose job number is ID

pbsnodes -a                show the current state of every compute node (busy or free)

pbsnodes -a | grep ID      show the state of job ID and the node it is running on

pbsnodes -a | grep free    list the free compute nodes




11.1 OpenPBS

Before the emergence of clusters, the Unix-based Network Queuing System (NQS) from NASA Ames Research Center was a commonly used batch-queuing system. With the emergence of parallel distributed systems, NQS began to show its limitations. Consequently, Ames led an effort to develop requirements and specifications for a newer, cluster-compatible system. These requirements and specifications later became the basis for the IEEE 1003.2d POSIX standard. With NASA funding, PBS, a system conforming to those standards, was developed by Veridian in the early 1990s.

PBS is available in two forms: OpenPBS or PBSPro. OpenPBS is the unsupported original open source version of PBS, while PBSPro is a newer commercial product. In 2003, PBSPro was acquired by Altair Engineering and is now marketed by Altair Grid Technologies, a subsidiary of Altair Engineering. The web site for OpenPBS is http://www.openpbs.org; the web site for PBSPro is http://www.pbspro.com. Although much of the following will also apply to PBSPro, the remainder of this chapter describes OpenPBS, which is often referred to simply as PBS. However, if you have the resources to purchase software, it is well worth looking into PBSPro. Academic grants have been available in the past, so if you are eligible, this is worth looking into as well.

As an unsupported product, OpenPBS has its problems. Of the software described in this book, it was, for me, the most difficult to install. In my opinion, it is easier to install OSCAR, which has OpenPBS as a component, or Rocks along with the PBS roll than it is to install just OpenPBS. With this warning in mind, we'll look at a typical installation later in this chapter.
11.1.1 Architecture

Before we install PBS, it is helpful to describe its architecture. PBS uses a client-server model and is organized as a set of user-level commands that interact with three system-level daemons. Jobs are submitted using the user-level commands and managed by the daemons. PBS also includes an API.

The pbs_server daemon, the job server, runs on the server system and is the heart of the PBS system. It provides basic batch services such as receiving and creating batch jobs, modifying the jobs, protecting jobs against crashes, and running the batch jobs. User commands and the other daemons communicate with the pbs_server over the network using TCP. The user commands need not be installed on the server.

The job server manages one or more queues. (Despite the name, queues are not restricted to first-in, first-out scheduling.) A scheduled job waiting to be run or a job that is actually running is said to be a member of its queue. The job server supports two types of queues, execution and routing. A job in an execution queue is waiting to execute while a job in a routing queue is waiting to be routed to a new destination for execution.

The pbs_mom daemon executes the individual batch jobs. This job executor daemon is often called the MOM because it is the "mother" of all executing jobs and must run on every system within the cluster. It creates an execution environment that is as nearly identical to the user's session as possible. MOM is also responsible for returning the job's output to the user.

The final daemon, pbs_sched, implements the cluster's job-scheduling policy. As such, it communicates with the pbs_server and pbs_mom daemons to match available jobs with available resources. By default, a first-in, first-out scheduling policy is used, but you are free to set your own policies. The scheduler is highly extensible.

PBS provides both a GUI and 1003.2d-compliant command-line utilities. These commands fall into three categories: management, operator, and user commands. Management and operator commands are usually restricted. The commands are used to submit, modify, delete, and monitor batch jobs.
11.1.2 Installing OpenPBS

While detailed installation directions can be found in the PBS Administrator Guide, there are enough "gotchas" that it is worth going over the process in some detail. Before you begin, be sure you look over the Administrator Guide as well. Between the guide and this chapter, you should be able to overcome most obstacles.

Before starting with the installation proper, there are a couple of things you need to check. As noted, PBS provides both command-line utilities and a graphical interface. The graphical interface requires Tcl/Tk 8.0 or later, so if you want to use it, make sure Tcl/Tk is installed. You'll want to install Tcl/Tk before you install PBS. For a Red Hat installation, you can install Tcl/Tk from the packages supplied with the operating system. For more information on Tcl/Tk, visit the web site http://www.scriptics.com/. In order to build the GUI, you'll also need the X11 development packages, which Red Hat users can install from the supplied RPMs.

The first step in the installation proper is to download the software. Go to the OpenPBS web site (http://www-unix.mcs.anl.gov/openpbs/) and follow the links to the download page. The first time through, you will be redirected to a registration page. With registration, you will receive by email an account name and password that you can use to access the actual download page. Since you have to wait for approval before you receive the account information, you'll want to plan ahead and register a couple of days before you plan to download and install the software. Making your way through the registration process is a little annoying because it keeps pushing the commercial product, but it is straightforward and won't take more than a few minutes.

Once you reach the download page, you'll have the choice of downloading a pair of RPMs or the patched source code. The first RPM contains the full PBS distribution and is used to set up the server, and the second contains just the software needed by the client and is used to set up compute nodes within a cluster. While RPMs might seem the easiest way to go, the available RPMs are based on an older version of Tcl/Tk (Version 8.0). So unless you want to backpedal, i.e., track down and install these older packages (a nontrivial task), installing the source is preferable. That's what's described here.

Download the source and move it to your directory of choice. With a typical installation, you'll end up with three directory trees: the source tree, the installation tree, and the working directory tree. In this example, I'm setting up the source tree in the directory /usr/local/src. Once you have the source package where you want it, unpack the code.

# gunzip OpenPBS_2_3_16.tar.gz

# tar -vxpf OpenPBS_2_3_16.tar


When untarring the package, use the -p option to preserve permission bits.

Since the OpenPBS code is no longer supported, it is somewhat brittle. Before you can compile the code, you will need to apply some patches. What you install will depend on your configuration, so plan to spend some time on the Internet: the OpenPBS URL given above is a good place to start. For Red Hat Linux 9.0, start by downloading the scaling patch from http://www-unix.mcs.anl.gov/openpbs/ and the errno and gcc patches from http://bellatrix.pcl.ox.ac.uk/~ben/pbs/. (Working out the details of what you need is the annoying side of installing OpenPBS.) Once you have the patches you want, install them.

# cp openpbs-gcc32.patch /usr/local/src/OpenPBS_2_3_16/

# cp openpbs-errno.patch /usr/local/src/OpenPBS_2_3_16/

# cp ncsa_scaling.patch /usr/local/src/OpenPBS_2_3_16/

# cd /usr/local/src/OpenPBS_2_3_16/

# patch -p1 -b < openpbs-gcc32.patch

patching file buildutils/exclude_script

# patch -p1 -b < openpbs-errno.patch

patching file src/lib/Liblog/pbs_log.c

patching file src/scheduler.basl/af_resmom.c

# patch -p1 -b < ncsa_scaling.patch

patching file src/include/acct.h

patching file src/include/cmds.h

patching file src/include/pbs_ifl.h

patching file src/include/qmgr.h

patching file src/include/server_limits.h


The scaling patch changes built-in limits that prevent OpenPBS from working with larger clusters. The other patches correct problems resulting from recent changes to the gcc compiler.[1]

[1] Even with the patches, I found it necessary to manually edit the file srv_connect.c, adding the line #include with the other #include lines in the file. If you have this problem, you'll know because make will fail when referencing this file. Just add the line and remake the file.

As noted, you'll want to keep the installation directory separate from the source tree, so create a new directory for PBS. /usr/local/OpenPBS is a likely choice. Change to this directory and run configure, make, make install, and make clean from it.

# mkdir /usr/local/OpenPBS

# cd /usr/local/OpenPBS

# /usr/local/src/OpenPBS_2_3_16/configure \

> --set-default-server=fanny --enable-docs --with-scp

...

# make

...

# make install

...

# make clean

...


In this example, the configuration options set fanny as the server, create the documentation, and use scp (SSH secure copy program) when moving files between remote hosts. Normally, you'll create the documentation only on the server. The Administrator Guide contains several pages of additional options.

By default, the procedure builds all the software. For the compute nodes, this really isn't necessary since all you need is pbs_mom on these machines. Thus, there are several alternatives that you might want to consider when setting up the clients. You could just go ahead and build everything like you did for the server, or you could use different build options to restrict what is built. For example, the option --disable-server prevents the pbs_server daemon from being built. Or you could build and then install just pbs_mom and the files it needs. To do this, change to the MOM subdirectory, in this example /usr/local/OpenPBS/src/resmom, and run make install to install just MOM.

# cd /usr/local/OpenPBS/src/resmom

# make install

...


Yet another possibility is to use NFS to mount the appropriate directories on the client machines. The Administrator Guide outlines these alternatives but doesn't provide many details. Whatever your approach, you'll need pbs_mom on every compute node.

The make install step will create the /usr/spool/PBS working directory, and will install the user commands in /usr/local/bin and the daemons and administrative commands in /usr/local/sbin. make clean removes unneeded files.
11.1.3 Configuring PBS

Before you can use PBS, you'll need to create or edit the appropriate configuration files, located in the working directory, e.g., /usr/spool/PBS, or its subdirectories. First, the server needs the node file, a file listing the machines it will communicate with. This file provides the list of nodes used at startup. (This list can be altered dynamically with the qmgr command.) In the subdirectory server_priv, create the file nodes with the editor of your choice. The nodes file should have one entry per line with the names of the machines in your cluster. (This file can contain additional information, but this is enough to get you started.) If this file does not exist, the server will know only about itself.
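
For example, a minimal nodes file for a small cluster might look like this (the hostnames are illustrative):

oscarnode1
oscarnode2
oscarnode3
oscarnode4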

MOM will need the configuration file config, located in the subdirectory mom_priv. At a minimum, you need an entry to start logging and an entry to identify the server to MOM. For example, your file might look something like this:

$logevent 0x1ff

$clienthost fanny


The argument to $logevent is a mask that determines what is logged. A value of 0x0ff will log all events excluding debug messages, while a value of 0x1ff will log all events including debug messages. You'll need this file on every machine. There are a number of other options, such as creating an access list.

Finally, you'll want to create a default_server file in the working directory with the fully qualified domain name of the machine running the server daemon.
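
For example (fanny is the server used in this chapter's examples; substitute your own machine's fully qualified name):

# echo fanny.yourdomain.org > /usr/spool/PBS/default_server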

PBS uses ports 15001-15004 by default, so it is essential that your firewall doesn't block these ports. These can be changed by editing the /etc/services file. A full list of services and ports can be found in the Administrator Guide (along with other configuration options). If you decide to change ports, it is essential that you do this consistently across your cluster!
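
For reference, the conventional entries look something like this (the Administrator Guide has the authoritative list of service names):

pbs             15001/tcp       # pbs_server
pbs_mom         15002/tcp       # MOM to/from server
pbs_resmom      15003/tcp       # MOM resource management requests
pbs_resmom      15003/udp
pbs_sched       15004/tcp       # scheduler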

Once you have the configuration files in place, the next step is to start the appropriate daemons, which must be started as root. The first time through, you'll want to start these manually. Once you are convinced that everything is working the way you want, configure the daemons to start automatically when the systems boot by adding them to the appropriate startup file, such as /etc/rc.d/rc.local. All three daemons must be started on the server, but the pbs_mom is the only daemon needed on the compute nodes. It is best to start pbs_mom before you start the pbs_server so that it can respond to the server's polling.

Typically, no options are needed for pbs_mom. The first time (and only the first time) you run pbs_server, start it with the option -t create.

# pbs_server -t create


This option is used to create a new server database. Unlike pbs_mom and pbs_sched, pbs_server can be configured dynamically after it has been started.

The options to pbs_sched will depend on your site's scheduling policies. For the default FIFO scheduler, no options are required. For a more detailed discussion of command-line options, see the manpages for each daemon.
11.1.4 Managing PBS

We'll begin with the command-line utilities, since the GUI may not always be available. Once you have mastered these commands, using the GUI should be straightforward. From a manager's perspective, the first command you'll want to become familiar with is qmgr, the queue management command. qmgr is used to create job queues and manage their properties; it is also used to manage nodes and servers, providing an interface to the batch system. In this section we'll look at a few basic examples rather than try to be exhaustive.

First, identify the pbs_server managers, i.e., the users who are allowed to reconfigure the batch system. This is generally a one-time task. (Keep in mind that not all commands require administrative privileges. Subcommands such as list and print can be executed by all users.) Run the qmgr command as follows, substituting your username:

# qmgr

Max open servers: 4

Qmgr: set server managers=sloanjd@fanny

Qmgr: quit


You can specify multiple managers by adding their names to the end of the command, separated by commas. Once done, you'll no longer need root privileges to manage PBS.
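
For example (the second account here is hypothetical):

Qmgr: set server managers=sloanjd@fanny,admin@fanny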

Your next task will be to create a queue. Let's look at an example.

[sloanjd@fanny PBS]$ qmgr

Max open servers: 4

Qmgr: create queue workqueue

Qmgr: set queue workqueue queue_type = execution

Qmgr: set queue workqueue resources_max.cput = 24:00:00

Qmgr: set queue workqueue resources_min.cput = 00:00:01

Qmgr: set queue workqueue enabled = true

Qmgr: set queue workqueue started = true

Qmgr: set server scheduling = true

Qmgr: set server default_queue = workqueue

Qmgr: quit


In this example we have created a new queue named workqueue. We have limited CPU time to between 1 second and 24 hours. The queue has been enabled, started, and set as the default queue for the server, which must have at least one queue defined. All queues must have a type, be enabled, and be started.

As you can see from the example, the general form of a qmgr command line is a command (active, create, delete, set, unset, list, or print) followed by a target (server, queue, or node) followed by an attribute assignment. These keywords can be abbreviated as long as there is no ambiguity. In the first example in this section, we set a server attribute. In the second example, the target was the queue that we were creating for most of the commands.
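
For instance, the following two commands are equivalent:

Qmgr: set queue workqueue enabled = true
Qmgr: s q workqueue enabled = true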

To examine the configuration of the server, use the command

Qmgr: print server


This can be used to save the configuration you are using. Use the command

# qmgr -c "print server" > server.config


Note that with the -c flag, qmgr commands can be entered on a single line. To re-create the queue at a later time, use the command

# qmgr < server.config


This can save a lot of typing or can be automated if needed. Other actions are described in the documentation.

Another useful command is pbsnodes, which lists the status of the nodes on your cluster.

[sloanjd@amy sloanjd]$ pbsnodes -a

oscarnode1.oscardomain

state = free

np = 1

properties = all

ntype = cluster



oscarnode2.oscardomain

state = free

np = 1

properties = all

ntype = cluster

...


On a large cluster, that can create a lot of output.
11.1.5 Using PBS

From the user's perspective, the place to start is the qsub command, which submits jobs. The only jobs that qsub accepts are scripts, so you'll need to package your tasks appropriately. Here is a simple example script:

#!/bin/sh

#PBS -N demo

#PBS -o demo.txt

#PBS -e demo.txt

#PBS -q workq

#PBS -l mem=100mb



mpiexec -machinefile /etc/myhosts -np 4 /home/sloanjd/area/area


The first line specifies the shell used to interpret the script, while the next few lines starting with #PBS are directives passed to PBS. The first directive names the job, the next two specify where output and error output go, the next-to-last identifies the queue to use, and the last requests a resource the job needs, in this case 100 MB of memory. The blank line signals the end of the PBS directives; the lines that follow make up the actual job.

Once you have created the batch script for your job, the qsub command is used to submit the job.

[sloanjd@amy area]$ qsub pbsdemo.sh

11.amy


When run, qsub returns the job identifier as shown. A number of options are available, both as command-line arguments to qsub and as directives included in the script. See the qsub(1B) manpage for more details.

There are several things to be aware of when using qsub. First, as noted, it accepts only scripts. Second, the target script cannot take any command-line arguments. Finally, the job is launched on a single node; the script must itself launch any parallel processes on other nodes as needed.
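
For example, PBS places the list of nodes assigned to a job in the file named by the PBS_NODEFILE environment variable, which the script can hand to the MPI launcher. A sketch, reusing the program from the earlier example and requesting four nodes:

#!/bin/sh
#PBS -N demo2
#PBS -l nodes=4

# $PBS_NODEFILE names a file listing the nodes PBS allocated to this job
mpiexec -machinefile $PBS_NODEFILE -np 4 /home/sloanjd/area/area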

In addition to qsub, there are a number of other useful commands available to the general user. The commands qstat and qdel can be used to manage jobs. In this example, qstat is used to determine what is on the queue:

[sloanjd@amy area]$ qstat

Job id           Name             User             Time Use S Queue
---------------- ---------------- ---------------- -------- - -----
11.amy           pbsdemo          sloanjd                 0 Q workq
12.amy           pbsdemo          sloanjd                 0 Q workq


qdel is used to delete jobs as shown.

[sloanjd@amy area]$ qdel 11.amy

[sloanjd@amy area]$ qstat

Job id           Name             User             Time Use S Queue
---------------- ---------------- ---------------- -------- - -----
12.amy           pbsdemo          sloanjd                 0 Q workq


qstat can be called with the job identifier to get more information about a particular job or with the -s option to get more details.

PBS includes a number of other user commands; a few of the more useful ones are the following:


qalter

This is used to modify the attributes of an existing job.

qhold

This is used to place a hold on a job.

qmove

This is used to move a job from one queue to another.

qorder

This is used to change the order of two jobs.

qrun

This is used to force a server to start a job.

If you start with the qsub (1B) manpage, other available commands are listed in the "See Also" section.
[Figure 11-1. xpbs]

[Figure 11-2. xpbsmon]

11.1.6 PBS's GUI

PBS provides two GUIs for queue management. The command xpbs will start a general interface. If you need to do administrative tasks, you should include the argument -admin. Figure 11-1 shows the xpbs GUI with the -admin option. Without this option, the general appearance is the same, but a number of buttons are missing. You can terminate a server; start, stop, enable, or disable a queue; or run or rerun a job. To monitor nodes in your cluster, you can use the xpbsmon command, shown for a few machines in Figure 11-2.
11.1.7 Maui Scheduler

If you need to go beyond the schedulers supplied with PBS, you should consider installing Maui. In a sense, Maui picks up where PBS leaves off. It is an external scheduler; that is, it does not include a resource manager. Rather, it can be used in conjunction with a resource manager such as PBS to extend the resource manager's capabilities. In addition to PBS, Maui works with a number of other resource managers.

Maui controls how, when, and where jobs will be run and can be described as a policy engine. When used correctly, it can provide extremely high system utilization and should be considered for any large or heavily utilized cluster that needs to optimize throughput. Maui provides a number of very advanced scheduling options. Administration is through the master configuration file maui.cfg and through either a text-based or a web-based interface.

Maui is installed by default as part of OSCAR and Rocks. For the most recent version of Maui or for further documentation, you should visit the Maui web site, http://www.supercluster.org.
