Augusdi

Managing Jobs

Contents

Understanding Job States

View Job Information

Changing Job Order Within Queues

Switch Jobs from One Queue to Another

Forcing Job Execution

Suspending and Resuming Jobs

Killing Jobs

Sending a Signal to a Job

Using Job Groups

Handling Job Exceptions

Understanding Job States

The bjobs command displays the current state of the job.

Normal job states

Most jobs enter only three states:

Job state   Description
PEND        Waiting in a queue for scheduling and dispatch
RUN         Dispatched to a host and running
DONE        Finished normally with a zero exit value

Suspended job states

A suspended job is in one of three states:

Job state   Description
PSUSP       Suspended by its owner or the LSF administrator while in PEND state
USUSP       Suspended by its owner or the LSF administrator after being dispatched
SSUSP       Suspended by the LSF system after being dispatched

State transitions

A job goes through a series of state transitions until it eventually completes its task, fails, or is terminated. The possible states of a job during its life cycle are shown in the diagram.

Pending jobs

A job remains pending until all conditions for its execution are met. Some of the conditions are:

Start time specified by the user when the job is submitted

Load conditions on qualified hosts

Dispatch windows during which the queue can dispatch and qualified hosts can accept jobs

Run windows during which jobs from the queue can run

Limits on the number of job slots configured for a queue, a host, or a user

Relative priority to other users and jobs

Availability of the specified resources

Job dependency and pre-execution conditions

Maximum pending job threshold

If the user or user group submitting the job has reached the pending job threshold as specified by MAX_PEND_JOBS (either in the User section of lsb.users, or cluster-wide in lsb.params), LSF will reject any further job submission requests sent by that user or user group. The system will continue to send the job submission requests with the interval specified by SUB_TRY_INTERVAL in lsb.params until it has made a number of attempts equal to the LSB_NTRIES environment variable. If LSB_NTRIES is undefined and LSF rejects the job submission request, the system will continue to send the job submission requests indefinitely as the default behavior.
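
The retry behavior described above can be sketched as a small loop. This is only a model under stated assumptions, not LSF code: try_submit is a hypothetical callable standing in for one submission request, ntries mirrors LSB_NTRIES (None meaning undefined), and interval mirrors SUB_TRY_INTERVAL.

```python
import time
from typing import Callable, Optional, Tuple


def submit_with_retries(try_submit: Callable[[], bool],
                        ntries: Optional[int],
                        interval: float,
                        sleep: Callable[[float], None] = time.sleep) -> Tuple[int, bool]:
    """Model the submission retry loop; return (attempts made, accepted)."""
    attempts = 0
    while True:
        attempts += 1
        if try_submit():
            return attempts, True          # request accepted
        if ntries is not None and attempts >= ntries:
            return attempts, False         # gave up after ntries attempts
        sleep(interval)                    # wait before the next attempt
```

With ntries set to None the loop keeps retrying until a request is accepted, which matches the default behavior described for an undefined LSB_NTRIES.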

Suspended jobs

A job can be suspended at any time. A job can be suspended by its owner, by the LSF administrator, by the root user (superuser), or by LSF.

After a job has been dispatched and started on a host, it can be suspended by LSF. When a job is running, LSF periodically checks the load level on the execution host. If any load index is beyond either its per-host or its per-queue suspending conditions, the lowest priority batch job on that host is suspended.

If the load on the execution host or hosts becomes too high, batch jobs could be interfering among themselves or could be interfering with interactive jobs. In either case, some jobs should be suspended to maximize host performance or to guarantee interactive response time.

LSF suspends jobs according to the priority of the job's queue. When a host is busy, LSF suspends lower priority jobs first unless the scheduling policy associated with the job dictates otherwise.

Jobs are also suspended by the system if the job queue has a run window and the current time goes outside the run window.

A system-suspended job can later be resumed by LSF if the load condition on the execution hosts falls low enough or when the closed run window of the queue opens again.

WAIT state (chunk jobs)

If you have configured chunk job queues, members of a chunk job that are waiting to run are displayed as WAIT by bjobs. Any jobs in WAIT status are included in the count of pending jobs by bqueues and busers, even though the entire chunk job has been dispatched and occupies a job slot. The bhosts command shows the single job slot occupied by the entire chunk job in the number of jobs shown in the NJOBS column.

You can switch (bswitch) or migrate (bmig) a chunk job member in WAIT state to another queue.

See Chapter 32, "Chunk Job Dispatch" for more information about chunk jobs.

Exited jobs

An exited job ended with a non-zero exit status.

A job might terminate abnormally for various reasons. Job termination can happen from any state. An abnormally terminated job goes into EXIT state. The situations where a job terminates abnormally include:

The job is cancelled by its owner or the LSF administrator while pending, or after being dispatched to a host.

The job is not able to be dispatched before it reaches its termination deadline set by bsub -t, and thus is terminated by LSF.

The job fails to start successfully. For example, the wrong executable is specified by the user when the job is submitted.

The application exits with a non-zero exit code.

You can configure hosts so that LSF detects an abnormally high rate of job exit from a host. See Handling Host-level Job Exceptions for more information.

Post-execution states

Some jobs may not be considered complete until some post-job processing is performed. For example, a job may need to exit from a post-execution job script, clean up job files, or transfer job output after the job completes.

The DONE or EXIT job states do not indicate whether post-processing is complete, so jobs that depend on post-processing may start prematurely. Use the post_done and post_err keywords on the bsub -w command to specify job dependency conditions for job post-processing. The corresponding job states POST_DONE and POST_ERR indicate the state of the post-processing.

After the job completes, you cannot perform any job control on the post-processing. Post-processing exit codes are not reported to LSF.

See Chapter 38, "Pre-Execution and Post-Execution Commands" for more information.

View Job Information

The bjobs command is used to display job information. By default, bjobs displays information for the user who invoked the command. For more information about bjobs, see the LSF Reference and the bjobs(1) man page.

View all jobs for all users
Run bjobs -u all to display all jobs for all users.

Job information is displayed in the following order:
Running jobs

Pending jobs in the order in which they are scheduled

Jobs in high-priority queues are listed before those in lower-priority queues

For example:
bjobs -u all 
JOBID   USER    STAT    QUEUE     FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME 
1004    user1   RUN     short     hostA       hostA       job0       Dec 16 09:23 
1235    user3   PEND    priority  hostM                   job1       Dec 11 13:55 
1234    user2   SSUSP   normal    hostD       hostM       job3       Dec 11 10:09 
1250    user1   PEND    short     hostA                   job4       Dec 11 13:59 
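
If you need to summarize output like the above in a script, a minimal Python sketch follows. It assumes the default column layout shown in the example; the function name count_by_state is hypothetical and not part of LSF.

```python
from collections import Counter

BJOBS_OUTPUT = """\
JOBID   USER    STAT    QUEUE     FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
1004    user1   RUN     short     hostA       hostA       job0       Dec 16 09:23
1235    user3   PEND    priority  hostM                   job1       Dec 11 13:55
1234    user2   SSUSP   normal    hostD       hostM       job3       Dec 11 10:09
1250    user1   PEND    short     hostA                   job4       Dec 11 13:59
"""


def count_by_state(bjobs_text: str) -> Counter:
    """Count jobs per STAT value in default bjobs output."""
    counts = Counter()
    for line in bjobs_text.splitlines()[1:]:   # skip the header row
        fields = line.split()
        if len(fields) >= 4:
            # Only the leading columns are read by position, because
            # EXEC_HOST can be empty and SUBMIT_TIME contains spaces.
            counts[fields[2]] += 1
    return counts
```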
View jobs for specific users
Run bjobs -u user_name to display jobs for a specific user:
bjobs -u user1 
JOBID   USER    STAT    QUEUE     FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME 
2225    user1   USUSP   normal    hostA                   job1       Nov 16 11:55 
2226    user1   PSUSP   normal    hostA                   job2       Nov 16 12:30 
2227    user1   PSUSP   normal    hostA                   job3       Nov 16 12:31 
View running jobs

Run bjobs -r to display running jobs.

View done jobs

Run bjobs -d to display recently completed jobs.

View pending job information

Run bjobs -p to display the reason why a job is pending.

Run busers -w all to see the maximum pending job threshold for all users.

View suspension reasons

Run bjobs -s to display the reason why a job was suspended.

View chunk job wait status and wait reason

Run bhist -l to display jobs in WAIT status. Jobs are shown as Waiting ...

The bjobs -l command does not display a WAIT reason in the list of pending jobs.

View post-execution states

Run bhist to display the POST_DONE and POST_ERR states.

The resource usage of post-processing is not included in the job resource usage.

View exception status for jobs (bjobs)
Run bjobs to display job exceptions. bjobs -l shows exception information for unfinished jobs, and bjobs -x -l shows finished as well as unfinished jobs.

For example, the following bjobs command shows that job 2 is running longer than the configured JOB_OVERRUN threshold, and is consuming no CPU time. bjobs displays the job idle factor, and both job overrun and job idle exceptions. Job 1 finished before the configured JOB_UNDERRUN threshold, so bjobs shows exception status of underrun:
bjobs -x -l -a 
Job <2>, User , Project , Status , Queue , Command 
                      
Wed Aug 13 14:23:35: Submitted from host , CWD <$HOME>, Output File 
                     , Specified Hosts ; 
Wed Aug 13 14:23:43: Started on , Execution Home , Execution  
                     CWD ; 
Resource usage collected. 
                     IDLE_FACTOR(cputime/runtime):   0.00 
                     MEM: 3 Mbytes;  SWAP: 4 Mbytes;  NTHREAD: 3 
                     PGID: 5027;  PIDs: 5027 5028 5029  
 
 SCHEDULING PARAMETERS: 
           r15s   r1m  r15m   ut      pg    io   ls    it    tmp    swp    mem 
 loadSched   -     -     -     -       -     -    -     -     -      -      -   
 loadStop    -     -     -     -       -     -    -     -     -      -      -   
 
                cpuspeed    bandwidth 
 loadSched          -            - 
 loadStop           -            - 
 
 EXCEPTION STATUS:  overrun  idle 
------------------------------------------------------------------------------ 
 
Job <1>, User , Project , Status , Queue , Command 
                      
Wed Aug 13 14:18:00: Submitted from host , CWD <$HOME>, 
                     Output File , Specified Hosts < 
                     hostB>; 
Wed Aug 13 14:18:10: Started on , Execution Home , Execution  
                     CWD ; 
Wed Aug 13 14:18:50: Done successfully. The CPU time used is 0.2 seconds. 
 
 SCHEDULING PARAMETERS: 
           r15s   r1m  r15m   ut      pg    io   ls    it    tmp    swp    mem 
 loadSched   -     -     -     -       -     -    -     -     -      -      -   
 loadStop    -     -     -     -       -     -    -     -     -      -      -   
 
                cpuspeed    bandwidth 
 loadSched          -            - 
 loadStop           -            - 
 
 EXCEPTION STATUS:  underrun 
 
Use bacct -l -x to trace the history of job exceptions.
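
To collect exception status from long-format output in a script, the following Python sketch may help. It assumes each job block starts with a line of the form Job <ID> and contains at most one EXCEPTION STATUS line, as in the example above; the sample text is abbreviated from that example, and exceptions_by_job is a hypothetical helper name.

```python
import re

BJOBS_LONG_OUTPUT = """\
Job <2>, User , Project , Status , Queue , Command
 EXCEPTION STATUS:  overrun  idle
------------------------------------------------------------------------------
Job <1>, User , Project , Status , Queue , Command
 EXCEPTION STATUS:  underrun
"""


def exceptions_by_job(bjobs_long_text: str) -> dict:
    """Map each job ID to the list of exception names reported for it."""
    result = {}
    current = None
    for line in bjobs_long_text.splitlines():
        header = re.match(r"\s*Job <(\d+)>", line)
        if header:
            current = int(header.group(1))      # start of a new job block
            result[current] = []
        elif current is not None and line.strip().startswith("EXCEPTION STATUS:"):
            result[current] = line.split(":", 1)[1].split()
    return result
```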
Changing Job Order Within Queues

By default, LSF dispatches jobs in a queue in the order of arrival (that is, first-come, first-served), subject to availability of suitable server hosts.

Use the btop and bbot commands to change the position of pending jobs, or of pending job array elements, to affect the order in which jobs are considered for dispatch. Users can only change the relative position of their own jobs, and LSF administrators can change the position of any user's jobs.

bbot

Moves jobs relative to your last job in the queue.

If invoked by a regular user, bbot moves the selected job after the last job with the same priority submitted by the user to the queue.

If invoked by the LSF administrator, bbot moves the selected job after the last job with the same priority submitted to the queue.

btop

Moves jobs relative to your first job in the queue.

If invoked by a regular user, btop moves the selected job before the first job with the same priority submitted by the user to the queue.

If invoked by the LSF administrator, btop moves the selected job before the first job with the same priority submitted to the queue.

Moving a job to the top of the queue

In the following example, job 5311 is moved to the top of the queue. Since job 5308 is already running, job 5311 is placed in the queue after job 5308.

Note that user1's job is still in the same position on the queue. user2 cannot use btop to get extra jobs at the top of the queue; when one of his jobs moves up the queue, the rest of his jobs move down.
bjobs -u all 
JOBID USER  STAT  QUEUE    FROM_HOST  EXEC_HOST  JOB_NAME   SUBMIT_TIME 
5308  user2 RUN   normal   hostA      hostD      /s500     Oct 23 10:16 
5309  user2 PEND  night    hostA                 /s200     Oct 23 11:04 
5310  user1 PEND  night    hostB                 /myjob    Oct 23 13:45 
5311  user2 PEND  night    hostA                 /s700     Oct 23 18:17 
 
btop 5311 
Job <5311> has been moved to position 1 from top. 
 
bjobs -u all 
JOBID USER  STAT  QUEUE    FROM_HOST  EXEC_HOST  JOB_NAME   SUBMIT_TIME 
5308  user2 RUN   normal   hostA      hostD      /s500     Oct 23 10:16 
5311  user2 PEND  night    hostA                 /s700     Oct 23 18:17 
5310  user1 PEND  night    hostB                 /myjob    Oct 23 13:45 
5309  user2 PEND  night    hostA                 /s200     Oct 23 11:04 
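
The effect of btop for a regular user can be modeled in a few lines of Python. This is a sketch of one reading of the rule above, not LSF's implementation: it assumes all pending jobs in the queue have the same priority, and the function name btop_as_user is hypothetical.

```python
def btop_as_user(pending, job_id):
    """Return the pending order after the job's owner runs btop on job_id.

    pending is a list of (job_id, user) tuples in dispatch order for one
    queue; all jobs are assumed to have the same priority.
    """
    owner = dict(pending)[job_id]
    # Slots currently held by the owner's jobs; other users keep theirs.
    slots = [i for i, (_, user) in enumerate(pending) if user == owner]
    mine = [pending[i] for i in slots]
    # The selected job goes first; the owner's other jobs keep their order.
    reordered = [(job_id, owner)] + [job for job in mine if job[0] != job_id]
    result = list(pending)
    for slot, job in zip(slots, reordered):
        result[slot] = job
    return result
```

Applied to the pending jobs in the example, btop_as_user([(5309, "user2"), (5310, "user1"), (5311, "user2")], 5311) puts 5311 first, leaves user1's job 5310 in place, and moves 5309 to the last position.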
Switch Jobs from One Queue to Another

You can use the command bswitch to change jobs from one queue to another. This is useful if you submit a job to the wrong queue, or if the job is suspended because of queue thresholds or run windows and you would like to resume the job.

Switch a single job to a different queue
Run bswitch to move pending and running jobs from queue to queue.

In the following example, job 5309 is switched to the priority queue:
bswitch priority 5309
Job <5309> is switched to queue <priority>
bjobs -u all
JOBID    USER   STAT   QUEUE    FROM_HOST  EXEC_HOST   JOB_NAME   SUBMIT_TIME
5308     user2   RUN   normal   hostA      hostD       /job500    Oct 23 10:16
5309     user2   RUN   priority hostA      hostB       /job200    Oct 23 11:04
5311     user2   PEND  night    hostA                  /job700    Oct 23 18:17
5310     user1   PEND  night    hostB                  /myjob     Oct 23 13:45 
Switch all jobs to a different queue
Run bswitch -q from_queue to_queue 0 to switch all the jobs in a queue to another queue.

The -q option operates on all jobs in a queue, and the job ID 0 specifies all jobs in that queue.

The example below selects all jobs from the night queue and switches them to the idle queue:
bswitch -q night idle 0
Job <5308> is switched to queue <idle>
Job <5310> is switched to queue <idle>
Forcing Job Execution

A pending job can be forced to run with the brun command. This operation can only be performed by an LSF administrator.

You can force a job to run on a particular host, force it to run until completion, or apply other restrictions. For more information, see the brun command.

When a job is forced to run, any other constraints associated with the job such as resource requirements or dependency conditions are ignored.

In this situation you may see some job slot limits, such as the maximum number of jobs that can run on a host, being violated. A job that is forced to run cannot be preempted.

Force a pending job to run
Run brun -m hostname job_ID to force a pending job to run.

You must specify the host on which the job will run.

For example, the following command will force the sequential job 104 to run on hostA:
brun -m hostA 104 
Suspending and Resuming Jobs

A job can be suspended by its owner or the LSF administrator. These jobs are considered user-suspended and are displayed by bjobs as USUSP.

If a user suspends a high priority job from a non-preemptive queue, the load may become low enough for LSF to start a lower priority job in its place. The load created by the low priority job can prevent the high priority job from resuming. This can be avoided by configuring preemptive queues.

Suspend a job
Run bstop job_ID.

Your job goes into USUSP state if the job is already started, or into PSUSP state if it is pending.
bstop 3421
Job <3421> is being stopped 
 
The above example suspends job 3421.
UNIX

bstop sends the following signals to the job:

SIGTSTP for parallel or interactive jobs: SIGTSTP is caught by the master process and passed to all the slave processes running on other hosts.

SIGSTOP for sequential jobs: SIGSTOP cannot be caught by user programs. The SIGSTOP signal can be configured with the LSB_SIGSTOP parameter in lsf.conf.

Windows

bstop causes the job to be suspended.

Resume a job
Run bresume job_ID:
bresume 3421
Job <3421> is being resumed 
 
The above example resumes job 3421.

 
Resuming a user-suspended job does not put your job into RUN state immediately. If your job was running before the suspension, bresume first puts your job into SSUSP state and then waits for sbatchd to schedule it according to the load conditions.
Killing Jobs

The bkill command cancels pending batch jobs and sends signals to running jobs. By default, on UNIX, bkill sends the SIGKILL signal to running jobs.

Before SIGKILL is sent, SIGINT and SIGTERM are sent to give the job a chance to catch the signals and clean up. The signals are forwarded from mbatchd to sbatchd. sbatchd waits for the job to exit before reporting the status. Because of these delays, for a short period of time after the bkill command has been issued, bjobs may still report that the job is running.

On Windows, job control messages replace the SIGINT and SIGTERM signals, and termination is implemented by the TerminateProcess() system call.

Kill a job
Run bkill job_ID. For example, the following command kills job 3421:
bkill 3421
Job <3421> is being terminated 
Kill multiple jobs
Run bkill 0 to kill all pending jobs in the cluster or use bkill 0 with the -g, -J, -m, -q, or -u options to kill all jobs that satisfy these options.

The following command kills all jobs dispatched to the hostA host:
bkill -m hostA 0 
Job <267> is being terminated 
Job <268> is being terminated 
Job <271> is being terminated 
 
The following command kills all jobs in the groupA job group:

bkill -g groupA 0 
Job <2083> is being terminated 
Job <2085> is being terminated 
Kill a large number of jobs rapidly

Killing multiple jobs with bkill 0 and other commands is usually sufficient for moderate numbers of jobs. However, killing a large number of jobs (more than approximately 1000) can take a long time to finish.

Run bkill -b to kill a large number of jobs faster than with normal means. However, jobs killed in this manner are not logged to lsb.acct.

Local pending jobs are killed immediately and cleaned up as soon as possible, ignoring the time interval specified by CLEAN_PERIOD in lsb.params. Other jobs are killed as soon as possible but cleaned up normally (after the CLEAN_PERIOD time interval).

If the -b option is used with bkill 0, it kills all applicable jobs and silently skips the jobs that cannot be killed.

The -b option is ignored if used with -r or -s.

Force removal of a job from LSF

Run bkill -r to force the removal of the job from LSF. Use this option when a job cannot be killed in the operating system.

The bkill -r command removes a job from the LSF system without waiting for the job to terminate in the operating system. This sends the same series of signals as bkill without -r, except that the job is removed from the system immediately, the job is marked as EXIT, and job resources that LSF monitors are released as soon as LSF receives the first signal.

Sending a Signal to a Job

LSF uses signals to control jobs, to enforce scheduling policies, or in response to user requests. The principal signals LSF uses are SIGSTOP to suspend a job, SIGCONT to resume a job, and SIGKILL to terminate a job.

Occasionally, you may want to override the default actions. For example, instead of suspending a job, you might want to kill or checkpoint it. You can override the default job control actions by defining the JOB_CONTROLS parameter in your queue configuration. Each queue can have its separate job control actions.

You can also send a signal directly to a job. You cannot send arbitrary signals to a pending job; most signals are only valid for running jobs. However, LSF does allow you to kill, suspend and resume pending jobs.

You must be the owner of a job or an LSF administrator to send signals to a job.

You use the bkill -s command to send a signal to a job. If you issue bkill without the -s option, a SIGKILL signal is sent to the specified jobs to kill them. Twenty seconds before SIGKILL is sent, SIGTERM and SIGINT are sent to give the job a chance to catch the signals and clean up.

On Windows, job control messages replace the SIGINT and SIGTERM signals, but only customized applications are able to process them. Termination is implemented by the TerminateProcess() system call.

Signals on different platforms

LSF translates signal numbers across different platforms because different host types may have different signal numbering. The real meaning of a specific signal is interpreted by the machine from which the bkill command is issued.

For example, if you send signal 18 from a SunOS 4.x host, it means SIGTSTP. If the job is running on HP-UX and SIGTSTP is defined as signal number 25, LSF sends signal 25 to the job.
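
The translation can be pictured as a lookup by signal name. The Python sketch below is illustrative only: the table holds just the two values from the example above, and the names SIGNAL_NUMBERS and translate_signal are hypothetical.

```python
# Signal numbers per host type. Only the two entries below come from the
# example in the text; a real table would cover every signal and platform.
SIGNAL_NUMBERS = {
    "SunOS 4.x": {"SIGTSTP": 18},
    "HP-UX": {"SIGTSTP": 25},
}


def translate_signal(number, from_host_type, to_host_type):
    """Translate a signal number from the issuing host type to the execution host type."""
    names = {num: name for name, num in SIGNAL_NUMBERS[from_host_type].items()}
    name = names[number]                        # meaning on the host that ran bkill
    return SIGNAL_NUMBERS[to_host_type][name]   # number on the execution host
```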

Send a signal to a job

On most versions of UNIX, signal names and numbers are listed in the kill(1) or signal(2) man pages. On Windows, only customized applications are able to process job control messages specified with the -s option.
Run bkill -s signal job_id, where signal is either the signal name or the signal number:
bkill -s TSTP 3421
Job <3421> is being signaled 
 
The above example sends the TSTP signal to job 3421.
Using Job Groups

A collection of jobs can be organized into job groups for easy management. A job group is a container for jobs in much the same way that a directory in a file system is a container for files. For example, a payroll application may have one group of jobs that calculates weekly payments, another job group for calculating monthly salaries, and a third job group that handles the salaries of part-time or contract employees. Users can submit, view, and control jobs according to their groups rather than looking at individual jobs.

How job groups are created

Job groups can be created explicitly or implicitly:

A job group is created explicitly with the bgadd command.

A job group is created implicitly by the bsub -g or bmod -g command when the specified group does not exist. Job groups are also created implicitly when a default job group is configured (DEFAULT_JOBGROUP in lsb.params or LSB_DEFAULT_JOBGROUP environment variable).

Job groups created when jobs are attached to an SLA service class at submission are implicit job groups (bsub -sla service_class_name -g job_group_name). Job groups attached to an SLA service class with bgadd are explicit job groups (bgadd -sla service_class_name job_group_name).

The GRP_ADD event in lsb.events indicates how the job group was created:

0x01 - job group was created explicitly

0x02 - job group was created implicitly

For example:
"GRP_ADD" "7.02" 1193032735 1285 1193032735 0 "/Z" "" "user1" "" "" 2 0 "" -1 1 
This record means that job group /Z is an explicitly created job group.

Child groups can be created explicitly or implicitly under any job group.

Only an implicitly created job group which has no job group limit (bgadd -L) and is not attached to any SLA can be automatically deleted once it becomes empty. An empty job group is a job group that has no jobs associated with it (including finished jobs). For an empty job group, NJOBS displayed by bjgroup is 0.

Job group hierarchy

Jobs in job groups are organized into a hierarchical tree similar to the directory structure of a file system. Like a file system, the tree contains groups (which are like directories) and jobs (which are like files). Each group can contain other groups or individual jobs. Job groups are created independently of jobs, and can have dependency conditions which control when jobs within the group are considered for scheduling.

Job group path

The job group path is the name and location of a job group within the job group hierarchy. Multiple levels of job groups can be defined to form a hierarchical tree. A job group can contain jobs and sub-groups.

Root job group

LSF maintains a single tree under which all jobs in the system are organized. The top-most level of the tree is represented by a top-level "root" job group, named "/". The root group is owned by the primary LSF Administrator and cannot be removed. Users and administrators create new groups under the root group. By default, if you do not specify a job group path name when submitting a job, the job is created under the top-level "root" job group, named "/".

The root job group is not displayed by job group query commands, and you cannot specify the root job group in commands.

Job group owner

Each group is owned by the user who created it. The login name of the user who creates the job group is the job group owner. Users can add job groups into groups that are owned by other users, and they can submit jobs to groups owned by other users. Child job groups are owned by the creator of the job group and the creators of any parent groups.

Job control under job groups

Job owners can control their own jobs attached to job groups as usual. Job group owners can also control any job under the groups they own and below.

For example:

Job group /A is created by user1

Job group /A/B is created by user2

Job group /A/B/C is created by user3

All users can submit jobs to any job group, and control the jobs they own in all job groups. For jobs submitted by other users:

user1 can control jobs submitted by other users in all 3 job groups: /A, /A/B, and /A/B/C

user2 can control jobs submitted by other users only in 2 job groups: /A/B and /A/B/C

user3 can control jobs submitted by other users only in job group /A/B/C

The LSF administrator can control jobs in any job group.
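
The ownership rules in this example can be expressed as a short check. The Python sketch below is a model of the rules as stated here, not LSF code; can_control and its parameters are hypothetical names.

```python
def can_control(user, job_owner, job_group, group_owners, admins=()):
    """Return True if user may control a job attached to job_group.

    group_owners maps each job group path to the user who created it.
    """
    if user == job_owner or user in admins:
        return True
    # Walk from the job's group up toward, but not including, the root "/".
    path = job_group
    while path and path != "/":
        if group_owners.get(path) == user:
            return True
        path = path.rsplit("/", 1)[0]
    return False
```

With group_owners set to {"/A": "user1", "/A/B": "user2", "/A/B/C": "user3"}, user2 can control another user's job in /A/B/C but not one in /A.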

Default job group

You can specify a default job group for jobs submitted without explicitly specifying a job group. LSF associates the job with the job group specified with DEFAULT_JOBGROUP in lsb.params. The LSB_DEFAULT_JOBGROUP environment variable overrides the setting of DEFAULT_JOBGROUP. The bsub -g job_group_name option overrides both LSB_DEFAULT_JOBGROUP and DEFAULT_JOBGROUP.

Default job group specification supports macro substitution for project name (%p) and user name (%u). When you specify bsub -P project_name, the value of %p is the specified project name. If you do not specify a project name at job submission, %p is the project name defined by setting the environment variable LSB_DEFAULTPROJECT, or the project name specified by DEFAULT_PROJECT in lsb.params. The default project name is default.

For example, a default job group name specified by DEFAULT_JOBGROUP=/canada/%p/%u is expanded to the value for the LSF project name and the user name of the job submission user (for example, /canada/projects/user1).
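
The substitution can be illustrated with a minimal Python sketch. It assumes that bsub -P takes precedence over LSB_DEFAULTPROJECT, which in turn takes precedence over DEFAULT_PROJECT; the function and parameter names are hypothetical.

```python
def expand_default_jobgroup(template, user, project=None,
                            env_project=None, params_project=None):
    """Expand %p and %u in a DEFAULT_JOBGROUP-style template.

    project stands for bsub -P, env_project for LSB_DEFAULTPROJECT, and
    params_project for DEFAULT_PROJECT in lsb.params.
    """
    # Assumed precedence: command line, then environment, then lsb.params.
    effective = project or env_project or params_project or "default"
    return template.replace("%p", effective).replace("%u", user)
```

For example, expand_default_jobgroup("/canada/%p/%u", "user1", project="projects") gives /canada/projects/user1.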

Job group names must follow this format:

Job group names must start with a slash character (/). For example, DEFAULT_JOBGROUP=/A/B/C is correct, but DEFAULT_JOBGROUP=A/B/C is not correct.

Job group names cannot end with a slash character (/). For example, DEFAULT_JOBGROUP=/A/ is not correct.

Job group names cannot contain more than one slash character (/) in a row. For example, job group names like DEFAULT_JOBGROUP=/A//B or DEFAULT_JOBGROUP=A////B are not correct.

Job group names cannot contain spaces. For example, DEFAULT_JOBGROUP=/A/B C/D is not correct.

Project names and user names used for macro substitution with %p and %u cannot start or end with slash character (/).

Project names and user names used for macro substitution with %p and %u cannot contain spaces or more than one slash character (/) in a row.

Project names or user names containing slash character (/) will create separate job groups. For example, if the project name is canada/projects, DEFAULT_JOBGROUP=/%p results in a job group hierarchy /canada/projects.
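
The format rules for job group names can be checked before a value is placed in a configuration file. The Python sketch below covers only the four name rules listed above; job_group_name_errors is a hypothetical helper, not an LSF utility.

```python
def job_group_name_errors(name):
    """Return a list of rule violations for a job group name (empty if valid)."""
    errors = []
    if not name.startswith("/"):
        errors.append("must start with a slash")
    if name.endswith("/"):
        errors.append("cannot end with a slash")
    if "//" in name:
        errors.append("cannot contain more than one slash in a row")
    if " " in name:
        errors.append("cannot contain spaces")
    return errors
```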

Job group limits

Job group limits specified with bgadd -L apply to the job group hierarchy. The job group limit is an integer greater than or equal to zero (0), specifying the maximum number of running and suspended jobs under the job group (including child groups). If limit is zero (0), no jobs under the job group can run.

By default, a job group has no limit. Limits persist across mbatchd restart and reconfiguration.

You cannot specify a limit for the root job group. The root job group has no job limit. Job groups added with no limits specified inherit any limits of existing parent job groups. The -L option only limits the lowest level job group created.

The maximum number of running and suspended jobs (including USUSP and SSUSP) in a job group cannot exceed the limit defined on the job group and its parent job group.

The job group limit is based on the number of running and suspended jobs in the job group. If you specify a job group limit as 2, at most 2 jobs can run under the group at any time, regardless of how many jobs or job slots are used. If the number of currently available job slots is zero (0), LSF cannot dispatch a job to the job group even if the job group limit is not exceeded.

The job group limit counts jobs, not the slots used by each job. A parallel job that requests 2 CPUs (bsub -n 2) counts as one job against the limit.

A job array may also be under a job group, so job arrays also support job group limits.

Job group limits are not supported at job submission for job groups created automatically with bsub -g. Use bgadd -L before job submission.

Jobs forwarded to the execution cluster in a MultiCluster environment are not counted towards the job group limit.

Examples
bgadd -L 6 /canada/projects/test 
If /canada is an existing job group, and /canada/projects and /canada/projects/test are new groups, only the job group /canada/projects/test is limited to 6 running and suspended jobs. Job group /canada/projects will have whatever limit is specified for its parent job group /canada. The limit of /canada does not change.

The limits on child job groups cannot exceed the parent job group limit. For example, if /canada/projects has a limit of 5:
bgadd -L 6 /canada/projects/test 
is rejected because /canada/projects/test attempts to increase the limit of its parent /canada/projects from 5 to 6.

Example job group hierarchy with limits

In this configuration:

Every node is a job group, including the root (/) job group

The root (/) job group cannot have any limit definition

By default, child groups have the same limit definition as their direct parent group, so /asia, /asia/projects, and /asia/projects/test all have no limit

The number of running and suspended jobs in a job group (including all of its child groups) cannot exceed the defined limit

If there are 7 running or suspended jobs in job group /canada/projects/test1, even though the job limit of group /canada/qa/auto is 6, /canada/qa/auto can only have a maximum of 5 running and suspended jobs (12-7=5)

When a job is submitted to a job group, LSF checks the limits for the entire job group. For example, for a job submitted to job group /canada/qa/auto, LSF checks the limits on groups /canada/qa/auto, /canada/qa and /canada. If any one limit in the branch of the hierarchy is exceeded, the job remains pending

The zero (0) job limit for job group /canada/qa/manual means no job in the job group can enter running status
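
The branch check described in this list can be modeled as follows. This Python sketch is an illustration, not LSF's scheduler logic: can_dispatch is a hypothetical name, and in the usage note the limit of 12 on /canada is an assumption inferred from the arithmetic 12-7=5 above.

```python
def can_dispatch(job_group, limits, counts):
    """Return True if one more job may start in job_group.

    limits maps a group path to its job limit (absent means no limit).
    counts maps a group path to the running and suspended jobs attached
    directly to that group; jobs in child groups are added automatically.
    """
    def total(group):
        prefix = group + "/"
        return sum(n for path, n in counts.items()
                   if path == group or path.startswith(prefix))

    path = job_group
    while path:                        # stops before the root group "/"
        limit = limits.get(path)
        if limit is not None and total(path) >= limit:
            return False               # a limit in the branch is reached
        path = path.rsplit("/", 1)[0]
    return True
```

With limits {"/canada": 12, "/canada/qa/auto": 6, "/canada/qa/manual": 0} and 7 jobs counted in /canada/projects/test1, the function allows new jobs in /canada/qa/auto only while that group holds fewer than 5, and never allows a job in /canada/qa/manual.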

Create a job group

Use the bgadd command to create a new job group.

You must provide the full group path name for the new job group. The last component of the path is the name of the new group to be created:

bgadd /risk_group

The above example creates a job group named risk_group under the root group /.

bgadd /risk_group/portfolio1

The above example creates a job group named portfolio1 under job group /risk_group.

bgadd /risk_group/portfolio1/current

The above example creates a job group named current under job group /risk_group/portfolio1.

If the group hierarchy /risk_group/portfolio1/current does not exist, LSF checks its parent recursively, and if no groups in the hierarchy exist, all three job groups are created with the specified hierarchy.
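
The recursive parent check can be pictured with a short Python sketch that lists which groups would need to be created for a given path. It is a model only; groups_to_create is a hypothetical name, and existing stands for the set of job groups already defined.

```python
def groups_to_create(path, existing):
    """List the job groups that must be created so that path exists.

    existing is the set of job group paths already defined. The result is
    ordered from the top of the hierarchy down.
    """
    parts = path.strip("/").split("/")
    missing = []
    for depth in range(1, len(parts) + 1):
        candidate = "/" + "/".join(parts[:depth])
        if candidate not in existing:
            missing.append(candidate)
    return missing
```

When no groups exist, the path /risk_group/portfolio1/current yields all three groups; when /risk_group already exists, only the two lower groups are listed.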

Add a job group limit (bgadd)
Run bgadd -L limit /job_group_name to specify a job limit for a job group.

Where limit is an integer greater than or equal to zero (0), specifying the maximum number of running and suspended jobs under the job group (including child groups). If limit is zero (0), no jobs under the job group can run.

For example:
bgadd -L 6 /canada/projects/test 
 
If /canada is an existing job group, and /canada/projects and /canada/projects/test are new groups, only the job group /canada/projects/test is limited to 6 running and suspended jobs. Job group /canada/projects will have whatever limit is specified for its parent job group /canada. The limit of /canada does not change.
Submit jobs under a job group
Use the -g option of bsub to submit a job into a job group.

The job group does not have to exist before submitting the job.
bsub -g /risk_group/portfolio1/current myjob 
Job <105> is submitted to default queue. 
 
Submits myjob to the job group /risk_group/portfolio1/current. 

 
If group /risk_group/portfolio1/current exists, job 105 is attached to the job group.

 
If group /risk_group/portfolio1/current does not exist, LSF checks its parent recursively, and if no groups in the hierarchy exist, all three job groups are created with the specified hierarchy and the job is attached to the group.
-g and -sla options

tip:

Use -sla with -g to attach all jobs in a job group to a service class and have them scheduled as SLA jobs. Multiple job groups can be created under the same SLA. You can submit additional jobs to the job group without specifying the service class name again.

MultiCluster

In MultiCluster job forwarding mode, job groups only apply on the submission cluster, not on the execution cluster. LSF treats the execution cluster as an execution engine, and only enforces job group policies at the submission cluster.

Jobs forwarded to the execution cluster in a MultiCluster environment are not counted towards job group limits.

View jobs in job groups

View job group information, and jobs running in specific job groups.

View information about job groups (bjgroup)
Use the bjgroup command to see information about jobs in job groups.
bjgroup 
GROUP_NAME         NJOBS   PEND    RUN    SSUSP  USUSP  FINISH  SLA   JLIMIT  OWNER 
/A                 0       0       0      0      0      0        ()    0/10  user1 
/X                 0       0       0      0      0      0        ()     0/-  user2 
/A/B               0       0       0      0      0      0        ()     0/5  user1 
/X/Y               0       0       0      0      0      0        ()     0/5  user2 
Use bjgroup -s to sort job groups by group hierarchy.

For example, for job groups named /A, /A/B, /X and /X/Y, bjgroup -s displays:
bjgroup -s 
GROUP_NAME         NJOBS   PEND    RUN    SSUSP  USUSP  FINISH  SLA   JLIMIT  OWNER 
/A                 0       0       0      0      0      0       ()       0/10  user1 
/A/B               0       0       0      0      0      0       ()       0/5  user1 
/X                 0       0       0      0      0      0       ()       0/-  user2 
/X/Y               0       0       0      0      0      0       ()       0/5  user2 
Specify a job group name to show the hierarchy of a single job group:
bjgroup -s /X 
GROUP_NAME   NJOBS  PEND   RUN   SSUSP  USUSP  FINISH       SLA   JLIMIT  OWNER 
/X              25     0    25       0      0       0   puccini  25/100   user1 
/X/Y            20     0    20       0      0       0   puccini   20/30   user1 
/X/Z             5     0     5       0      0       0   puccini    5/10   user2 
Specify a job group name with a trailing slash character (/) to show only the root job group:
bjgroup -s /X/ 
GROUP_NAME   NJOBS  PEND   RUN   SSUSP  USUSP  FINISH      SLA   JLIMIT  OWNER 
/X               25    0    25       0      0       0   puccini  25/100  user1 
Use bjgroup -N to display job group information by job slots instead of number of jobs. NSLOTS, PEND, RUN, SSUSP, USUSP, RSV are all counted in slots rather than number of jobs:
bjgroup -N 
GROUP_NAME NSLOTS PEND   RUN   SSUSP  USUSP   RSV      SLA     OWNER 
/X             25    0    25       0      0     0  puccini     user1 
/A/B           20    0    20       0      0     0   wagner     batch 
 
-N by itself shows job slot info for all job groups, and can combine with -s to sort the job groups by hierarchy:

bjgroup -N -s 
GROUP_NAME NSLOTS PEND   RUN   SSUSP   USUSP  RSV      SLA     OWNER 
/A              0    0     0       0       0    0   wagner      batch 
/A/B            0    0     0       0       0    0   wagner      user1 
/X             25    0    25       0       0    0   puccini     user1 
/X/Y           20    0    20       0       0    0   puccini     batch 
/X/Z            5     0    5       0       0    0   puccini     batch 
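If you want to post-process this kind of listing in a script, the whitespace-separated columns can be split into records. The Python sketch below works on the sample output shown for bjgroup -s /X and assumes that no field contains spaces; the function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Sample text copied from the bjgroup -s /X example above.
SAMPLE = """\
GROUP_NAME   NJOBS  PEND   RUN   SSUSP  USUSP  FINISH       SLA   JLIMIT  OWNER
/X              25     0    25       0      0       0   puccini  25/100   user1
/X/Y            20     0    20       0      0       0   puccini   20/30   user1
/X/Z             5     0     5       0      0       0   puccini    5/10   user2
"""


def parse_bjgroup(output: str) -> List[Dict[str, str]]:
    """Turn bjgroup-style text into one dict per job group, keyed by column name."""
    lines = [ln for ln in output.splitlines() if ln.strip()]
    header = lines[0].split()
    return [dict(zip(header, ln.split())) for ln in lines[1:]]


def parse_jlimit(text: str) -> Tuple[int, Optional[int]]:
    """Split a JLIMIT value such as '25/100' or '0/-' into (used, limit)."""
    used, limit = text.split("/")
    return int(used), (None if limit == "-" else int(limit))


for row in parse_bjgroup(SAMPLE):
    used, limit = parse_jlimit(row["JLIMIT"])
    print(row["GROUP_NAME"], row["OWNER"], used, limit)
```

A limit of `None` corresponds to the "-" shown for groups without a job limit.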
View jobs for a specific job group (bjobs)
Run bjobs -g and specify a job group path to view jobs attached to the specified group.
bjobs -g /risk_group
JOBID   USER    STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
113     user1   PEND  normal     hostA                   myjob     Jun 17 16:15
111     user2   RUN   normal     hostA       hostA       myjob     Jun 14 15:13
110     user1   RUN   normal     hostB       hostA       myjob     Jun 12 05:03
104     user3   RUN   normal     hostA       hostC       myjob     Jun 11 13:18 
 
bjobs -l displays the full path to the group to which a job is attached:

bjobs -l -g /risk_group

Job <101>, User , Project , Job Group 
, Status , Queue , Command 
Tue Jun 17 16:21:49: Submitted from host , CWD 
;
... 
Control jobs in job groups

Suspend and resume jobs in job groups, move jobs to different job groups, terminate jobs in job groups, and delete job groups.

Suspend jobs (bstop)
Use the -g option of bstop and specify a job group path to suspend jobs in a job group:
bstop -g /risk_group 106
Job <106> is being stopped 
Use job ID 0 (zero) to suspend all jobs in a job group:
bstop -g /risk_group/consolidate 0
Job <107> is being stopped
Job <108> is being stopped
Job <109> is being stopped 
Resume suspended jobs (bresume)
Use the -g option of bresume and specify a job group path to resume suspended jobs in a job group:
bresume -g /risk_group 106
Job <106> is being resumed 
Use job ID 0 (zero) to resume all jobs in a job group:
bresume -g /risk_group 0
Job <109> is being resumed
Job <110> is being resumed
Job <112> is being resumed 
Move jobs to a different job group (bmod)
Use the -g option of bmod and specify a job group path to move a job or a job array from one job group to another.
bmod -g /risk_group/portfolio2/monthly 105 
 
This command moves job 105 to job group /risk_group/portfolio2/monthly.

 
Like bsub -g, if the job group does not exist, LSF creates it.

 
bmod -g cannot be combined with other bmod options. It can only operate on pending jobs. It cannot operate on running or finished jobs.

 
You can modify your own job groups and job groups that other users create under your job groups. The LSF administrator can modify job groups of all users.

 
You cannot move job array elements from one job group to another, only entire job arrays. If any job array elements in a job array are running, you cannot move the job array to another group. A job array can only belong to one job group at a time.

 
You cannot modify the job group of a job attached to a service class.

 
bhist -l shows job group modification information:

bhist -l 105

Job <105>, User , Project , Job Group , Command 
                     
Wed May 14 15:24:07: Submitted from host , to Queue , CWD
<$HOME/lsf51/5.1/sparc-sol7-64/bin>;
Wed May 14 15:24:10: Parameters of Job are changed:
                         Job group changes to: /risk_group/portfolio2/monthly;
Wed May 14 15:24:17: Dispatched to ;
Wed May 14 15:24:17: Starting (Pid 8602);
... 
Terminate jobs (bkill)
Use the -g option of bkill and specify a job group path to terminate jobs in a job group.
bkill -g /risk_group 106
Job <106> is being terminated 
Use job ID 0 (zero) to terminate all jobs in a job group:
bkill -g /risk_group 0
Job <1413> is being terminated
Job <1414> is being terminated
Job <1415> is being terminated
Job <1416> is being terminated 
 
bkill only kills jobs in the job group you specify. It does not kill jobs in lower level job groups in the path. For example, jobs are attached to job groups /risk_group and /risk_group/consolidate:

bsub -g /risk_group  myjob
Job <115> is submitted to default queue . 
bsub -g /risk_group/consolidate myjob2
Job <116> is submitted to default queue . 
 
The following bkill command only kills jobs in /risk_group, not the subgroup /risk_group/consolidate:

bkill -g /risk_group 0
Job <115> is being terminated 
 
To kill jobs in /risk_group/consolidate, specify the path to the consolidate job group explicitly:

bkill -g /risk_group/consolidate 0
Job <116> is being terminated 
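Because bkill -g matches only the group you name, terminating a whole subtree means issuing one command per group. The Python sketch below models the exact-match behavior and builds that list of commands; the helper names are hypothetical, and the list of known groups is assumed to come from somewhere else, such as a bjgroup listing.

```python
from typing import Dict, List


def jobs_matched(group: str, job_groups: Dict[int, str]) -> List[int]:
    """Job IDs targeted by `bkill -g group 0`: exact group match only."""
    return sorted(job for job, g in job_groups.items() if g == group)


def subtree_commands(group: str, known_groups: List[str]) -> List[str]:
    """One bkill command per group in the subtree rooted at `group`."""
    prefix = group.rstrip("/") + "/"
    targets = [g for g in known_groups if g == group or g.startswith(prefix)]
    return [f"bkill -g {g} 0" for g in sorted(targets)]


job_groups = {115: "/risk_group", 116: "/risk_group/consolidate"}
print(jobs_matched("/risk_group", job_groups))              # [115]
print(jobs_matched("/risk_group/consolidate", job_groups))  # [116]
print(subtree_commands("/risk_group", list(job_groups.values())))
```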
Delete a job group manually (bgdel)
Use the bgdel command to manually remove a job group. The job group cannot contain any jobs.
bgdel /risk_group
Job group /risk_group is deleted. 
 
This command deletes the job group /risk_group and all its subgroups.

 
Normal users can only delete the empty groups they own that are specified by the requested job_group_name. These groups can be explicit or implicit. 
Run bgdel 0 to delete all empty job groups you own. These groups can be explicit or implicit.

LSF administrators can use bgdel -u user_name 0 to delete all empty job groups created by specific users. These groups can be explicit or implicit.

Run bgdel -u all 0 to delete all the users' empty job groups and their sub groups. LSF administrators can delete empty job groups created by any user. These groups can be explicit or implicit.

Run bgdel -c job_group_name to delete all empty groups below the requested job_group_name including job_group_name itself.
Modify a job group limit (bgmod)
Run bgmod to change a job group limit.
bgmod [-L limit | -Ln] /job_group_name 
 
-L limit changes the limit of job_group_name to the specified value. If the job group has parent job groups, the new limit cannot exceed the limits of any higher level job groups. Similarly, if the job group has child job groups, the new value must be greater than any limits on the lower level job groups.

 
-Ln removes the existing job limit for the job group. If the job group has parent job groups, the modified job group automatically inherits any limits from its direct parent job group.

 
You must provide full group path name for the modified job group. The last component of the path is the name of the job group to be modified. 

 
Only root, LSF administrators, the job group creator, or the creator of the parent job groups can use bgmod to modify a job group limit.

 
The following command only modifies the limit of group /canada/projects/test1. It does not modify limits of /canada or /canada/projects.

bgmod -L 6 /canada/projects/test1 
 
To modify limits of /canada or /canada/projects, you must specify the exact group name:

bgmod -L 6 /canada 
 
or 

bgmod -L 6 /canada/projects 
Automatic job group cleanup

When an implicitly created job group becomes empty, it can be automatically deleted by LSF. Job groups that can be automatically deleted cannot:

Have limits specified including their child groups

Have explicitly created child job groups

Be attached to any SLA

Configure JOB_GROUP_CLEAN=Y in lsb.params to enable automatic job group deletion.

For example, for the following job groups:

When automatic job group deletion is enabled, LSF only deletes job groups /X/Y/Z/W and /X/Y/Z. Job group /X/Y is not deleted because it is an explicitly created job group. Job group /X is also not deleted because it has an explicitly created child job group /X/Y.

Automatic job group deletion does not delete job groups attached to SLA service classes. Use bgdel to manually delete job groups attached to SLAs.
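The cleanup conditions can be expressed as a recursive check. The Python sketch below is one interpretation of the rules above, not LSF's actual implementation: a group is treated as deletable when it is implicit, empty, has no limit, is not attached to an SLA, and every child group is itself deletable. The hierarchy is reconstructed from the explanation in the text, and the `Group` fields are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Group:
    explicit: bool = False       # created with bgadd rather than implicitly
    jobs: int = 0                # jobs attached directly to the group
    limit: Optional[int] = None  # job limit, if one was specified
    sla: Optional[str] = None    # service class the group is attached to


def children(path: str, groups: Dict[str, Group]) -> List[str]:
    """Direct child groups of `path`."""
    prefix = path + "/"
    return [g for g in groups if g.startswith(prefix) and "/" not in g[len(prefix):]]


def deletable(path: str, groups: Dict[str, Group]) -> bool:
    """True if automatic cleanup could remove `path` under the assumed rules."""
    g = groups[path]
    if g.explicit or g.jobs or g.limit is not None or g.sla is not None:
        return False
    return all(deletable(c, groups) for c in children(path, groups))


groups = {
    "/X": Group(),
    "/X/Y": Group(explicit=True),
    "/X/Y/Z": Group(),
    "/X/Y/Z/W": Group(),
}
print(sorted(p for p in groups if deletable(p, groups)))  # ['/X/Y/Z', '/X/Y/Z/W']
```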

Handling Job Exceptions

You can configure hosts and queues so that LSF detects exceptional conditions while jobs are running, and takes appropriate action automatically. You can customize what exceptions are detected and their corresponding actions. By default, LSF does not detect any exceptions.

Run bjobs -d -m host_name to see exited jobs for a particular host.

Job exceptions LSF can detect

If you configure job exception handling in your queues, LSF detects the following job exceptions:

Job underrun - job ends too soon (run time is less than expected). Underrun jobs are detected when a job exits abnormally.

Job overrun - job runs too long (run time is longer than expected). By default, LSF checks for overrun jobs every 1 minute. Use EADMIN_TRIGGER_DURATION in lsb.params to change how frequently LSF checks for job overrun.

Job estimated run time exceeded - the job's actual run time has exceeded the estimated run time.

Idle job - running job consumes less CPU time than expected (in terms of CPU time/runtime). By default, LSF checks for idle jobs every 1 minute. Use EADMIN_TRIGGER_DURATION in lsb.params to change how frequently LSF checks for idle jobs.
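To make the four conditions concrete, the following Python sketch classifies a single job against a set of thresholds. It is an illustration of the definitions above, not of how LSF evaluates them; the parameter names and the state strings are hypothetical, and a threshold left as `None` means that exception is not configured.

```python
from typing import List, Optional


def job_exceptions(
    state: str,                           # "RUN" while running, "EXIT" after an abnormal exit
    run_min: float,                       # elapsed run time in minutes
    cpu_min: float,                       # CPU time consumed in minutes
    overrun_min: Optional[float] = None,
    underrun_min: Optional[float] = None,
    idle_factor: Optional[float] = None,  # minimum expected CPU time / run time
    estimated_min: Optional[float] = None,
) -> List[str]:
    """Return the names of the exceptions this job would trigger."""
    found = []
    if underrun_min is not None and state == "EXIT" and run_min < underrun_min:
        found.append("underrun")
    if overrun_min is not None and run_min > overrun_min:
        found.append("overrun")
    if estimated_min is not None and run_min > estimated_min:
        found.append("estimated run time exceeded")
    if idle_factor is not None and state == "RUN" and run_min > 0:
        if cpu_min / run_min < idle_factor:
            found.append("idle")
    return found


# A job running for 2 hours that has used only 6 CPU minutes.
print(job_exceptions("RUN", 120, 6, overrun_min=60, idle_factor=0.1))  # ['overrun', 'idle']
# A job that exited abnormally after 1 minute.
print(job_exceptions("EXIT", 1, 0.5, underrun_min=5))                  # ['underrun']
```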

Host exceptions LSF can detect

If you configure host exception handling, LSF can detect jobs that exit repeatedly on a host. The host can still be available to accept jobs, but some other problem prevents the jobs from running. Typically jobs dispatched to such "black hole", or "job-eating" hosts exit abnormally. By default, LSF monitors the job exit rate for hosts, and closes the host if the rate exceeds a threshold you configure (EXIT_RATE in lsb.hosts).

If EXIT_RATE is not specified for the host, LSF invokes eadmin if the job exit rate for a host remains above the configured threshold for longer than 5 minutes. Use JOB_EXIT_RATE_DURATION in lsb.params to change how frequently LSF checks the job exit rate.

Use GLOBAL_EXIT_RATE in lsb.params to set a cluster-wide threshold in minutes for exited jobs. If EXIT_RATE is not specified for the host in lsb.hosts, GLOBAL_EXIT_RATE defines a default exit rate for all hosts in the cluster. Host-level EXIT_RATE overrides the GLOBAL_EXIT_RATE value.

Customize job exception actions with the eadmin script

When an exception is detected, LSF takes appropriate action by running the script LSF_SERVERDIR/eadmin on the master host.

You can customize eadmin to suit the requirements of your site. For example, eadmin could find out the owner of the problem jobs and use bstop -u to stop all jobs that belong to the user.

In some environments, a job running 1 hour would be an overrun job, while this may be a normal job in other environments. If your configuration considers jobs running longer than 1 hour to be overrun jobs, you may want to close the queue when LSF detects a job that has run longer than 1 hour and invokes eadmin.

Email job exception details

Set LSF to send you an email about job exceptions with details such as JOB_ID, RUN_TIME, IDLE_FACTOR (if the job has been idle), USER, QUEUE, EXEC_HOST, and JOB_NAME.

In lsb.params, set EXTEND_JOB_EXCEPTION_NOTIFY=Y.

Set the format option in the eadmin script (LSF_SERVERDIR/eadmin on the master host).

Uncomment the JOB_EXCEPTION_EMAIL_FORMAT line and add a value for the format:

JOB_EXCEPTION_EMAIL_FORMAT=fixed: The eadmin shell generates an exception email with a fixed length for the job exception information. For any given field, the characters truncate when the maximum is reached (between 10-19).

JOB_EXCEPTION_EMAIL_FORMAT=full: The eadmin shell generates an exception email without a fixed length for the job exception information.

Default eadmin actions

For host-level exceptions, LSF closes the host and sends email to the LSF administrator. The email contains the host name, job exit rate for the host, and other host information. The message eadmin: JOB EXIT THRESHOLD EXCEEDED is attached to the closed host event in lsb.events, and displayed by badmin hist and badmin hhist.

For job exceptions, LSF sends email to the LSF administrator. The email contains the job ID, exception type (overrun, underrun, idle job), and other job information.

An email is sent for all detected job exceptions according to the frequency configured by EADMIN_TRIGGER_DURATION in lsb.params. For example, if EADMIN_TRIGGER_DURATION is set to 5 minutes, and 1 overrun job and 2 idle jobs are detected, after 5 minutes, eadmin is invoked and only one email is sent. If another overrun job is detected in the next 5 minutes, another email is sent.

Handling job initialization failures

By default, LSF handles job exceptions for jobs that exit after they have started running. You can also configure LSF to handle jobs that exit during initialization because of an execution environment problem, or because of a user action or LSF policy.

LSF detects that the jobs are exiting before they actually start running, and takes appropriate action when the job exit rate exceeds the threshold for specific hosts (EXIT_RATE in lsb.hosts) or for all hosts (GLOBAL_EXIT_RATE in lsb.params).

Use EXIT_RATE_TYPE in lsb.params to include job initialization failures in the exit rate calculation. The following table summarizes the exit rate types you can configure:

Table 1: Exit rate types you can configure

Exit rate type	Includes
JOBEXIT	Local exited jobs; remote job initialization failures; parallel job initialization failures on hosts other than the first execution host; jobs exited by user action (e.g., bkill, bstop) or LSF policy (e.g., load threshold exceeded, job control action, advance reservation expired)
JOBEXIT_NONLSF (the default when EXIT_RATE_TYPE is not set)	Local exited jobs; remote job initialization failures; parallel job initialization failures on hosts other than the first execution host
JOBINIT	Local job initialization failures; parallel job initialization failures on the first execution host
HPCINIT	Job initialization failures for Platform LSF HPC jobs

Job exits excluded from exit rate calculation

By default, jobs that exit for non-host-related reasons (user actions and LSF policies) are not counted in the exit rate calculation. Only jobs that exit for what LSF considers host-related problems are used to calculate a host exit rate.

The following cases are not included in the exit rate calculations:

bkill, bkill -r

brequeue

RERUNNABLE jobs killed when a host is unavailable

Resource usage limit exceeded (for example, PROCESSLIMIT, CPULIMIT, etc.)

Queue-level job control action TERMINATE and TERMINATE_WHEN

Checkpointing a job with the kill option (bchkpnt -k)

Rerunnable job migration

Job killed when an advance reservation has expired

Remote lease job start fails

Any jobs with an exit code found in SUCCESS_EXIT_VALUES, where a particular exit value is deemed as successful.

Excluding LSF and user-related job exits

To explicitly exclude jobs exited because of user actions or LSF-related policies from the job exit calculation, set EXIT_RATE_TYPE = JOBEXIT_NONLSF in lsb.params. JOBEXIT_NONLSF tells LSF to include all job exits except those that are related to user action or LSF policy. This is the default value for EXIT_RATE_TYPE.

To include all job exit cases in the exit rate count, you must set EXIT_RATE_TYPE = JOBEXIT in lsb.params. JOBEXIT considers all job exits.

Jobs killed by a signal external to LSF are still counted towards the exit rate.

Jobs killed because of a job control SUSPEND action or RESUME action are still counted towards the exit rate. This is because LSF cannot distinguish between jobs killed by a SUSPEND action and jobs killed by external signals.

If both JOBEXIT and JOBEXIT_NONLSF are defined, JOBEXIT_NONLSF is used.

Local jobs

When EXIT_RATE_TYPE=JOBINIT, various job initialization failures are included in the exit rate calculation, including:

Host-related failures; for example, incorrect user account, user permissions, incorrect directories for checkpointable jobs, host name resolution failed, or other execution environment problems

Job-related failures; for example, pre-execution or setup problem, job file not created, etc.

Parallel jobs

By default, or when EXIT_RATE_TYPE=JOBEXIT_NONLSF, job initialization failures on the first execution host are not counted in the job exit rate calculation. Job initialization failures on hosts other than the first execution host are counted in the exit rate calculation.

When EXIT_RATE_TYPE=JOBINIT, job initialization failures on the first execution host are counted in the job exit rate calculation. Job initialization failures on hosts other than the first execution host are not counted in the exit rate calculation.

tip:

For parallel job exit exceptions to be counted for all hosts, specify EXIT_RATE_TYPE=HPCINIT or EXIT_RATE_TYPE=JOBEXIT_NONLSF JOBINIT.

Remote jobs

By default, or when EXIT_RATE_TYPE=JOBEXIT_NONLSF, job initialization failures are counted as exited jobs on the remote execution host and are included in the exit rate calculation for that host. To include only local job initialization failures on the execution cluster in the exit rate calculation, set EXIT_RATE_TYPE to include only JOBINIT or HPCINIT.
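The exit rate types described in this section can be summarized as a lookup from each type to the kinds of job exit it counts. The Python sketch below encodes Table 1 plus two rules stated in the text: JOBEXIT_NONLSF is the default, and it wins when both JOBEXIT and JOBEXIT_NONLSF are defined. The exit-kind labels and function names are hypothetical.

```python
from typing import Iterable, Set

# Kinds of exit counted by each EXIT_RATE_TYPE value, following Table 1.
INCLUDES = {
    "JOBEXIT": {"local_exit", "remote_init_failure",
                "parallel_init_failure_other_host", "user_or_policy_exit"},
    "JOBEXIT_NONLSF": {"local_exit", "remote_init_failure",
                       "parallel_init_failure_other_host"},
    "JOBINIT": {"local_init_failure", "parallel_init_failure_first_host"},
    "HPCINIT": {"hpc_init_failure"},
}


def effective_types(configured: Iterable[str]) -> Set[str]:
    """Apply the default and the JOBEXIT / JOBEXIT_NONLSF precedence rule."""
    types = set(configured) or {"JOBEXIT_NONLSF"}
    if {"JOBEXIT", "JOBEXIT_NONLSF"} <= types:
        types.discard("JOBEXIT")
    return types


def counted(kind: str, configured: Iterable[str] = ()) -> bool:
    """True if an exit of this kind contributes to the host exit rate."""
    return any(kind in INCLUDES[t] for t in effective_types(configured))


print(counted("user_or_policy_exit"))               # False with the default type
print(counted("user_or_policy_exit", ["JOBEXIT"]))  # True
print(counted("parallel_init_failure_first_host", ["JOBEXIT_NONLSF", "JOBINIT"]))  # True
```

The last call mirrors the tip about combining JOBEXIT_NONLSF with JOBINIT so that parallel job initialization failures are counted on all hosts.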

Scaling and tuning job exit rate by number of slots

On large, multiprocessor hosts, use ENABLE_EXIT_RATE_PER_SLOT=Y in lsb.params to scale the job exit rate so that the host is only closed when the job exit rate is high enough in proportion to the number of processors on the host. This avoids having a relatively low exit rate close a host inappropriately.

Use a float value for GLOBAL_EXIT_RATE in lsb.params to tune the exit rate on multislot hosts. The actual calculated exit rate value is never less than 1.

Example: exit rate of 5 on single processor and multiprocessor hosts

On a single-processor host, a job exit rate of 5 is much more severe than on a 20-processor host. If a stream of jobs to a single-processor host is consistently failing, it is reasonable to close the host or take some other action after 5 failures.

On the other hand, for the same stream of jobs on a 20-processor host, it is possible that 19 of the processors are busy doing other work that is running fine. To close this host after only 5 failures would be wrong because effectively less than 5% of the jobs on that host are actually failing.

Example: float value for GLOBAL_EXIT_RATE on multislot hosts

Using a float value for GLOBAL_EXIT_RATE allows the exit rate to be less than the number of slots on the host. For example, on a host with 4 slots, GLOBAL_EXIT_RATE=0.25 gives an exit rate of 1. The same value on an 8 slot machine would be 2 and so on. On a single-slot host, the value is never less than 1.
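The arithmetic in this example is consistent with multiplying the configured value by the number of slots and applying a floor of 1. The Python sketch below assumes exactly that; how LSF rounds products that are not whole numbers is not stated here, so the sketch leaves them unrounded. `effective_exit_rate` is a hypothetical name.

```python
def effective_exit_rate(global_exit_rate: float, slots: int) -> float:
    """Scale the configured rate by the slot count, never going below 1."""
    return max(1.0, global_exit_rate * slots)


print(effective_exit_rate(0.25, 4))  # 1.0
print(effective_exit_rate(0.25, 8))  # 2.0
print(effective_exit_rate(0.25, 1))  # 1.0 (floor applies on a single-slot host)
```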

For more information

See Handling Host-level Job Exceptions for information about configuring host-level job exceptions.

See Handling Job Exceptions in Queues for information about configuring job exceptions in queues.

Platform Computing Inc.
www.platform.com


http://www.ccs.miami.edu/hpc/lsf/7.0.6/admin/job_ops.html

http://www-01.ibm.com/support/knowledgecenter/SSETD4_9.1.3/lsf_command_ref/lsinfo.1.dita

http://www2.nchc.org.tw/~a00yys00/lsf7/7.0.6/lsf_using/index.htm?job_kill.html~main

spring aop实例 bijian1013 java spring AOP
1.AdviceMethods.java package com.bijian.study.spring.aop.schema; public class AdviceMethods { public void preGreeting() { System.out.println("--how are you!--"); } } 2.beans.x
[Gson八]GsonBuilder序列化和反序列化选项enableComplexMapKeySerialization bit1129 serialization
enableComplexMapKeySerialization配置项的含义 Gson在序列化Map时，默认情况下，是调用Key的toString方法得到它的JSON字符串的Key，对于简单类型和字符串类型，这没有问题，但是对于复杂数据对象，如果对象没有覆写toString方法，那么默认的toString方法将得到这个对象的Hash地址。 GsonBuilder用于
【Spark九十一】Spark Streaming整合Kafka一些值得关注的问题 bit1129 Stream
包括Spark Streaming在内的实时计算数据可靠性指的是三种级别： 1. At most once，数据最多只能接受一次，有可能接收不到 2. At least once, 数据至少接受一次，有可能重复接收 3. Exactly once 数据保证被处理并且只被处理一次，具体的多读几遍http://spark.apache.org/docs/lates
shell脚本批量检测端口是否被占用脚本 ronin47
#!/bin/bash cat ports |while read line do#nc -z -w 10 $line nc -z -w 2 $line 58422>/dev/null2>&1if[ $?-eq 0]then echo $line:ok else echo $line:fail fi done 这里的ports 既可以是文件
java-2.设计包含min函数的栈 bylijinnan java
具体思路参见：http://zhedahht.blog.163.com/blog/static/25411174200712895228171/ import java.util.ArrayList; import java.util.List; public class MinStack { //maybe we can use origin array rathe
Netty源码学习-ChannelHandler bylijinnan java netty
一般来说，“有状态”的ChannelHandler不应该是“共享”的，“无状态”的ChannelHandler则可“共享” 例如ObjectEncoder是“共享”的, 但 ObjectDecoder 不是因为每一次调用decode方法时，可能数据未接收完全（incomplete），它与上一次decode时接收到的数据“累计”起来才有可能是完整的数据，是“有状态”的 p
java生成随机数 cngolon java
方法一： /** * 生成随机数 * @author [email protected] * @return */ public synchronized static String getChargeSequenceNum(String pre){ StringBuffer sequenceNum = new StringBuffer(); Date dateTime = new D
POI读写海量数据 ctrain 海量数据
import java.io.FileOutputStream; import java.io.OutputStream; import org.apache.poi.xssf.streaming.SXSSFRow; import org.apache.poi.xssf.streaming.SXSSFSheet; import org.apache.poi.xssf.streaming
mysql 日期格式化date_format详细使用 daizj mysql date_format 日期格式转换日期格式化
日期转换函数的详细使用说明 DATE_FORMAT(date,format) Formats the date value according to the format string. The following specifiers may be used in the format string. The&n
一个程序员分享8年的开发经验 dcj3sjt126com 程序员
在中国有很多人都认为IT行为是吃青春饭的，如果过了30岁就很难有机会再发展下去!其实现实并不是这样子的，在下从事.NET及JAVA方面的开发的也有8年的时间了，在这里在下想凭借自己的亲身经历，与大家一起探讨一下。明确入行的目的很多人干IT这一行都冲着“收入高”这一点的，因为只要学会一点HTML, DIV+CSS，要做一个页面开发人员并不是一件难事，而且做一个页面开发人员更容
android欢迎界面淡入淡出效果 dcj3sjt126com android
很多Android应用一开始都会有一个欢迎界面，淡入淡出效果也是用得非常多的，下面来实现一下。主要代码如下： package com.myaibang.activity; import android.app.Activity;import android.content.Intent;import android.os.Bundle;import android.os.CountDown
linux 复习笔记之常见压缩命令 eksliang tar解压 linux系统常见压缩命令 linux压缩命令 tar压缩
转载请出自出处:http://eksliang.iteye.com/blog/2109693 linux中常见压缩文件的拓展名 *.gz gzip程序压缩的文件 *.bz2 bzip程序压缩的文件 *.tar tar程序打包的数据，没有经过压缩 *.tar.gz tar程序打包后，并经过gzip程序压缩 *.tar.bz2 tar程序打包后，并经过bzip程序压缩 *.zi
Android 应用程序发送shell命令 gqdy365 android
项目中需要直接在APP中通过发送shell指令来控制lcd灯，其实按理说应该是方案公司在调好lcd灯驱动之后直接通过service送接口上来给APP，APP调用就可以控制了，这是正规流程，但我们项目的方案商用的mtk方案，方案公司又没人会改，只调好了驱动，让应用程序自己实现灯的控制，这不蛋疼嘛！！！！发就发吧！一、关于shell指令：我们知道，shell指令是Linux里面带的
java 无损读取文本文件 hw1287789687 读取文件无损读取读取文本文件 charset
java 如何无损读取文本文件呢？以下是有损的 @Deprecated public static String getFullContent(File file, String charset) { BufferedReader reader = null; if (!file.exists()) { System.out.println("getFull
Firebase 相关文章索引 justjavac firebase
Awesome Firebase 最近谷歌收购Firebase的新闻又将Firebase拉入了人们的视野，于是我做了这个 github 项目。 Firebase 是一个数据同步的云服务，不同于 Dropbox 的「文件」，Firebase 同步的是「数据」，服务对象是网站开发者，帮助他们开发具有「实时」（Real-Time）特性的应用。开发者只需引用一个 API 库文件就可以使用标准 RE
C++学习重点 lx.asymmetric C++笔记
1.c++面向对象的三个特性：封装性，继承性以及多态性。 2.标识符的命名规则：由字母和下划线开头，同时由字母、数字或下划线组成；不能与系统关键字重名。 3.c++语言常量包括整型常量、浮点型常量、布尔常量、字符型常量和字符串性常量。 4.运算符按其功能开以分为六类：算术运算符、位运算符、关系运算符、逻辑运算符、赋值运算符和条件运算符。 &n
java bean和xml相互转换 q821424508 java bean xml xml和bean转换 java bean和xml转换
这几天在做微信公众号做的过程中想找个java bean转xml的工具，找了几个用着不知道是配置不好还是怎么回事，都会有一些问题，然后脑子一热谢了一个javabean和xml的转换的工具里，自己用着还行，虽然有一些约束吧，还是贴出来记录一下顺便你提一下下，这个转换工具支持属性为集合、数组和非基本属性的对象。 packag
C 语言初级位运算 1140566087 位运算 c
第十章位运算 1、位运算对象只能是整形或字符型数据，在VC6.0中int型数据占4个字节 2、位运算符：运算符作用 ~ 按位求反 << 左移 >> 右移 & 按位与 ^ 按位异或 | 按位或他们的优先级从高到低； 3、位运算符的运算功能： a、按位取反： ~01001101 = 101
14点睛Spring4.1-脚本编程 wiselyman spring4
14.1 Scripting脚本编程脚本语言和java这类静态的语言的主要区别是:脚本语言无需编译,源码直接可运行; 如果我们经常需要修改的某些代码,每一次我们至少要进行编译,打包,重新部署的操作,步骤相当麻烦; 如果我们的应用不允许重启,这在现实的情况中也是很常见的; 在spring中使用脚本编程给上述的应用场景提供了解决方案,即动态加载bean; spring支持脚本