Managing x86 jobs with SLURM

Background

To do work on the Melbourne Bioinformatics (formerly VLSCI) computers, you need to submit jobs to a queue. If resources are available, your job should start shortly after submission. While your job is running, the processors and memory you requested are dedicated to it for the length of time you requested.

Job scheduling, resource management and accounting are handled by SLURM (Simple Linux Utility for Resource Management).

Caveat for parallel jobs:

For all parallel jobs, please use srun rather than mpirun; mpirun now performs far worse than srun.

Job Management with SLURM

A typical workflow for managing a job is to:

1) create a job script (by hand or with the job script generator),
2) submit the job with sbatch,
3) monitor the job while it is queued and running,
4) review the job's output and accounting information once it finishes,
5) optimise the script for future runs.

Job script generator

SLURM scripts: SLURM reads a text file that acts as a script of comments and commands. SLURM looks for special comments where the line starts with #SBATCH. Anything on the same line after #SBATCH is interpreted as a SLURM option.

NOTE: SLURM will only look for #SBATCH lines in the top comment section of a script. Once a non-comment line (i.e. a command) has been found, any #SBATCH line after it will NOT be interpreted. If your job seems to ignore your SLURM settings, please check that there are no commands before the #SBATCH lines.
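
For example, a correctly laid-out script keeps every #SBATCH line above the first command (the option values below are only illustrative):

#!/bin/bash
#SBATCH --job-name=example
#SBATCH --ntasks=1
#SBATCH --time=00:10:00
# SLURM keeps reading #SBATCH lines until the first command below.
module load my-app-compiler/version   # first real command; any #SBATCH lines below here are ignored
my-app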

To simplify the task of writing job submission scripts we provide an interactive job script generator.

You can modify the script (or make your own) by referring to the Job types section (below).

Job submission

To submit a job and its resource requests to the queue, use the sbatch command together with a job script. The command has the following form:

sbatch [command line options] job-script

This command will return the job id of the new job, e.g.:

Submitted batch job 94402
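
For example, assuming a script named myjob.slurm (the name is just a placeholder), you could submit it with a longer walltime than the script requests; command-line options override the matching #SBATCH lines in the script:

sbatch --time=02:00:00 myjob.slurm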

Interactive jobs

It is possible to run a job as an interactive session using sinteractive. For example:

sinteractive

or

sinteractive --x11

Either command waits until your job runs, then gives you a new prompt on the node your job is running on, in the directory you launched the job from. The --x11 option forwards any X11 windows to your machine (assuming you have X11 and forwarding set up).

All SLURM options can be passed as options to the sinteractive command.
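
For example, a short multi-core interactive session might be requested along these lines (the values are purely illustrative):

sinteractive --ntasks=4 --time=02:00:00 --mem-per-cpu=4096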

Job monitoring

To view the state of the system use:

showq

or

squeue

To limit the output to your jobs only, use the form:

showq -u

or

squeue -u username

where username is your Melbourne Bioinformatics username.

It is also possible to get more detailed information about a specific job using the scontrol command; see the SLURM documentation for scontrol for more information.

For example, to see the details of a job with job id jobID use:

scontrol show job jobID

To view SLURM accounting details, use the sacct command; see the SLURM documentation for sacct for more information.

For example:

sacct -j jobID

Modifying a Job

If you notice that a job is taking longer than expected, please contact the Help Desk with the machine name, the job id and an estimate of the extra time needed.

If you need to cancel your job, use:

scancel jobID

Reviewing a Job

The best way to review a job is to view any output files that it generates and the output and error files generated by SLURM. By default SLURM creates a single file containing both the standard output and standard error, named slurm-jobID.out (where jobID is the job id number).
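
For example, for the job submitted earlier you could inspect the end of this file with:

tail slurm-94402.out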

To monitor and review SLURM's accounting information for the job, use:

sacct

This is useful for queued, running and finished jobs.

Optimising your script

An important step in job management is to customise the script to the job. For example, it is always a good idea to do small test runs to estimate the total run time, the best number of cores and the memory requirements. Underestimating can lead to jobs failing, while overestimating will cause the job to remain in the queue longer than necessary.

It is important that you don't request unnecessarily large amounts of memory, as this delays when your jobs start because a larger amount of resources has to be freed up. Please see the page on Managing Memory for more information.
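
One way to check your estimates is to compare the memory and walltime you requested with what a test job actually used; the sacct field list below is one reasonable choice:

sacct -j jobID --format=JobID,Elapsed,ReqMem,MaxRSS,State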

Job size limits

Because they were built with different generations of hardware, the clusters differ in the number of CPUs available on each machine and in the maximum amount of memory available.

You can find out these limits at any time using the mylimits command.

Job types

Jobs can be classed as one of three types: single CPU, SMP (or multithreaded), and MPI parallel. Here are some minimal examples for each type.

Single CPU
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --time=01:00:00
#SBATCH --mem-per-cpu=2048
module load my-app-compiler/version
my-app

Bundling single CPU jobs

For a large number of single-CPU jobs, it is better to bundle them into one submission with a wrapper script. For example:

#!/bin/bash
#SBATCH --ntasks=16
#SBATCH --time=01:10:00
#SBATCH --mem-per-cpu=4096
for i in `seq 1 $SLURM_NTASKS`
do
    srun --nodes=1 --ntasks=1 --cpus-per-task=1 sh SINGLEJOB.slurm &
done
# IMPORTANT: wait for all background srun steps to finish, or they are killed when the script exits
wait
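
The contents of SINGLEJOB.slurm are not shown above; as a rough sketch it could be a plain shell script that runs one serial task, for example (the application and input naming are assumptions; you would normally pass the loop index from the wrapper, e.g. sh SINGLEJOB.slurm $i):

#!/bin/bash
# Hypothetical per-task worker run by each srun step in the wrapper.
# $1, if supplied by the wrapper, identifies this task's input.
module load my-app-compiler/version
my-app "input-$1.dat"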

SMP jobs (also called multithreaded, OpenMP)

#!/bin/bash
#SBATCH --time=01:00:00
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=22528
module load my-app-compiler/version
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK   # for OpenMP applications: match threads to requested cores
my-app

MPI Parallel Job

#!/bin/bash
#SBATCH --ntasks=16
#SBATCH --time=01:00:00
#SBATCH --mem-per-cpu=1024
module load my-app-compiler/version
srun my-MPI-app

Software and Modules

Melbourne Bioinformatics makes available a range of open source and commercial software on its systems.

The range of available software depends on the host machine. To see a list of the software available on a given machine, log into the machine and type 'module avail'.
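
For example (the module name is a placeholder):

module avail                          # list all software modules on this machine
module load my-app-compiler/version   # make a particular application available
module list                           # show the modules currently loaded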

Common SLURM options

#SBATCH --job-name=JobName
    Give the job a name.

#SBATCH --account=VR9999
    Account under which to run the job.

#SBATCH -p main
    Resources on a machine can be grouped into partitions. At Melbourne Bioinformatics the default partition is called main.

#SBATCH --ntasks=128
    The number of cores.

#SBATCH --mem-per-cpu=24576
    Per-core memory. Must be in MB.

#SBATCH --time=30-23:59:59
    Walltime. Note the - between days and hours. Accepted formats are:
      • minutes
      • minutes:seconds
      • hours:minutes:seconds
      • days-hours
      • days-hours:minutes
      • days-hours:minutes:seconds

#SBATCH --mail-type=FAIL
    Send an email notification when the job fails.

#SBATCH --mail-type=BEGIN
    Send an email notification when the job starts running.

#SBATCH --mail-user=name@email.address
    E-mail address to send notifications to. It is best not to set this; the system will then use your known e-mail address.

#SBATCH --output=path/file-%j.ext1
    Redirect standard output to this file on the given path (optional). The name and extension can be anything you like. If you use %j it is replaced by the job number.

#SBATCH --error=path/file-%j.ext2
    Redirect standard error to this file on the given path (optional). The name and extension can be anything you like. If you use %j it is replaced by the job number.

Working directory (set by the commands at the start of your script):
    1) don't specify any directory: the job runs in the launch directory,
    2) cd $HOME: the job runs in your home directory,
    3) cd path: the job runs in the specified path.
    The default is to run the job in the same directory that it was launched from (via 'sbatch'). This is usually the best option (i.e. don't specify any 'cd').

SMP

If your job is an SMP job, you will need to use the following options in addition to any relevant options from above.

To get the number of CPUs on the node, use:

$SLURM_CPUS_ON_NODE

#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
    Request that all cores are for 1 task on 1 node. You cannot use more cores than there are on 1 node.

#SBATCH --mem=24576
    Specify the whole-node memory. Must be in MB. Do not use --mem-per-cpu.

Job Arrays

If you use Job Arrays, please see the SLURM documentation for job arrays.
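
For reference, a minimal job-array script might look like the following sketch (the array range and application are assumptions); each element of the array runs as a separate job and receives its index in $SLURM_ARRAY_TASK_ID:

#!/bin/bash
#SBATCH --array=1-10
#SBATCH --ntasks=1
#SBATCH --time=01:00:00
#SBATCH --mem-per-cpu=2048
module load my-app-compiler/version
my-app "input-$SLURM_ARRAY_TASK_ID.dat"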