Slurm: Cheatsheet / Summary
- Submit batch jobs: `sbatch`.
- Run/launch a parallel job: `srun`.
- Request interactive job sessions: `sinteractive`.
- Cancelling jobs:
  - Cancel a single job: `scancel jobid` or `scancel -n jobname`.
  - Cancel all your jobs: `scancel -u username`.
  - Cancel all your pending jobs: `scancel -t PD`.
- Job control and monitoring: `scontrol`, `squeue` and `showq`.
- Currently running job detailed status: `sstat` and `sjob`.
- Job usage summary: `jobreport`.
- Nodes info and cluster status: `sinfo`, `showbf` and `qinfo -v`.
- Job and job steps accounting data: `sacct`.
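As a rough illustration of how the commands above fit together, the sketch below wraps a submit, inspect and cancel cycle in three shell functions. The function names are hypothetical and only standard Slurm commands are used; commands such as showq, sjob and jobreport look site-specific and are left out.

```shell
#!/bin/bash
# Sketch of a submit -> monitor -> cancel cycle.
# Function names are illustrative, not part of Slurm.

# Submit a batch script and print only the job ID.
submit_job() {
    local out
    # --parsable makes sbatch print "jobid" (or "jobid;cluster") and nothing else
    out=$(sbatch --parsable "$1") || return 1
    # Drop the optional ";cluster" suffix
    echo "${out%%;*}"
}

# Show the queue entry and accounting data for one job.
job_status() {
    squeue -j "$1"
    sacct -j "$1" --format=JobID,JobName,State,Elapsed
}

# Cancel one job by ID.
cancel_job() {
    scancel "$1"
}
```

A typical session might run `jobid=$(submit_job job.sh)`, then `job_status "$jobid"`, and finally `cancel_job "$jobid"` if the job is no longer needed; `job.sh` here stands for any batch script of your own.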
Slurm keywords
Some common sbatch Options/Directives
Short Format | Long Format | Description |
---|---|---|
-N count | --nodes=count | Allocate [count] nodes to your job. |
N/A | --ntasks-per-node=count | Run [count] MPI tasks per node. |
-c count | --cpus-per-task=count | Number of logical cores (CPUs) per MPI task. Usually there is no need to set this; it defaults to 1. |
-t DD-HH:MM:SS | --time=DD-HH:MM:SS | Maximum wallclock time for your job. Always specify this. |
N/A | --mem=count | Allow your job to use up to [count] MB of memory on each node in your job. |
N/A | --mem-per-cpu=count | Allow your job to use up to [count] MB of memory for each CPU in your job. |
N/A | --tmp=X | Request temporary file space on the local disk (SSD or NVMe) on each node in your job, e.g. --tmp=20GB. The environment variable $JOBFS points to this directory. |
-J job_name | --job-name=job_name | job_name: up to 15 printable, non-whitespace characters. |
-e filename | --error=filename | Write STDERR to filename. |
-o filename | --output=filename | Write STDOUT to filename. By default both standard output and standard error are directed to a file named "slurm-%j.out", where "%j" is replaced with the job ID. See the -i option for filename pattern details. |
-i filename_pattern | --input=filename_pattern | Connect the batch script's standard input directly to the file named by the filename pattern. The pattern may contain one or more replacement symbols, each a percent sign "%" followed by a letter (e.g. %j). Supported replacement symbols are: %j (job ID) and %N (node name). Only one file is created, so %N is replaced by the name of the first compute node in the job, which is the one that runs the script. |
N/A | --mail-type=events --mail-user=address | Valid event values are: BEGIN, END, FAIL, REQUEUE, ALL (equivalent to BEGIN, END, FAIL, REQUEUE, and STAGE_OUT), STAGE_OUT (burst buffer stage out completed), TIME_LIMIT, TIME_LIMIT_90 (reached 90 percent of time limit), TIME_LIMIT_80 (reached 80 percent of time limit), and TIME_LIMIT_50 (reached 50 percent of time limit). Multiple values may be given as a comma-separated list. The user to be notified is set with --mail-user. Mail notifications on job BEGIN, END and FAIL apply to a job array as a whole rather than generating individual messages for each task in the array. |
-D directory_name | --workdir=directory_name | Set the working directory of the batch script to directory_name before it is executed. The path can be absolute or relative to the directory where the sbatch command is executed. Newer Slurm releases name this option --chdir. |
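To show how these options combine, here is a minimal sketch of a batch script. The job name, resource values, e-mail address and the commented-out program name are placeholders, not site recommendations.

```shell
#!/bin/bash
# All #SBATCH directives must come before the first executable command.
# Job name: at most 15 printable, non-whitespace characters
#SBATCH --job-name=demo_job
# Two nodes with four MPI tasks each
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
# Maximum wallclock time, DD-HH:MM:SS (here one hour)
#SBATCH --time=0-01:00:00
# Memory per node in MB
#SBATCH --mem=4000
# Separate files for STDOUT and STDERR; %j is replaced by the job ID
#SBATCH --output=slurm-%j.out
#SBATCH --error=slurm-%j.err
# E-mail when the job ends or fails (placeholder address)
#SBATCH --mail-type=END,FAIL
#SBATCH --mail-user=user@example.org

# Slurm sets these variables inside a job; the defaults keep the
# script runnable outside Slurm for a quick syntax check.
job_id="${SLURM_JOB_ID:-none}"
num_nodes="${SLURM_JOB_NUM_NODES:-1}"
echo "job=${job_id} nodes=${num_nodes} host=$(uname -n)"

# Launch the parallel program (my_mpi_program is a placeholder):
# srun ./my_mpi_program
```

Saved as, say, `job.sh`, the script would be submitted with `sbatch job.sh`. Options given on the sbatch command line take precedence over the matching `#SBATCH` lines.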
See https://slurm.schedmd.com/sbatch.html for more information.
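The difference between --mem and --mem-per-cpu is easiest to see with numbers. The sketch below uses made-up values and assumes each task on a node is allocated exactly --cpus-per-task CPUs; real allocations can differ depending on how the site configures cores and hardware threads.

```shell
#!/bin/bash
# Illustrative request values only
ntasks_per_node=4     # --ntasks-per-node=4
cpus_per_task=1       # --cpus-per-task=1 (the default)
mem_per_cpu=2000      # --mem-per-cpu=2000, in MB

# CPUs allocated on each node, then the memory those CPUs imply
cpus_per_node=$(( ntasks_per_node * cpus_per_task ))
mem_per_node=$(( cpus_per_node * mem_per_cpu ))

echo "CPUs per node: ${cpus_per_node}"
echo "Memory per node: ${mem_per_node} MB"
```

With these values the request works out to 4 CPUs and 8000 MB per node, so --mem=8000 would ask for the same per-node total directly.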