
Simple Slurm Tools

A set of additional tools for working with, reporting on, or interacting with HPC systems using standard tools such as Slurm and lmod.

Most of these tools will not be used every day, but they can make it easier to pull data out of Slurm, understand why a job is not running, or analyse system utilisation.

No modules need to be loaded to run these tools; you only need python3 and the module commands to be available.


shistory

The shistory tool prints simple time-series metrics from the Slurm job database to visualise job submissions and scheduler utilisation.

Available options:

$ shistory --h
usage: sjobs [-h] [-csv] [-csv_user] [-keyfield KEYFIELD] [-keytype KEYTYPE] [-day] [-week] [-month] [-year] [-periods PERIODS] [-pc PC]

optional arguments:
  -h, --help          show this help message and exit
  -csv                Enable CSV summary output only [default disabled].
  -csv_user           Enable CSV user stats output only [default disabled].
  -keyfield KEYFIELD  One of [runtime, waittime, cores, nodes, ramcore, cpuhours], [default is runtime].
  -keytype KEYTYPE    Data type to use for the keyfield, one of [min, max, mean, total], [default is total].
  -day                Reports are in periods of one day.
  -week               Reports are in periods of one week [default].
  -month              Reports are in periods of one month.
  -year               Reports are in periods of one year.
  -periods PERIODS    Total number of reporting periods to produce history for [default is 1].
  -pc PC              Percentile figure for reports [defaults is 75].

The default is to display metrics for a single week:

$ shistory
Report period	: week
Report count	: 1
Percentile	: 75%

Please wait, starting retrieval of job data...
Please wait, analysing job data...

Period (type)     Jobs -   CPUHours -  RunTime -                WaitTime -               Cores -              Nodes -           RAM/Core -
                  Total    Total       Min/Max/Mean/75%         Min/Max/Mean/75%         Min/Max/Mean/75%     Min/Max/Mean/75%  Min/Max/Mean/75%
=============     =======  ==========  =======================  =======================  ===================  ================  ===============================
2026-05-11 week   34754    161367.9        0/ 9045/   29/   11      0/ 1440/  945/ 1211     1/ 256/   3/   1        1/ 7/ 1/ 1      128/ 409600/   2505/   1024
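
Each Min/Max/Mean/75% group in the report is a four-number summary of one metric across the jobs in the period. How shistory calculates the percentile is not stated here, so the sketch below assumes the nearest-rank method; the summarise function and the sample values are hypothetical and only illustrate the idea:

```python
import math
from statistics import mean


def summarise(values, pc=75):
    """Return (min, max, mean, percentile) using the nearest-rank percentile."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # Nearest rank (an assumption): the smallest value with at least
    # pc% of the data at or below it.
    rank = max(1, math.ceil(pc / 100 * len(ordered)))
    return ordered[0], ordered[-1], mean(ordered), ordered[rank - 1]


# Hypothetical job runtimes, not taken from a real system.
runtimes = [0, 5, 11, 12, 30, 45, 120, 9045]
print(summarise(runtimes))
```

A single very long job pulls the mean far above the 75th percentile, which is why the report shows both figures.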

You can generate reports over multiple periods by adding the -periods parameter. For example, combine it with -day to report on the last six days:

$ shistory -periods 6 -day

Report period	: day
Report count	: 6
Percentile	: 75%

Please wait, starting retrieval of job data...
Please wait, analysing job data...

Period (type)     Jobs -   CPUHours -  RunTime -                WaitTime -               Cores -              Nodes -           RAM/Core -
                  Total    Total       Min/Max/Mean/75%         Min/Max/Mean/75%         Min/Max/Mean/75%     Min/Max/Mean/75%  Min/Max/Mean/75%
=============     =======  ==========  =======================  =======================  ===================  ================  ===============================
2026-05-12 day    3250     36030.3         0/ 5819/   66/   39      9/ 1433/  976/ 1286     1/ 256/   6/  10        1/ 1/ 1/ 1      128/ 409600/   4653/   4096 
2026-05-13 day    1987     35124.6         0/ 9045/  120/  272      0/ 1436/  834/ 1307     1/ 256/   9/  10        1/ 1/ 1/ 1      128/ 409600/   4119/   5120 
2026-05-14 day    1446     20298.9         0/ 4134/   95/   26      8/ 1439/  760/ 1258     1/ 256/   5/   1        1/ 4/ 1/ 1      128/  44800/   4024/   4096 
2026-05-15 day    10036    24015.1         0/ 2887/   16/   11      0/ 1440/ 1007/ 1247     1/ 256/   2/   1        1/ 1/ 1/ 1      128/  92160/   1807/   1024 
2026-05-16 day    3124     17523.4         0/ 4236/   28/   17     19/ 1333/  601/  682     1/ 256/   2/   1        1/ 7/ 1/ 1      128/  44800/   1497/   1024 
2026-05-17 day    290      11706.9         0/ 2618/   63/    0     65/ 1264/  780/  788     1/ 200/   7/   8        1/ 1/ 1/ 1     1024/  44800/   6394/   8192

Data can be output in CSV format by appending the -csv option. Example:

$ shistory -periods 6 -day -csv
date,period,jobs (total),cpu hours (total),job runtime (total),job runtime (min),job runtime (max),job runtime (mean),job runtime (75%),job waittime (total),job waittime (min),job waittime (max),job waittime (mean),job waittime (75%),job cores (min),job cores (max),job cores (mean),job cores (75%),job nodes (min),job nodes (max),job nodes (mean),job nodes (75%),ram per core (min),ram per core (max),ram per core (mean),ram per core (75%)
2026-05-12,day,3250,36030.30361111109,214961.3333333321,0.0,5818.75,66.14194871794834,39.13333333333333,3175350.7000000007,10.133333333333333,1434.4833333333333,977.0309846153848,1287.1833333333334,1,256,6.3218461538461534,10,1,1,1.0,1,128.0,409600.0,4652.953025641026,4096.0
2026-05-13,day,1987,35124.617500000124,238749.00000000128,0.0,9044.783333333333,120.15551082033281,271.93333333333334,1659104.0499999863,1.5666666666666667,1436.9833333333333,834.9793910417646,1308.0166666666667,1,256,8.668847508807247,10,1,1,1.0,1,128.0,409600.0,4119.00035686508,5120.0
2026-05-14,day,1446,20298.91138888913,137364.19999999896,0.0,4133.516666666666,94.99598893499237,25.9,1101173.2000000088,8.9,1439.9333333333334,761.5305670816105,1259.3333333333333,1,256,5.166666666666667,1,1,4,1.0020746887966805,1,128.0,44800.0,4023.921991701245,4096.0
2026-05-15,day,10036,24015.085277777747,158294.81666666773,0.0,2887.1833333333334,15.772699946858083,11.1,10103467.599999985,0.25,1439.9333333333334,1006.7225587883604,1247.8666666666666,1,256,1.9364288561179752,1,1,1,1.0,1,128.0,92160.0,1807.4715025906735,1024.0
2026-05-16,day,3124,17523.400000000085,88880.23333333309,0.0,4236.466666666666,28.450778915919685,16.9,1880613.6000000057,19.966666666666665,1334.6833333333334,601.9889884763143,683.15,1,256,1.6840588988476313,1,1,7,1.0057618437900129,1,128.0,44800.0,1497.2560819462228,1024.0
2026-05-17,day,290,11706.935833333335,18410.566666666527,0.016666666666666666,2618.4166666666665,63.48471264367768,0.13333333333333333,226532.0333333323,66.38333333333334,1265.1666666666667,781.1449425287321,788.9166666666666,1,200,7.172413793103448,8,1,1,1.0,1,1024.0,44800.0,6393.820689655173,8192.0
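
The CSV output is convenient for further processing. As a minimal sketch, assuming the output has already been captured as text, the standard Python csv module can read the columns by their header names. The summarise_history helper is hypothetical and not part of Simple Slurm Tools, and the sample is trimmed to the first four columns of the example above:

```python
import csv
import io


def summarise_history(csv_text):
    """Return (total jobs, total CPU hours, busiest date) from shistory CSV text."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    total_jobs = sum(int(r["jobs (total)"]) for r in rows)
    total_cpu_hours = sum(float(r["cpu hours (total)"]) for r in rows)
    busiest = max(rows, key=lambda r: int(r["jobs (total)"]))["date"]
    return total_jobs, total_cpu_hours, busiest


# Two rows from the example output, trimmed to the first four columns.
sample = """date,period,jobs (total),cpu hours (total)
2026-05-12,day,3250,36030.30361111109
2026-05-15,day,10036,24015.085277777747
"""

jobs, cpu_hours, busiest = summarise_history(sample)
print(f"{jobs} jobs, {cpu_hours:.1f} CPU hours, busiest day {busiest}")
```

Reading by header name rather than by column position keeps the script working if columns are ever added or reordered.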


sjobs

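The sjobs tool gives a high-level overview of the current utilisation of a given Slurm partition. It prints a summary for both running and pending jobs in the partition, including the number of users and jobs, total cores and memory, total runtime or waiting time, and the largest and average job sizes.

To run the command, use the form: sjobs PARTITION_NAME. For example, for the default_free partition: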

$ sjobs default_free
-= Running =-                            -= Pending =-
=============                            =============
Total users              :       12      Total users waiting              :        9
Total running jobs       :      111      Total waiting jobs               :      109
Total allocated cores    :     1966      Total requested cores            :     1110
Total allocated memory   :    16735 GB   Total requested memory           :    24975 GB
Total runtime            :    12372 min  Total waiting time               :     2714 min
-
Largest job (cores)      :      128      Largest waiting job (cores)      :      256
Largest job (memory/job) :     1400 GB   Largest waiting job (memory/job) :     1400 GB
Largest job (memory/core):       43 GB   Largest waiting job (memory/core):      292 GB
Longest job runtime      :     1728 min  Longest waiting time             :      170 min
-
Average job (cores)      :       17      Average waiting job (cores)      :       10
Average job (memory/job) :      150 GB   Average waiting job (memory/job) :      229 GB
Average job (memory/core):        7 GB   Average waiting job (memory/core):       10 GB
Average runtime          :      111 min  Average waiting time             :       24 min
$
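
The figures in each column are simple aggregates over the jobs in one state. The sketch below shows how such totals, maxima and averages could be derived from a list of job records. It is not the sjobs implementation: the Job class, its field names and the sample jobs are all hypothetical, and real data would have to come from Slurm.

```python
from dataclasses import dataclass


@dataclass
class Job:
    user: str
    cores: int
    memory_gb: float
    minutes: float  # runtime if running, time spent waiting if pending


def summarise_jobs(jobs):
    """Return totals, largest and average figures for a non-empty list of jobs."""
    if not jobs:
        raise ValueError("jobs must not be empty")
    count = len(jobs)
    total_cores = sum(j.cores for j in jobs)
    return {
        "users": len({j.user for j in jobs}),
        "jobs": count,
        "total_cores": total_cores,
        "total_memory_gb": sum(j.memory_gb for j in jobs),
        "largest_cores": max(j.cores for j in jobs),
        "largest_memory_per_core_gb": max(j.memory_gb / j.cores for j in jobs),
        "average_cores": total_cores / count,
        "average_minutes": sum(j.minutes for j in jobs) / count,
    }


# Hypothetical running jobs, not taken from a real partition.
running = [
    Job("alice", cores=4, memory_gb=16, minutes=30),
    Job("alice", cores=128, memory_gb=256, minutes=90),
    Job("bob", cores=1, memory_gb=8, minutes=60),
]
summary = summarise_jobs(running)
print(summary)
```

Note that the job with the most memory per core need not be the job with the most memory overall: here the single-core job has the highest memory-per-core figure.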


sproject

The sproject tool shows a summary of Slurm data for a single Slurm account code (remember, this is the same as your HPC project name; either comet_abc123 or rocket_abc123). It is used to show:

To run, the format is: sproject PROJECT_NAME. For example, the project comet_mopm:

In the example above it shows:

This tool is intended to give members of a project a simple interface to understand the restrictions on their account, as well as a quick way of viewing the reasons why their jobs may not have started yet. The tool will attempt to identify all relevant reasons why a job may not be running yet - in some cases there may be more than one reason.


The Simple Slurm Tools software also includes additional tools which, although not directly related to Slurm, are still useful on HPC systems.


modulespy

The modulespy tool interrogates a Linux software module and recurses through all of the listed dependencies which are necessary in order to use it.

For example, to find all of the modules needed in order to load Python/3.12.3:

$ modulespy Python/3.12.3
Searching for all dependencies of: Python/3.12.3
Python/3.12.3 -> GCCcore/13.3.0
Python/3.12.3 -> binutils/2.42-GCCcore-13.3.0
Python/3.12.3 -> bzip2/1.0.8-GCCcore-13.3.0
Python/3.12.3 -> zlib/1.3.1-GCCcore-13.3.0
Python/3.12.3 -> libreadline/8.2-GCCcore-13.3.0
Python/3.12.3 -> ncurses/6.5-GCCcore-13.3.0
Python/3.12.3 -> SQLite/3.45.3-GCCcore-13.3.0
Python/3.12.3 -> XZ/5.4.5-GCCcore-13.3.0
Python/3.12.3 -> libffi/3.4.5-GCCcore-13.3.0
Python/3.12.3 -> OpenSSL/3
binutils/2.42-GCCcore-13.3.0 -> GCCcore/13.3.0
binutils/2.42-GCCcore-13.3.0 -> zlib/1.3.1-GCCcore-13.3.0
zlib/1.3.1-GCCcore-13.3.0 -> GCCcore/13.3.0
bzip2/1.0.8-GCCcore-13.3.0 -> GCCcore/13.3.0
libreadline/8.2-GCCcore-13.3.0 -> GCCcore/13.3.0
libreadline/8.2-GCCcore-13.3.0 -> ncurses/6.5-GCCcore-13.3.0
ncurses/6.5-GCCcore-13.3.0 -> GCCcore/13.3.0
SQLite/3.45.3-GCCcore-13.3.0 -> GCCcore/13.3.0
SQLite/3.45.3-GCCcore-13.3.0 -> libreadline/8.2-GCCcore-13.3.0
SQLite/3.45.3-GCCcore-13.3.0 -> Tcl/8.6.14-GCCcore-13.3.0
Tcl/8.6.14-GCCcore-13.3.0 -> GCCcore/13.3.0
Tcl/8.6.14-GCCcore-13.3.0 -> zlib/1.3.1-GCCcore-13.3.0
XZ/5.4.5-GCCcore-13.3.0 -> GCCcore/13.3.0
libffi/3.4.5-GCCcore-13.3.0 -> GCCcore/13.3.0
$
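
The recursion can be pictured as a depth-first walk over a dependency graph in which each module is expanded only once. The sketch below is an illustration and not the modulespy source: walk_dependencies is a hypothetical helper, and the depends_on dictionary is an in-memory stand-in, abbreviated from the output above, for whatever the tool reads from the module system.

```python
def walk_dependencies(module, depends_on, seen=None):
    """Return (module, dependency) pairs reachable from module.

    Each module is expanded at most once, so shared dependencies are not
    listed repeatedly and a cyclic graph cannot cause infinite recursion.
    """
    if seen is None:
        seen = set()
    seen.add(module)
    direct = depends_on.get(module, [])
    # List this module's direct dependencies first, then descend into them.
    edges = [(module, dep) for dep in direct]
    for dep in direct:
        if dep not in seen:
            edges.extend(walk_dependencies(dep, depends_on, seen))
    return edges


# Hypothetical graph, abbreviated from the example output above.
depends_on = {
    "Python/3.12.3": [
        "GCCcore/13.3.0",
        "binutils/2.42-GCCcore-13.3.0",
        "zlib/1.3.1-GCCcore-13.3.0",
    ],
    "binutils/2.42-GCCcore-13.3.0": [
        "GCCcore/13.3.0",
        "zlib/1.3.1-GCCcore-13.3.0",
    ],
    "zlib/1.3.1-GCCcore-13.3.0": ["GCCcore/13.3.0"],
}

edges = walk_dependencies("Python/3.12.3", depends_on)
for parent, child in edges:
    print(f"{parent} -> {child}")
```

Tracking visited modules in the seen set is what keeps the listing short even though GCCcore/13.3.0 is a dependency of almost every other module.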


modulesearch

The modulesearch tool finds any modules which have the stated module as a dependency.

For example, to find all modules which have Python as a dependency:

$ modulesearch -m Python
Searching for modules which have the dependency [Python]
Searching using PARTIAL matches
Search results will be held in [modulesearch.out]
Searching through [2214] modules, please wait:

Searched all packages                                                                              
Found [82] results
Please 'cat modulesearch.out' to see found packages
$

Unlike the other tools, the results from modulesearch are not output to the terminal. Instead, view the contents of modulesearch.out (created in the current directory) to find the matching modules:

$ cat modulesearch.out 
BeautifulSoup/
BeautifulSoup/4.12.3-GCCcore-13.3.0
Biopython/
Biopython/1.84-foss-2024a
Biopython/1.85-foss-2024a
cffi/
cffi/1.16.0-GCCcore-13.3.0
cryptography/
cryptography/42.0.8-GCCcore-13.3.0
Cython/
Cython/3.0.10-GCCcore-13.3.0
flit/
flit/3.9.0-GCCcore-13.3.0
Flye/
Flye/2.9.5-GCC-13.3.0
fonttools/
fonttools/4.53.1-GCCcore-13.3.0
GitPython/
GitPython/3.1.43-GCCcore-13.3.0
hatch-jupyter-builder/
hatch-jupyter-builder/0.9.1-GCCcore-13.3.0
hatchling/
hatchling/1.24.2-GCCcore-13.3.0
hypothesis/
hypothesis/6.103.1-GCCcore-13.3.0
IPython/
IPython/8.28.0-GCCcore-13.3.0
jedi/
jedi/0.19.1-GCCcore-13.3.0
JupyterLab/
JupyterLab/4.2.5-GCCcore-13.3.0
JupyterLab/4.4.3-GCCcore-13.3.0
jupyter-server/
jupyter-server/2.14.2-GCCcore-13.3.0
lit/
lit/18.1.8-GCCcore-13.3.0
lxml/
lxml/5.3.0-GCCcore-13.3.0
Mako/
Mako/1.3.5-GCCcore-13.3.0
maturin/
maturin/1.6.0-GCCcore-13.3.0
Meson/1.4.0-GCCcore-13.3.0
meson-python/
meson-python/0.16.0-GCCcore-13.3.0
PGAP/
PGAP/2025-05-06
poetry/
poetry/1.8.3-GCCcore-13.3.0
PuLP/
PuLP/2.8.0-foss-2024a
pybind11/
pybind11/2.12.0-GCC-13.3.0
Pysam/
Pysam/0.22.1-GCC-13.3.0
Python-bundle-PyPI/
Python-bundle-PyPI/2024.06-GCCcore-13.3.0
PyYAML/
PyYAML/6.0.2-GCCcore-13.3.0
PyZMQ/
PyZMQ/26.2.0-GCCcore-13.3.0
scikit-build/
scikit-build/0.17.6-GCCcore-13.3.0
scikit-build-core/
scikit-build-core/0.10.6-GCCcore-13.3.0
SciPy-bundle/
SciPy-bundle/2024.05-gfbf-2024a
setuptools-rust/
setuptools-rust/1.9.0-GCCcore-13.3.0
snakemake/
snakemake/8.27.0-foss-2024a
SPAdes/4.1.0-GCC-13.3.0
tornado/
tornado/6.4.1-GCCcore-13.3.0
Unicycler/
Unicycler/0.5.1-gompi-2024a
virtualenv/
virtualenv/20.26.2-GCCcore-13.3.0
wrapt/
wrapt/1.16.0-gfbf-2024a
Z3/
Z3/4.13.0-GCCcore-13.3.0

To find exact matches only, i.e. the full module name including its version suffix, such as Python/3.12.3, add the -e (exact) option:

$ modulesearch -e -m Python/3.12.3
Searching for modules which have the dependency [Python/3.12.3]
Searching using EXACT matches
Search results will be held in [modulesearch.out]
Searching through [2214] modules, please wait:

Searched all packages                                                                              
Found [2] results
Please 'cat modulesearch.out' to see found packages

… and the results:

$ cat modulesearch.out 
PGAP/
PGAP/2025-05-06

