Data & Report Terminology

All of our reports and metrics link back to this page to allow you to understand the terms we use and how we derive our data.

Basic Terms

Some common terms used throughout this website.

  • Project - A group created for accessing HPC resources.
  • Account - The name of the project. This is usually equivalent to the Unix group name.
  • Funds - A balance which has been assigned to a project for the use of funded resource partitions.
  • Slurm - The HPC scheduling software we use which tracks job use by accounts.
  • Partition - A named set of resources that jobs are submitted to and which may or may not be restricted to projects with funds.
  • Facility - A specific hardware installation that offers one or more partitions to run jobs.
  • Job - A task started on a partition, scheduled by Slurm. Running a job always has a cost, whether actual or indicative.
  • Year - Where we make reference to a year, we always mean the calendar year, running January - December.
  • Wasted Hours - Time spent running jobs where the job did not finish correctly.
  • Wasted Costs - The cost of resources expended on wasted hours.

Slurm Yearly Project Report

The Slurm Yearly Report gives an overview of resource utilisation and costs for a single project over a single calendar year.


Summary

The top section of the Slurm Yearly Report gives a quick summary of the project status for the year. The following fields are presented:

  • Total Hours of Compute - A sum of the runtime of all the Slurm jobs launched with the account code of this project in the calendar year. Note that this specifically includes jobs started in the given year; if these jobs ran over into the following calendar year, they are still counted in the year in which they started.
  • Actual Costs - A sum of the costs for all jobs launched with the account code of the project in the calendar year against a funded, premium or VIP partition. You should expect to be invoiced for these costs. See below for Job cost calculations.
  • Indicative Costs - A sum of the costs for all jobs launched with the account code of the project in the calendar year against any unfunded or free partitions. This is an indication of the University contribution to your activity through free compute resources - you are not charged for this use. See below for Job cost calculations.
  • Environmental Impact - A calculation of the energy used by the total hours of compute for each of the resource types you have used, as well as an estimate of the equivalent amount of CO2 emissions produced. See below for energy and CO2 emission calculations.
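
As a rough illustration of the year attribution rule described for Total Hours of Compute, the sketch below sums the full runtime of every job that started in a given year, even if it finished in the next one. The `Job` record and its field names are hypothetical, and runtime is assumed here to mean elapsed wall-clock time; neither reflects the actual schema of the reporting system.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Job:
    # Hypothetical record; field names are illustrative only.
    start: datetime
    end: datetime


def total_compute_hours(jobs, year):
    """Sum the full runtime (in hours) of every job that STARTED in `year`."""
    total_seconds = sum(
        (job.end - job.start).total_seconds()
        for job in jobs
        if job.start.year == year
    )
    return total_seconds / 3600


jobs = [
    Job(datetime(2023, 6, 1, 12, 0), datetime(2023, 6, 1, 14, 0)),   # 2 hours
    Job(datetime(2023, 12, 31, 22, 0), datetime(2024, 1, 1, 2, 0)),  # 4 hours, runs into 2024
    Job(datetime(2024, 1, 5, 0, 0), datetime(2024, 1, 5, 1, 0)),     # 1 hour
]

print(total_compute_hours(jobs, 2023))  # 6.0 (the job spanning New Year counts towards 2023)
print(total_compute_hours(jobs, 2024))  # 1.0
```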

Yearly Job Costs

The Yearly Job Costs section breaks down the resource utilisation and costs of your project by each resource type or partition that has been consumed within Slurm over a single calendar year. The following fields are presented:

  • Partition - The name of the Slurm resource partition which this row relates to. Links to a year-long report for that specific partition.
  • Total Jobs - The total number of jobs which were submitted to this partition in the listed year. This includes jobs submitted in the listed year but completed in the following year. This figure includes all possible job states, both completed and errors.
  • Total Errors - The total number of jobs which were submitted to this partition in the listed year which did not complete with a successful status. All exit status codes other than completed are considered errors.
  • Error Rate - The rate of errors for this partition in this year.
    • Derived as follows: (Total Errors ÷ Total Jobs) × 100 = Error Rate %.
    The colour/style applied to the cell depends on the error rate. See below for current error rate thresholds.
  • Total Compute Hours - A sum of the runtime of all of the Slurm jobs launched with the account code of this project in the calendar year for this named partition. This includes both jobs which completed successfully and those which did not.
  • Actual Cost - A sum of the costs for all jobs launched with the account code of the project in the calendar year against this partition if this is a funded, premium or VIP partition. You should expect to be invoiced for these costs. See below for Job cost calculations.
  • Indicative Cost - A sum of the costs for all jobs launched with the account code of the project in the calendar year against this partition if this is an unfunded or free partition. You will not be invoiced for these costs. See below for Job cost calculations.
  • Energy Use - An estimate of the energy used by the total compute hours this project used in this partition. See below for energy calculations.
  • CO2 Equivalent - An estimate of the equivalent amount of CO2 emissions produced by the total amount of compute hours this project used in this partition. See below for CO2 emission calculations.
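
The Error Rate derivation above can be sketched as follows. The list of job states is made-up example data, and treating a partition with no jobs as 0% is an assumption of this sketch rather than documented behaviour.

```python
def error_rate_percent(total_errors, total_jobs):
    """Return (Total Errors / Total Jobs) expressed as a percentage."""
    if total_jobs == 0:
        # Assumption: a partition with no jobs is reported as 0%.
        return 0.0
    return total_errors / total_jobs * 100


# Hypothetical final states for the jobs submitted to one partition.
states = [
    "COMPLETED", "FAILED", "COMPLETED", "TIMEOUT",
    "COMPLETED", "COMPLETED", "CANCELLED", "COMPLETED",
]

total_jobs = len(states)
# Any state other than completed is counted as an error.
total_errors = sum(1 for state in states if state != "COMPLETED")

print(total_jobs, total_errors)                      # 8 3
print(error_rate_percent(total_errors, total_jobs))  # 37.5
```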

Yearly Job Performance Data

The Yearly Job Performance Data section attempts to model the size and shape of the workloads of your project by each resource type or partition that has been consumed within Slurm over a single calendar year. The following fields are presented:

  • Partition - The name of the Slurm resource partition which this row relates to.
  • Min Runtime - The shortest length of time, in seconds, that a job submitted to this partition ran for. If this is zero, then it is likely that at least one job exited immediately after being scheduled; i.e. it did not run as expected.
  • Max Runtime - The longest length of time, in seconds, that a job submitted to this partition ran for.
  • Mean Runtime - The average length of time, in seconds, that jobs submitted to this partition ran for.
    • Derived as follows: (Sum Of Runtime Of All Jobs In This Partition ÷ Partition Total Jobs) = Mean Runtime Per Job.
  • Min Cores - The smallest number of CPU cores allocated to a job submitted to this partition.
  • Max Cores - The largest number of CPU cores allocated to a job submitted to this partition.
  • Mean Cores - The average number of CPU cores allocated to jobs submitted to this partition.
    • Derived as follows: (Sum Of All Cores Allocated For all Jobs In This Partition ÷ Partition Total Jobs) = Mean Cores Per Job.
  • Min RAM - The smallest amount of RAM allocated to a job submitted to this partition.
    • The RAM for a job is calculated as: (RAM Per Core × Cores Per Job = Total RAM Per Job).
  • Max RAM - The largest amount of RAM allocated to a job submitted to this partition.
    • The RAM for a job is calculated as: (RAM Per Core × Cores Per Job = Total RAM Per Job).
  • Mean RAM - The average amount of RAM allocated to jobs submitted to this partition.
    • Derived as follows: (Total Amount Of RAM Allocated For All Jobs In This Partition ÷ Partition Total Jobs) = Mean RAM Per Job.
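
The min, max and mean figures above can be sketched as below. The job tuples and the GB unit for RAM are invented for illustration; only the formulas (sum ÷ total jobs, and RAM per core × cores per job) come from the definitions in this section.

```python
# Hypothetical jobs for one partition: (runtime_seconds, cores, ram_per_core_gb)
jobs = [
    (0, 1, 4),      # exited immediately after being scheduled
    (3600, 4, 4),
    (7200, 10, 4),
]


def summarise(values):
    """Return min, max and mean (sum / number of jobs) for a list of values."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": sum(values) / len(values),
    }


runtimes = [runtime for runtime, _, _ in jobs]
cores = [n_cores for _, n_cores, _ in jobs]
# Total RAM per job = RAM per core x cores per job.
ram = [n_cores * per_core for _, n_cores, per_core in jobs]

print(summarise(runtimes))  # {'min': 0, 'max': 7200, 'mean': 3600.0}
print(summarise(cores))     # {'min': 1, 'max': 10, 'mean': 5.0}
print(summarise(ram))       # {'min': 4, 'max': 40, 'mean': 20.0}
```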

Month Summaries

The Month Summaries section allows you to jump to the Slurm Monthly Project Report for each month in the year for the project which is being viewed. The following summary fields are presented in the table:

  • Month - The month report which will be displayed if the link is followed.
  • Compute Hours - A sum of the runtime of all the Slurm jobs launched with the account code of this project in the shown month. This includes any jobs started in the listed month, but which ended in the following month.
  • Actual Costs - A sum of the costs for all jobs launched with the account code of the project in the shown month, against a funded, premium or VIP partition. See below for Job cost calculations.
  • Indicative Costs - A sum of the costs for all jobs launched with the account code of the project in the shown month against any unfunded or free partitions. See below for Job cost calculations.
  • Energy Used - An estimate of the amount of energy used by all compute resources used by this project in the shown month. See below for energy calculations.
  • CO2 Equivalent - An estimate of the equivalent amount of CO2 emissions produced by the total amount of compute hours this project used in the shown month. See below for CO2 emission calculations.

Slurm Monthly Project Report

The Slurm Monthly Report gives an overview of resource utilisation and costs for a single project over a single calendar month of a given calendar year.


Slurm Monthly Project Partition Report

The Slurm Monthly Partition Report gives an overview of resource utilisation and costs for a single partition used by a single project over a single calendar month of a given calendar year.


Job Cost Calculations

Costs are levied on jobs which are run by Slurm on behalf of a project.


Actual Costs

Actual costs are costs incurred by a project for running compute jobs against resource partitions which are funded, premium or VIP. Normally a project will only have access to these Slurm partitions if the project has been configured with a balance of funds. Funded partitions are not normally available to projects which are unfunded.

Any costs listed in an actual cost field are expected to be invoiced to the project and will be applied in arrears against any funds credited to your project.

For a full list of which partitions are available, and which have funded or unfunded resources, please refer to the HPC Resource List; all resource types clearly indicate the percentage available for funded vs unfunded use.

The actual cost for a single job using a CPU-only resource is calculated as:

  • (Hourly Compute Rate Of Resource × Number Of CPU Cores Allocated × Job Runtime In Hours) = Total Cost Per Job.

The actual cost for a single job using a GPU resource is calculated as:

  • (Hourly Compute Rate Of Resource × Number Of GPU Cards Allocated × Job Runtime In Hours) = Total Cost Per Job.

Be aware that if you submit a job to a GPU resource partition, you will be charged a minimum of one GPU card, whether you request GPU resources in your Slurm job or not. CPU-only jobs should not be submitted to GPU resource partitions.
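
The two formulas above, together with the one-card minimum on GPU partitions, can be sketched as below. The hourly rates are made-up placeholders with no currency implied; real rates are not given on this page. `Decimal` is used only to avoid floating-point rounding in monetary values.

```python
from decimal import Decimal


def job_cost(hourly_rate, units_allocated, runtime_hours):
    """Hourly rate x allocated units (CPU cores or GPU cards) x runtime in hours."""
    return hourly_rate * units_allocated * runtime_hours


def gpu_job_cost(hourly_rate, gpus_requested, runtime_hours):
    """A job on a GPU partition is charged for at least one GPU card."""
    return job_cost(hourly_rate, max(1, gpus_requested), runtime_hours)


# Hypothetical rates, for illustration only.
CPU_RATE = Decimal("0.01")  # per core, per hour
GPU_RATE = Decimal("0.50")  # per card, per hour

# CPU-only job: 8 cores for 10 hours.
print(job_cost(CPU_RATE, 8, 10))     # 0.80

# Job on a GPU partition that requested no GPUs, running for 4 hours:
# it is still charged for one card.
print(gpu_job_cost(GPU_RATE, 0, 4))  # 2.00
```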


Indicative Costs

Indicative costs are theoretical costs incurred by a project for running compute jobs against unfunded resource partitions. These are resources which the University has made available to all users as part of the basic provision. Unfunded partitions are normally available to all users.

Any costs listed in an indicative cost field will not be raised as an invoice against the project. These figures are intended to show the relative financial contribution the University has made to your project.

For a full list of which partitions are available, and which have funded or unfunded resources, please refer to the HPC Resource List; all resource types clearly indicate the percentage available for funded vs unfunded use.
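
As a minimal sketch of how job costs divide into the two totals, the example below adds each job's cost to either the actual or the indicative sum based on the type of partition it ran on. The partition type labels and cost figures are hypothetical; consult the HPC Resource List for the real classification of each partition.

```python
from decimal import Decimal

# Partition types whose costs are invoiced (actual costs).
# The label strings are an assumption made for this sketch.
CHARGEABLE_TYPES = {"funded", "premium", "vip"}


def split_costs(job_costs):
    """Split (partition_type, cost) pairs into (actual, indicative) totals."""
    actual = Decimal("0")
    indicative = Decimal("0")
    for partition_type, cost in job_costs:
        if partition_type.lower() in CHARGEABLE_TYPES:
            actual += cost
        else:
            # Unfunded or free partitions: shown for information, not invoiced.
            indicative += cost
    return actual, indicative


# Hypothetical per-job costs.
job_costs = [
    ("funded", Decimal("12.50")),
    ("free", Decimal("3.25")),
    ("VIP", Decimal("40.00")),
    ("unfunded", Decimal("1.75")),
]

actual, indicative = split_costs(job_costs)
print(actual)      # 52.50
print(indicative)  # 5.00
```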

Energy Use Calculations

CO2 Equivalent Calculations

Error Rate Thresholds

Error rates indicate the proportion of jobs submitted against a partition which resulted in a non-completed status. Note that error rates are based on job counts (jobs with errors vs all jobs submitted), not on the ratio of successful job runtime to error job runtime; the number of hours spent on jobs with errors is shown independently.

  • High Error Rate - Error rates >=30% are classed as high and will show a warning banner at the top of any report page in which a high error rate is encountered. An error rate at this level almost certainly indicates that a substantial amount of resources is being wasted on incorrectly written scripts or jobs which have not been optimised for the partition they have been launched on. Cases of these errors must be investigated as a priority.
  • Moderate Error Rate - Error rates 15% - 30% are classed as moderate and will show a warning banner at the top of any report page in which a moderate error rate is encountered. An error rate at this level may indicate that some further optimisation of resources is needed, or that there are some errors present in the scripts or code.
  • Low Error Rate - Error rates 7.5% - 15% are classed as low.
  • Minimal Error Rate - Error rates 1% - 7.5% are classed as minimal.
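
The thresholds above could be applied along the following lines. The list does not say which band a rate exactly on a boundary falls into, nor how rates below 1% are treated, so this sketch assumes each lower bound is inclusive and returns `None` below 1%; the reports themselves may behave differently.

```python
def classify_error_rate(rate_percent):
    """Map an error rate (in percent) to a band name.

    Assumptions: lower bounds are inclusive, and rates below 1%
    are left unclassified (None).
    """
    if rate_percent >= 30:
        return "high"
    if rate_percent >= 15:
        return "moderate"
    if rate_percent >= 7.5:
        return "low"
    if rate_percent >= 1:
        return "minimal"
    return None


for rate in (37.5, 20, 10, 2, 0.5):
    print(rate, classify_error_rate(rate))
```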