Table of Contents

HPC Resources & Partitions - Comet

Partitions for Unfunded Projects

These partitions and resources are available to all Comet users. Beyond our Acceptable Use Policy, there are no restrictions on the use of these resources.

| Partition | Node Types | Nodes | GPU | Open OnDemand | Max Resources | Default Runtime | Maximum Runtime | Default Memory |
|---|---|---|---|---|---|---|---|---|
| short_free | Standard.b | 7 | No | No | | 10 minutes | 30 minutes | 1GB per core |
| default_free | Standard.b | 7 | No | No | | 24 hours | 48 hours | 1GB per core |
| long_free | Standard.b | 7 | No | No | | 4 days | 14 days | 1GB per core |
| highmem_free | Large.b | 2 | No | No | | 24 hours | 5 days | 4GB per core |
| gpu-s_free | GPU-S | 1 | gpu:L40:8 per node | No | | 24 hours | 14 days | 2GB per core |
| interactive-std_free | Standard.b | 7 | No | Yes | | 2 hours | 8 hours | 1GB per core |
| interactive-gpu_free | GPU-S | 1 | gpu:L40:8 per node | Yes | | 2 hours | 8 hours | 2GB per core |
| interactive-hmem_free | Large.b | 2 | No | Yes | | 2 hours | 8 hours | 4GB per core |

Partitions for Funded Projects

These partitions are available to all projects that have allocated funds to their Comet HPC Project accounts.

If you have not allocated funds to your HPC Project, or your balance is negative, then you will not be able to submit jobs to these partitions; any submitted jobs will be automatically cancelled.

For further details on paid resource types, see our Billing & Project Funds policy page.

| Partition | Node Types | Nodes | GPU | Open OnDemand | Max Resources | Default Runtime | Maximum Runtime | Default Memory |
|---|---|---|---|---|---|---|---|---|
| short_paid | Standard.b | 29 | No | No | | 10 minutes | 30 minutes | 1GB per core |
| default_paid | Standard.b | 29 | No | No | | 24 hours | 48 hours | 1GB per core |
| long_paid | Standard.b | 29 | No | No | | 4 days | 14 days | 1GB per core |
| highmem_paid | Large.b | 8 | No | No | | 24 hours | 5 days | 4GB per core |
| gpu-s_paid | GPU-S | 3 | gpu:L40:8 per node | No | | 24 hours | 14 days | 2GB per core |
| gpu-l_paid | GPU-L | 1 | gpu:H100:4 per node | No | | 24 hours | 14 days | 2GB per core |
| interactive-std_paid | Standard.b | 29 | No | Yes | | 2 hours | 8 hours | 1GB per core |
| interactive-gpu_paid | GPU-S | 3 | gpu:L40:8 per node | Yes | | 2 hours | 8 hours | 2GB per core |
| low-latency_paid | Standard.Lowlatency | 4 | No | No | 1024 cores | 24 hours | 4 days | 1GB per core |
| interactive-hmem_paid | Large.b | 8 | No | Yes | | 2 hours | 8 hours | 4GB per core |

Partition Descriptions

short

The short partition is intended for quick tests, proof of concept runs, debugging and other tasks which can be completed quickly. It is not intended to run entire compute jobs.

Examples
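As a sketch, a quick self-test on the free short partition could be submitted with a script like the following (the program name and its flag are placeholders):

```shell
#!/bin/bash
# Hypothetical quick-test job on the free short partition.
#SBATCH --partition=short_free
#SBATCH --time=00:10:00        # the default; maximum is 30 minutes
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
#SBATCH --mem=2G               # overrides the 1GB-per-core default

./my_program --self-test       # placeholder: a quick sanity check, not a full run
```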

default

It is always necessary to specify a partition, e.g. --partition=default_free, in your sbatch file.

The default partitions (default_free, default_paid) have the largest number of general CPU resources in the Comet HPC facility and are intended to run the bulk of our compute workloads that do not require multi-node MPI / low-latency networking or GPUs.

Default runtime is set to 24 hours and default memory allocation is set to 1GB per allocated CPU core. There is no defined maximum memory allocation - this is limited by the size of the Standard.b nodes it is built on.

It is your responsibility to determine the most appropriate runtime (in the range of 0 - 48 hours), required number of CPU cores and memory allocation for your specific application.

Examples
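A typical single-node batch job on the default partition might be sketched as below (core count, memory figure and program name are illustrative):

```shell
#!/bin/bash
# Hypothetical batch job on the free default partition.
#SBATCH --partition=default_free
#SBATCH --time=24:00:00        # the default; anywhere up to the 48-hour maximum
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=16
#SBATCH --mem-per-cpu=4G       # request more than the 1GB-per-core default if needed

./my_program input.dat         # placeholder program and input file
```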

long

The long partition has the same hardware resources as the default partition, since it is based on the same number and type of nodes (Standard.b); however, the runtime is extended over default, with a default of 4 days and a maximum of 14 days.

Examples
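An extended run on the free long partition could look like the following sketch (the one-week runtime and program name are illustrative):

```shell
#!/bin/bash
# Hypothetical extended job on the free long partition.
#SBATCH --partition=long_free
#SBATCH --time=7-00:00:00      # 7 days; anywhere up to the 14-day maximum
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8

./my_long_running_program      # placeholder for a multi-day workload
```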

highmem

The capabilities of the highmem partition have been incorporated into the default and low-latency partitions; both of those partitions have the exact same memory capacity. Use default for jobs not requiring low-latency comms, and low-latency for large-scale MPI workflows spanning hundreds of cores.

The highmem partition allows jobs which need a larger amount of memory to be run. Note that unlike the Standard.a compute nodes of Rocket (128GB), the Comet Standard.b compute nodes are substantially larger (1.1TB), so you may not need to use the Large.b compute nodes (1.5TB) for many large jobs.

The Large.b compute nodes are also connected by a faster network; if you need to run large processes across multiple nodes simultaneously via MPI, and they do not fit the low-latency node types, then highmem may be an option for you.

By default, jobs submitted to the highmem partition are able to run longer (up to 5 days) than the default partition (2 days), though not as long as the long partition (14 days). Consider this partition if your workload needs more than 1TB of memory on a single node; otherwise the default or long partitions may be more suitable for you.

Examples
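A sketch of a large-memory job on the free highmem partition (the memory figure and program name are illustrative):

```shell
#!/bin/bash
# Hypothetical large-memory job on the free highmem partition.
#SBATCH --partition=highmem_free
#SBATCH --mem=1200G            # more than fits on a 1.1TB Standard.b node
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=32
#SBATCH --time=2-00:00:00      # anywhere up to the 5-day maximum

./my_large_memory_program      # placeholder for an in-memory workload
```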

gpu-s

The gpu-s partition uses the GPU-S node type on Comet. These nodes are suitable for most types of GPU-accelerated compute, though please do check whether any of your CUDA/OpenCL code paths require double-precision (FP64) capability; consult the Nvidia L40S datasheet, as these cards are restricted in that mode. The nodes hosting the L40S cards are also connected via faster networking, just like highmem and low-latency, so you can also take advantage of faster inter-node communication if your job spans more than one node, as well as faster IO speeds to/from the main NOBACKUP storage.

Any jobs run on the gpu-s partition are costed by the number of GPU cards you request: a job requesting two cards costs twice as much as one using a single card for the same amount of time. Users of our paid partitions should take care when requesting resources via Slurm to request and use only what you actually need. Users of our unpaid gpu-s partition incur no costs for the use of GPU cards, but the number available is strictly limited.

Default runtime is up to 24 hours, but you may request up to a maximum of 14 days. The use of GPU resources is closely monitored.

Examples
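A single-card job on the free GPU partition might be submitted as in the sketch below. The exact GRES request string is an assumption (the tables above list the resource as gpu:L40:8 per node, but check the exact form accepted on Comet), and the program name is a placeholder:

```shell
#!/bin/bash
# Hypothetical single-GPU job on gpu-s_free.
#SBATCH --partition=gpu-s_free
#SBATCH --gres=gpu:1           # one L40S card; jobs are costed per card on the paid partition
#SBATCH --cpus-per-task=8
#SBATCH --time=24:00:00        # the default; up to 14 days may be requested

./my_cuda_program              # placeholder GPU-accelerated program
```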

gpu-l

The gpu-l partition uses a single node type, GPU-L, which contains a very small number of Nvidia H100 cards. These cards represent some of the most powerful GPU compute options currently available; see the Nvidia H100 datasheet for further information. This node type is also connected to NOBACKUP via faster networking, as with gpu-s, highmem and low-latency, to take advantage of faster IO read/write facilities.

As with gpu-s, a job requesting two cards costs twice as much as one using a single card for the same amount of time. Take care when requesting resources via Slurm to request and use only what you actually need; this partition represents the most costly use of your HPC Project balance. Be certain of your job parameters before you launch a multi-day compute run.

There is no unpaid access to the gpu-l partition (it only exists as gpu-l_paid, and not as gpu-l_free).

All users must be members of at least one HPC Project with a positive balance.

Default runtime, as with gpu-s, is up to 24 hours, but a maximum of 14 days may be requested.

Examples
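Since gpu-l only exists as a paid partition, jobs must be charged to a funded project. A minimal sketch, in which the account code, GRES string and program name are all placeholders:

```shell
#!/bin/bash
# Hypothetical H100 job; gpu-l is paid-only, so a funded account code is required.
#SBATCH --partition=gpu-l_paid
#SBATCH --account=comet_training   # substitute your funded project account code
#SBATCH --gres=gpu:1               # costed per card - request only what you need
#SBATCH --time=24:00:00            # the default; up to 14 days may be requested

./my_h100_program                  # placeholder GPU-accelerated program
```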

low-latency

The low-latency partition is built on several of the Standard.Lowlatency nodes. These are identical to the Standard.b node type in terms of CPU cores, RAM and local scratch storage, but are connected by a much faster 200Gbit/s InfiniBand network.

For users who need to run many hundreds of processes simultaneously, typically via MPI, the low-latency partition is ideal. There are currently four nodes of this type, allowing jobs of up to 1024 cores in this partition. This represents a doubling of the capacity of Rocket.

There is no unpaid access to the low-latency partition (it only exists as low-latency_paid, and not low-latency_free).

All users must be members of at least one HPC Project with a positive balance.

Examples
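A multi-node MPI job on the paid low-latency partition might be sketched as below. The per-node task count, module name and binary are assumptions about the local setup, and the account code is a placeholder:

```shell
#!/bin/bash
# Hypothetical multi-node MPI job; low-latency is paid-only.
#SBATCH --partition=low-latency_paid
#SBATCH --account=comet_training   # substitute your funded project account code
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=256      # assumed core count per Standard.Lowlatency node
#SBATCH --time=24:00:00            # anywhere up to the 4-day maximum

module load OpenMPI                # module name is an assumption
srun ./my_mpi_program              # srun launches one MPI rank per task
```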

Most users who need the low-latency partition will already know they need to use it, but the typical requirements of jobs which run there include:


interactive-std

The interactive-std partitions are implemented on top of the Standard.b node type, and share all of the same resources and characteristics.

The use of this partition is not intended for Slurm batch jobs, but for users who launch a graphical/desktop environment or dedicated application instance using the Open OnDemand interface.

This effectively gives you a massively powerful Linux workstation (up to the size of an entire Standard.b node) which you can book out for short periods of time.

The interactive partition schedules jobs which need a dedicated graphical interface, but not necessarily any specific GPU resource; if you need to run a graphical version of Matlab, R Studio, Jupyter, Gaussian, or just a full Linux X11 desktop, then you can request this via Open OnDemand, and your job will be allocated resources from the Standard.b nodes; i.e. using CPU compute and RAM.

Accessing Open OnDemand only needs a web browser; no other special software is needed on your local device.

No GPU acceleration is available to jobs in this partition.

Examples


interactive-gpu

The interactive-gpu partitions are implemented on top of the GPU-S node type, and share all of the same resources and characteristics. This includes access to the Nvidia L40S 48GB GPU card(s) for a desktop session.

Note: The GPU-L node type is not available for interactive use.

The use of this partition is not intended for Slurm batch jobs, but for users who launch a graphical/interactive environment or dedicated application instance using the Open OnDemand interface and which need dedicated Nvidia GPU resources; e.g. for visualisation or rendering.

This effectively gives you a massively powerful Linux GPU workstation (up to the size of an entire GPU-S node) which you can book out for short periods of time.

Examples

Any application hosted on the HPC facility which runs as an interactive Linux application and requires or can benefit from GPU-accelerated compute, or from a dedicated GPU for OpenGL display or rendering. This includes, but is not limited to:


interactive-hmem

The interactive-hmem partition is no longer in use - please use interactive-std instead; this shares the same memory capacity.

The interactive-hmem partitions were implemented on top of the Large.b node type, and share all of the same resources and characteristics. This partition was not intended for Slurm batch jobs, but for users launching a graphical/interactive environment or dedicated application instance via the Open OnDemand interface with large in-memory data requirements. This effectively gave you a massively powerful Linux desktop workstation (up to the size of an entire Large.b node) which could be booked out for short periods of time. No GPU acceleration is available on this partition.


Resource Limits

Compared to Rocket, Comet has a substantially greater number of compute resources, though in a slightly smaller physical number of machines (one Comet compute node is equivalent to 4 or 5 Rocket nodes in terms of CPU cores, and more than 10 times in terms of RAM capacity).

This means we do need to take some measures to prevent the monopolisation of resources by a small number of projects, users and jobs.

The following standard Slurm trackable resources (TRES) are enabled for Comet:

The limits are applied with one of two possible values. The value applied is based on whether your project is funded (and has a positive remaining balance) or is unfunded (or has a negative remaining balance).

| Resource | Unfunded Projects | Funded Projects | Description |
|---|---|---|---|
| MaxSubmitJobs | 128 | 512 | The maximum number of jobs the account code / project can have in the PENDING state at any time. Once this limit is reached you may not submit any further jobs, until the PENDING number has reduced. |
| MaxJobs | 256 | 1024 | The maximum number of jobs the account code / project can have RUNNING at any point in time. No further jobs with this account code can start until the number running has fallen below this limit. As the total falls below this limit, Slurm will automatically start new jobs from your PENDING list until the limit is reached again. |
| CPU | 512 | 2048 | The total number of CPU cores that all RUNNING jobs using this account code / project are using at any point in time. If this limit is reached then any further jobs will remain PENDING until the total falls below this limit. |
| Nodes | Unlimited | Unlimited | The total number of nodes allocated to all of your RUNNING jobs using this account code / project. This includes both entire nodes explicitly requested, plus any nodes dynamically assigned to individual jobs. Currently not enforced - set to unlimited. |
| GPU | 1 | 8 | The total number of GPU cards that can be used by any RUNNING jobs using this account code / project. If this limit is reached then any further jobs will remain PENDING until the total falls below this limit. |
| GPU RAM | Unlimited | Unlimited | The total amount of allocated GPU RAM that can be used by any RUNNING jobs using this account code / project. Not currently enforced. |
| RAM | Unlimited | Unlimited | The total amount of allocated RAM that can be used by any RUNNING jobs using this account code / project. Not currently enforced. |
| Local Disk | Unlimited | Unlimited | The total amount of local disk space on compute nodes (i.e. /tmp or /scratch) that can be allocated by any RUNNING jobs using this account code / project. Not currently enforced. |

It is important to note that these limits are applied at the account code (aka project) level. All members of a project submitting jobs with the same account code will contribute towards the totals.

As per How can I tell which account codes I can use and which partitions I can submit to?, you can check the limits for your project as follows (substitute comet_training with one of your real project account codes):

$ sacctmgr list associations where account=comet_training format=Account%20,User%16,Partition%24,GrpTRES%30,MaxSubmitJobs,MaxJobs
             Account             User                Partition                        GrpTRES MaxSubmit MaxJobs 
-------------------- ---------------- ------------------------ ------------------------------ --------- ------- 
      comet_training                                                      cpu=2048,gres/gpu=8       512    1024
      comet_training           n12345             default_free                                      512    1024 
      comet_training           n12345             default_paid                                      512    1024 
      comet_training           n12345               gpu-l_paid                                      512    1024 
      comet_training           n12345               gpu-s_free                                      512    1024 
      comet_training           n12345               gpu-s_paid                                      512    1024 
...
...
...
$

You can see in this case that the project has the upper, funded limits applied (2048 CPU cores, 8 GPU cards), and that there are no current limits on Disk, RAM, Nodes or GPU RAM (i.e. unlimited).

To monitor what your project (and project members) are currently using, you can use this variation of the squeue command:

$ squeue -A comet_1234 --format="%.9A %.16a %.16u %.16j %.4C %.11M %.11L %.24N %r"
   JOBID          ACCOUNT             USER             NAME      CPUS        TIME   TIME_LEFT                 NODELIST REASON
  1198357       comet_1234         n12345            job1.sh      512        0:00  2-00:00:00                          AssocGrpCpuLimit
  1198357       comet_1234         n23456            job2.sh      512        0:00  2-00:00:00                          AssocGrpCpuLimit
  1198112       comet_1234         n12345            task2.sh     768  1-01:40:10    22:19:50            hmem[004-006] None
  1198527       comet_1234         n23456            run.sh       256     1:30:00    10:30:00             ibcompute004 None
  1198133       comet_1234         n12345            compute.sh   512  1-01:26:10    22:33:50         compute[026-027] None

This shows that two jobs (job1.sh and job2.sh) are currently held in the PENDING state because the account CPU limit has been reached. As jobs 1198112, 1198527 and 1198133 finish, these pending jobs should be started automatically by Slurm. No further action is needed.

If you have a use case which needs resources beyond the limits of these figures please get in touch to discuss the options.


Back to Getting Started