If you are running MPI jobs, you may immediately think that the low-latency nodes are what you should be using.
That instinct is understandable, but the truth is a little more nuanced.
If your MPI job fits entirely within the resource constraints of a single node, then you will get better performance by restricting the Slurm job to a single compute node, rather than distributing it across multiple physical nodes. Process-to-process messaging within the same node is far faster than sending a message out of the node over the (admittedly very fast) InfiniBand connection to another node.
If you are running MPI jobs that use tens of cores, then you are better off running those jobs on the default_paid partition, on a single node, as the job can be run entirely within the resources of that single machine.
Only if your job needs hundreds of cores should you move to the low-latency_paid partition.
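
The rule of thumb above can be sketched as a small helper that picks the sbatch options for a given number of MPI ranks. This is only an illustration: the function name `sbatch_options` and the `cores_per_node` parameter are hypothetical, the 64 cores per node used in the example is an assumed figure rather than the real size of the nodes, and the sketch checks core count only, not memory or other per-node limits.

```python
from __future__ import annotations


def sbatch_options(ntasks: int, cores_per_node: int) -> list[str]:
    """Suggest sbatch options for an MPI job with `ntasks` ranks.

    `cores_per_node` is an assumption supplied by the caller; check the
    actual node size of the partition before relying on it.
    """
    if ntasks < 1 or cores_per_node < 1:
        raise ValueError("ntasks and cores_per_node must be positive")

    if ntasks <= cores_per_node:
        # The job fits on one machine: keep every rank on a single node
        # so that messages never have to leave the node.
        return ["--partition=default_paid", "--nodes=1", f"--ntasks={ntasks}"]

    # The job is too large for one node: use the low-latency partition
    # and let Slurm spread the ranks across nodes.
    return ["--partition=low-latency_paid", f"--ntasks={ntasks}"]


if __name__ == "__main__":
    # Assumed node size of 64 cores, for illustration only.
    print(" ".join(sbatch_options(32, cores_per_node=64)))
    print(" ".join(sbatch_options(256, cores_per_node=64)))
```

The returned options can be passed on the sbatch command line or copied into `#SBATCH` lines at the top of a job script.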