This is a very simple example of how to run an application/script/code/analysis in parallel on multiple input files.
For a more complete worked example, consider reading through our Advanced Slurm Job Optimisation guide.
Where you have multiple sets of data to process, and they can all be processed using the same commands, it is beneficial to ask Slurm to automate this for you.
Instead of submitting 4 jobs, each taking 1 hour to process a data file, we can submit one job to run all 4 at the same time.
--partition=default_free
--ntasks-per-node=4
--cpus-per-task=1
ntasks_per_node * cpus_per_task
--mem=1G
--time=01:00:00
(ntasks_per_node * cpus_per_task) * time_in_hours
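As a quick check of the arithmetic, the two expressions above can be evaluated for the values requested in this example. This is only a minimal sketch: the variable names mirror the expressions and are not Slurm settings, and reading the second expression as a core-hours figure is our interpretation.

```shell
#!/bin/bash
# Values taken from the resource request above.
ntasks_per_node=4
cpus_per_task=1
time_in_hours=1

# First expression: ntasks_per_node * cpus_per_task
total_cores=$(( ntasks_per_node * cpus_per_task ))

# Second expression: (ntasks_per_node * cpus_per_task) * time_in_hours
# (interpreted here as core-hours; that label is an assumption)
core_hours=$(( total_cores * time_in_hours ))

echo "Total cores: ${total_cores}"
echo "Core-hours:  ${core_hours}"
```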
Of the single-node Slurm job types, the Task Array is the most complex, and its job script needs a little more explanation than the previous types. Its advantage is that you do not need to change anything in your existing application, code or script: if it can already process a named data or input file, it will work in parallel via the Task Array without any changes.
#!/bin/bash
#SBATCH --account=myhpcproject
#SBATCH --partition=default_free
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4
#SBATCH --cpus-per-task=1
#SBATCH --mem=1G
#SBATCH --time=01:00:00

# Note: Slurm only sets SLURM_ARRAY_TASK_ID when the job runs as a job array,
# for example when it is submitted with: sbatch --array=1-4 <this script>

# Log when we started
echo "Job started at: $(date)"

# Add your custom commands you want to run here
my_command input_file.data.${SLURM_ARRAY_TASK_ID} > output.log.${SLURM_ARRAY_TASK_ID}

# Log when we finished
echo "Job finished at: $(date)"
Notice that we have used a new Slurm variable: SLURM_ARRAY_TASK_ID. This variable receives the ID of each array task launched by Slurm. Slurm only sets it when the job runs as a job array (for example, one submitted with --array=1-4); the --ntasks-per-node option on its own does not set it. With four array tasks, each task which Slurm launches for us will see its own Task ID in $SLURM_ARRAY_TASK_ID, e.g.
Task #1 will see SLURM_ARRAY_TASK_ID = 1
Task #2 will see SLURM_ARRAY_TASK_ID = 2
Task #3 will see SLURM_ARRAY_TASK_ID = 3
Task #4 will see SLURM_ARRAY_TASK_ID = 4
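To make the mapping from Task ID to file names concrete, the following minimal sketch builds the per-task input and output names from the variable. The fallback value of 1 is an assumption added purely so the snippet can be tried outside Slurm, where the variable is not set; the file name patterns follow the job script in this guide.

```shell
#!/bin/bash
# Use the ID Slurm provides; fall back to 1 when run outside Slurm
# (the fallback is an assumption, for local testing only).
task_id="${SLURM_ARRAY_TASK_ID:-1}"

# Build this task's file names from its ID.
input="input_file.data.${task_id}"
output="output.log.${task_id}"

echo "Task ${task_id} reads ${input} and writes ${output}"
```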
This allows each task to run the same commands on a different set of data. As long as we name the input data files with the Task IDs we expect (in this case, 1 through 4), we can set up a directory of data files to be processed all at the same time.
In the case above, assuming we had a command named my_command which processes some data, we will create a directory of input data files named:
input_file.data.1
input_file.data.2
input_file.data.3
input_file.data.4
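If your existing data files do not already follow this numbering, one possible approach is to create numbered links to them rather than renaming the originals. The sketch below assumes a hypothetical directory named raw holding three sample files with arbitrary names; both the directory and the sample file names are invented for illustration, and the demonstration runs in a temporary directory.

```shell
#!/bin/bash
# Work in a scratch directory so the demonstration does not touch real data.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical layout: original files live in ./raw with arbitrary names.
mkdir -p raw
touch raw/alpha.csv raw/beta.csv raw/gamma.csv

# Link each original file to the numbered name the job script expects.
n=0
for f in raw/*; do
    n=$(( n + 1 ))
    ln -sf "$f" "input_file.data.${n}"
done

echo "Prepared ${n} input files"
```

The final value of n is also the upper bound of Task IDs that the data set needs.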
Slurm will launch the job script and then each one of those files would be processed, in parallel, by my_command. In effect, Slurm does this for us:
my_command input_file.data.1 > output.log.1
my_command input_file.data.2 > output.log.2
my_command input_file.data.3 > output.log.3
my_command input_file.data.4 > output.log.4
But all four run at the same time.
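Before submitting to Slurm, it can help to rehearse this pattern on a small scale. The following is a rough local simulation, not how Slurm itself works internally: it sets SLURM_ARRAY_TASK_ID by hand and runs each command in the background. The my_command function here is a hypothetical stand-in that just counts lines in its input file.

```shell
#!/bin/bash
# Hypothetical stand-in for the real program: counts lines in its input file.
my_command() { wc -l < "$1"; }

# Work in a scratch directory with four small sample input files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
for i in 1 2 3 4; do
    printf 'sample line\n' > "input_file.data.${i}"
done

# Start one background process per Task ID, imitating concurrent tasks.
for SLURM_ARRAY_TASK_ID in 1 2 3 4; do
    my_command "input_file.data.${SLURM_ARRAY_TASK_ID}" > "output.log.${SLURM_ARRAY_TASK_ID}" &
done

# Wait for all background processes to finish before listing the results.
wait
ls output.log.*
```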
This is a very simple way of processing large numbers of data files with exactly the same code/application/script at the same time. It is limited by the maximum number of running jobs that your project is allowed (see the MaxJobs value in the Comet Resource Limits - what they mean guide) and requires some planning beforehand to structure your data.
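When there are many more files than can run at once, Slurm's array syntax accepts a "%N" suffix that caps how many array tasks run simultaneously. The sketch below only builds and prints such a submission command so that it can be checked first; the file count, the cap of 10 and the script name task_array.sh are all hypothetical values, and how this cap interacts with your project's limits is something to confirm against the resource limits guide.

```shell
#!/bin/bash
# Hypothetical values: adjust to your own data set and project limits.
n_files=100                  # input files named input_file.data.1 .. input_file.data.100
max_running=10               # run at most this many array tasks at once
job_script="task_array.sh"   # hypothetical name of the job script

# "%N" after the range tells Slurm to run at most N array tasks at a time.
submit_cmd="sbatch --array=1-${n_files}%${max_running} ${job_script}"

# Print the command rather than running it, so it can be reviewed first.
echo "${submit_cmd}"
```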
A more complete, worked example can be found in the Advanced Slurm Job Optimisation article.