====== Task Array Slurm Job ======

<WRAP tip round box>
This is a __very__ simple example of how to run an application/script/code/analysis in parallel on multiple input files.

For a more complete worked example, consider reading through our [[advanced:slurm|Advanced Slurm Job Optimisation]] guide.
</WRAP>

Where you have multiple sets of data to process, and they can all be processed using the same commands, it is beneficial to ask Slurm to automate this for us.

Instead of submitting 4 separate jobs, each taking 1 hour to process a data file, we can submit one job to run all 4 at the same time.

   * Uses the HPC Project group **myhpcproject**; change to use your //real// HPC Project name
   * Submitted to the //free// **default_free** partition (''--partition=default_free'')
   * Requests an array of **4** tasks (''--array=1-4''); Slurm runs the same job script once for each Task ID
   * Requests **1 CPU** per array task (''--ntasks-per-node=1'' and ''--cpus-per-task=1''), for a total allocation of **4 CPU** cores when all 4 tasks run at once (''array_tasks * cpus_per_task'')
   * Requests **1GB** of RAM for each array task (''--mem=1G'')
   * Requests up to **1 hour** of runtime for each array task (''--time=01:00:00''), for a maximum //possible// total of **4 Compute Hours** (''(array_tasks * cpus_per_task) * time_in_hours'')
   * Prints the time the job started to the log file
   * Prints the name of the compute node(s) it will run on
   * Prints the time the job finished to the log file

Of the single-node Slurm job types, the //Task Array// job type is the most complex, and its job script needs a little more explanation than the previous types. The advantage is that you do not need to change anything in your existing application/code/script: if it can already process a named data/input file, then it will work in parallel via the //Task Array// __without__ any changes.

<code lang=bash title=slurm_task_array.sh>
#!/bin/bash

#SBATCH --account=myhpcproject
#SBATCH --partition=default_free
#SBATCH --array=1-4
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=1G
#SBATCH --time=01:00:00

# Log when we started
echo "Job started at: $(date)"

# Log which compute node(s) this task is running on
echo "Job is running on: ${SLURM_JOB_NODELIST}"

# Add your custom commands you want to run here
my_command input_file.data.${SLURM_ARRAY_TASK_ID} > output.log.${SLURM_ARRAY_TASK_ID}

# Log when we finished
echo "Job finished at: $(date)"
</code>

Notice that we have used a new Slurm variable: ''SLURM_ARRAY_TASK_ID''. This is an environment variable which receives the Task ID of each array task launched by Slurm. Remember that we have asked for an array of **4** tasks (''--array=1-4''), so each of those tasks which Slurm launches for us will see its own Task ID in ''$SLURM_ARRAY_TASK_ID'', e.g.

<code>
Task #1 will see SLURM_ARRAY_TASK_ID = 1
Task #2 will see SLURM_ARRAY_TASK_ID = 2
Task #3 will see SLURM_ARRAY_TASK_ID = 3
Task #4 will see SLURM_ARRAY_TASK_ID = 4
</code>

This allows each task to run the same commands on a different set of data. As long as we name the input data files with the numbers of the Task IDs we expect (in this case, 1 through 4), we can set up a directory of data files to be processed all at the same time.

In the case above, assuming we had a command named ''my_command'' which processes some data, we will create a directory of input data files named:

   * ''input_file.data.1''
   * ''input_file.data.2''
   * ''input_file.data.3''
   * ''input_file.data.4''
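
Before submitting, it can be worth confirming that every expected input file is actually present, since a missing file would otherwise only show up once its task starts. The following is a minimal sketch and not part of the job script itself; the function name ''check_inputs'' is made up for illustration, and the file prefix and ID range are assumed to match the list above.

```bash
#!/bin/bash

# Hypothetical helper: report any missing input files for Task IDs first..last.
# Usage: check_inputs <prefix> <first_id> <last_id>
check_inputs() {
    local prefix="$1" first="$2" last="$3"
    local missing=0 id

    for id in $(seq "$first" "$last"); do
        if [ ! -f "${prefix}.${id}" ]; then
            echo "Missing input file: ${prefix}.${id}"
            missing=$((missing + 1))
        fi
    done

    # Return non-zero if anything was missing, so callers can stop early
    [ "$missing" -eq 0 ]
}

# Example: check the four files used on this page
if check_inputs input_file.data 1 4; then
    echo "All input files present"
else
    echo "Fix the missing files before running sbatch"
fi
```

Run it from the directory that holds the data files, before running ''sbatch''.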

Slurm will launch the job script and each one of those files will be processed, in parallel, by ''my_command''. In effect, Slurm does this for us:

<code>
my_command input_file.data.1 > output.log.1
my_command input_file.data.2 > output.log.2
my_command input_file.data.3 > output.log.3
my_command input_file.data.4 > output.log.4
</code>

But, __all at the same time__.
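
To see what each task would execute without submitting anything, the variable can be set by hand. This is a rough sketch only: it loops over the IDs one after another on your current machine, whereas Slurm would start them as separate tasks, and it only prints the command line instead of running the placeholder ''my_command''. The helper name ''dry_run'' is hypothetical.

```bash
#!/bin/bash

# Hypothetical helper: print the command line each Task ID would run.
# Slurm normally sets SLURM_ARRAY_TASK_ID; here we set it by hand.
# Usage: dry_run <first_id> <last_id>
dry_run() {
    local SLURM_ARRAY_TASK_ID
    for SLURM_ARRAY_TASK_ID in $(seq "$1" "$2"); do
        echo "my_command input_file.data.${SLURM_ARRAY_TASK_ID} > output.log.${SLURM_ARRAY_TASK_ID}"
    done
}

# Show the commands for Task IDs 1 through 4
dry_run 1 4
```

Checking the printed file names against your data directory is a cheap way to catch naming mistakes early.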

This is a very simple way of processing huge numbers of data files with exactly the same code/application/script at the same time. However, it is limited to the maximum number of running jobs that your project is allowed (see the **MaxJobs** value in the [[started:comet_resources#resource_limits|Comet Resource Limits - what they mean]] guide) and requires some planning beforehand to structure your data.
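
If renaming your data files is not practical, one common alternative is to keep a plain text list of file names and let each task pick the line matching its Task ID. This is a sketch under assumptions, not the method used in the script above: the list file ''file_list.txt'', the sample file names and the helper ''pick_input'' are all hypothetical.

```bash
#!/bin/bash

# Hypothetical helper: print line number <task_id> of a file list.
# Usage: pick_input <list_file> <task_id>
pick_input() {
    sed -n "${2}p" "$1"
}

# Inside a job script, each array task would then do something like:
#   INPUT=$(pick_input file_list.txt "${SLURM_ARRAY_TASK_ID}")
#   my_command "${INPUT}" > "output.log.${SLURM_ARRAY_TASK_ID}"

# Small self-contained demonstration using a temporary list
list=$(mktemp)
printf '%s\n' sample_a.data sample_b.data sample_c.data > "$list"
pick_input "$list" 2    # prints: sample_b.data
rm -f "$list"
```

With this approach the number of lines in the list needs to match the range of Task IDs you request.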

A more complete, worked example can be found in the [[advanced:slurm|Advanced Slurm Job Optimisation]] article.

----

[[started:index|Back to Getting Started]]