There are many ways to capture output from running Slurm tasks. Normally all output, regardless of the number of parallel tasks, is captured by the main Slurm output file.
Below are three options for altering your Slurm job script to capture srun output on a per-parallel-task basis.
Each task that is started by srun is given its own log file to capture any output it produces. The log file is named mytask.$HOSTNAME.$SLURM_ARRAY_TASK_ID.log, where $HOSTNAME is the name of the node on which Slurm started the job, and $SLURM_ARRAY_TASK_ID is the index of the parallel (job array) task that Slurm has started.
e.g. mytask.node044.1.log
rm -f mytask.*.log
srun COMMAND >mytask.${HOSTNAME}.${SLURM_ARRAY_TASK_ID}.log
Any errors produced by the srun task will still be captured by the main Slurm output file.
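The naming scheme and the stdout-only redirection described above can be tried outside Slurm with a minimal sketch. The `run_task` function is a hypothetical stand-in for `srun COMMAND`, and the fallback values and scratch directory are assumptions made so that the example runs on any machine; none of them are part of the original job script.

```shell
#!/bin/bash
# Fallbacks so the sketch also runs outside a Slurm job (assumption).
host="${HOSTNAME:-$(uname -n)}"
task_id="${SLURM_ARRAY_TASK_ID:-0}"

# Scratch directory so the demonstration leaves the working directory untouched.
scratch=$(mktemp -d)
logfile="${scratch}/mytask.${host}.${task_id}.log"

# Hypothetical stand-in for "srun COMMAND": one line on stdout, one on stderr.
run_task() {
    echo "hello from task ${task_id}"
    echo "a warning" >&2
}

# Only stdout is redirected. stderr is left alone, so under sbatch it would
# end up in the main Slurm output file.
run_task >"$logfile"

echo "stdout captured in: $logfile"
```

In a real job script the `run_task` line would be replaced by the `srun` command, and the log would normally be written to the job's working directory rather than a scratch directory.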
Alternatively, any error output produced by the srun task can be captured in the same log file as normal output, i.e. the file named above.
rm -f mytask.*.log
srun COMMAND >mytask.${HOSTNAME}.${SLURM_ARRAY_TASK_ID}.log 2>&1
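The shell processes redirections from left to right, so for both streams to reach the log file, stdout must be pointed at the file before stderr is duplicated onto it with `2>&1`. The sketch below illustrates the difference; `emit` is a hypothetical stand-in for an srun task, and the scratch directory is an assumption made for the demonstration.

```shell
#!/bin/bash
# Hypothetical stand-in for an srun task: one line on stdout, one on stderr.
emit() {
    echo "normal output"
    echo "error output" >&2
}

tmpdir=$(mktemp -d)

# stdout is pointed at the file first, then stderr is duplicated onto it:
# both lines land in both.log.
emit >"$tmpdir/both.log" 2>&1

# Reversed order: stderr is duplicated onto the original stdout first,
# so only the normal output reaches the file.
emit 2>&1 >"$tmpdir/stdout_only.log"

# Count the lines that actually reached each file.
both_lines=$(grep -c '' "$tmpdir/both.log")
only_lines=$(grep -c '' "$tmpdir/stdout_only.log")
echo "both.log: $both_lines line(s), stdout_only.log: $only_lines line(s)"
```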
As a third option, any errors produced by the srun task can be captured in a separate log file of the same name, but with the .err suffix.
e.g. mytask.node044.1.err
rm -f mytask.*.log
rm -f mytask.*.err
srun COMMAND >mytask.${HOSTNAME}.${SLURM_ARRAY_TASK_ID}.log 2>mytask.${HOSTNAME}.${SLURM_ARRAY_TASK_ID}.err
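Splitting the two streams into separate files can be sketched in the same way. Here `run_task` is again a hypothetical stand-in for `srun COMMAND`, and the fallback values and scratch directory are assumptions so that the example runs outside Slurm.

```shell
#!/bin/bash
# Hypothetical stand-in for "srun COMMAND".
run_task() {
    echo "result"
    echo "problem" >&2
}

# Scratch directory and fallbacks are assumptions for running outside Slurm.
scratch=$(mktemp -d)
base="${scratch}/mytask.${HOSTNAME:-$(uname -n)}.${SLURM_ARRAY_TASK_ID:-0}"

# stdout goes to the .log file, stderr to the .err file of the same name.
run_task >"${base}.log" 2>"${base}.err"

# A non-empty .err file is a quick signal that the task reported problems.
if [ -s "${base}.err" ]; then
    echo "task wrote errors to ${base}.err"
fi
```

Keeping errors in their own file makes it easy to check a whole batch of tasks by looking only for non-empty .err files.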