There are many ways to capture output from running Slurm tasks. Normally all output, regardless of the number of parallel tasks, is captured by the main Slurm output file.
Below are three additional options for altering your Slurm job script to capture srun output on a per-task basis.
Each task that is started by srun is given its own log file to capture any output it produces. The name of the log file will be mytask.$HOSTNAME.$SLURM_ARRAY_TASK_ID.log, where $HOSTNAME is the name of the node that Slurm started the job on, and $SLURM_ARRAY_TASK_ID is the job array index of the task which Slurm has started.
e.g. mytask.node044.1.log
rm -f mytask.*.log
srun COMMAND >mytask.${HOSTNAME}.${SLURM_ARRAY_TASK_ID}.log
Any errors produced by the srun task will still be captured by the main Slurm output file.
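The behaviour described above can be tried without a scheduler. The sketch below is illustrative only: fake_task is a hypothetical stand-in for `srun COMMAND`, and the `:-` fallbacks are an assumption added here so the file name stays well formed when a variable is unset (for example, $SLURM_ARRAY_TASK_ID outside a job array).

```shell
# fake_task is a hypothetical stand-in for `srun COMMAND`:
# it writes one line to stdout and one line to stderr.
fake_task() {
    echo "result from task"
    echo "warning from task" >&2
}

# Fallback values (an assumption, not part of the original script) keep
# the name well formed when a variable is unset.
log="mytask.${HOSTNAME:-unknown}.${SLURM_ARRAY_TASK_ID:-0}.log"

rm -f "$log"

# Only stdout is redirected; stderr still goes wherever it went before
# (in a real job, the main Slurm output file).
fake_task >"$log"

# Show what ended up in the per-task log.
cat "$log"
```

In a real job script, fake_task would be replaced by the srun line; the log then holds normal output only.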
Each task that is started by srun is given its own log file to capture any output it produces. The name of the log file will be mytask.$HOSTNAME.$SLURM_ARRAY_TASK_ID.log, where $HOSTNAME is the name of the node that Slurm started the job on, and $SLURM_ARRAY_TASK_ID is the job array index of the task which Slurm has started.
e.g. mytask.node044.1.log
In addition, any error output produced by the srun task will also be captured in the same log file as normal output, i.e. the file named above.
rm -f mytask.*.log
srun COMMAND >mytask.${HOSTNAME}.${SLURM_ARRAY_TASK_ID}.log 2>&1
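Because the shell processes redirections from left to right, the position of `2>&1` decides whether errors actually reach the log file. The following is a minimal sketch of that effect, not a Slurm script: fake_task is a hypothetical stand-in for `srun COMMAND`, and the two file names are made up for the comparison.

```shell
# fake_task is a hypothetical stand-in for `srun COMMAND`.
fake_task() {
    echo "result from task"
    echo "warning from task" >&2
}

rm -f wrong_order.log right_order.log

# 2>&1 first: stderr is pointed at the *current* stdout, and only then is
# stdout moved to the file, so the warning never reaches the file.
fake_task 2>&1 >wrong_order.log

# File first: stdout is moved to the file, then stderr is sent to the
# same place, so both lines land in the file.
fake_task >right_order.log 2>&1

wc -l <wrong_order.log   # one line: the result only
wc -l <right_order.log   # two lines: the result and the warning
```

To collect normal and error output in one file, put the file redirection first and `2>&1` last.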
Each task that is started by srun is given its own log file to capture any output it produces. The name of the log file will be mytask.$HOSTNAME.$SLURM_ARRAY_TASK_ID.log, where $HOSTNAME is the name of the node that Slurm started the job on, and $SLURM_ARRAY_TASK_ID is the job array index of the task which Slurm has started.
e.g. mytask.node044.1.log
Any errors produced by the srun task will be captured in a log file of the same name, but with the .err suffix.
e.g. mytask.node044.1.err
rm -f mytask.*.log
rm -f mytask.*.err
srun COMMAND >mytask.${HOSTNAME}.${SLURM_ARRAY_TASK_ID}.log 2>mytask.${HOSTNAME}.${SLURM_ARRAY_TASK_ID}.err
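One practical benefit of a separate .err file is that a non-empty file signals a problem at a glance. The sketch below illustrates this under stated assumptions: fake_task is a hypothetical stand-in for `srun COMMAND`, and the `:-` fallbacks are added only so the example also runs outside a Slurm job array.

```shell
# fake_task is a hypothetical stand-in for `srun COMMAND`.
fake_task() {
    echo "result from task"
    echo "warning from task" >&2
}

# Shared base name; the fallbacks are an assumption for running this
# outside Slurm, where SLURM_ARRAY_TASK_ID is unset.
base="mytask.${HOSTNAME:-unknown}.${SLURM_ARRAY_TASK_ID:-0}"

rm -f "$base.log" "$base.err"

# stdout goes to the .log file, stderr to the .err file.
fake_task >"$base.log" 2>"$base.err"

# -s is true when the file exists and is not empty.
if [ -s "$base.err" ]; then
    echo "task reported errors, see $base.err"
fi
```

The same `-s` check could be run over all mytask.*.err files after a job finishes to find the tasks that need attention.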