
Bioapps Container

This container collects most of the bioinformatics software packages commonly used on Rocket and Comet and provides them in a single, easy-to-use format, without the complexity of many module load and module unload commands.



Why A Container?

With almost limitless combinations of bioinformatics tools that can be used together it is very difficult to ensure that any given set of software modules which have been provisioned on Comet can be used alongside any other set of modules.

Multiple versions of Python, C compilers and runtimes, and their dependencies mean that almost every unique set of software intended to be used together needs to be validated and tested to make sure that no modules conflict. For example, bwa may need version X of a runtime while samtools requires version Y of the same runtime, which makes it impossible (or at least strongly inadvisable, given the unpredictable behaviour in such a scenario) to use both tools in the same pipeline. As more modules are added, the number of software and version combinations to test grows combinatorially, a problem which some of our users have already experienced.

By building all of the common bioinformatics tools in one container, with one C compiler, one version of Python, and one set of dependencies, we can guarantee that the tools in this set work together without conflicting with each other, and so avoid the unknown side effects of such version conflicts.

It also means we can use one set of tools on a local workstation or HPC without changing our workflow.


Running on Comet

The Bioapps container is stored in the /nobackup/shared/containers directory and is accessible to all users of Comet. You do not need to take a copy of the container file; it should be left in its original location.

You can find the container files in the /nobackup/shared/containers directory.

We normally recommend using the most recent version of the container (the one with the latest date in its file name).

Container Image Versions

We may reference a specific container file, such as bioapps.2026.03.sif, but you should always check whether this is the most recent version of the container available. Simply ls the /nobackup/shared/containers directory and you will be able to see if there are any newer versions listed.
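The check described above can be scripted. The following is a minimal sketch, assuming the images keep the bioapps.YYYY.MM.sif naming shown on this page; the function name latest_bioapps is hypothetical and is not part of the provided scripts.

```shell
# Hypothetical helper (not part of the provided scripts): print the newest
# bioapps image found in a directory. File names of the form
# bioapps.YYYY.MM.sif sort chronologically when sorted as plain text.
latest_bioapps() {
    dir="${1:-/nobackup/shared/containers}"
    for f in "$dir"/bioapps.*.sif; do
        # Skip the unexpanded pattern when no image matches
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
        fi
    done | sort | tail -n 1
}
```

Calling latest_bioapps with no argument looks in the shared container directory; it prints nothing if no image is found.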

We have provided a convenience script that reduces all of the steps needed to run applications inside the container, including access to your $HOME, /scratch and /nobackup directories, to just two simple commands.

There is a corresponding .sh script for each version of the container image we make available.

Just source this file and it will take care of loading apptainer, setting up your bind directories and calling the exec command for you. It gives you a single command called container.run (instead of the really long apptainer exec command) to run anything you want inside the container. For example, to run bowtie2:

$ source /nobackup/shared/containers/bioapps.2026.03.sh
$ container.run bowtie2 -U /nobackup/proj/my_project/fastq_data/file1.fq

You can continue to use the container.run command as many times as you need in the same script or same bash session:

$ source /nobackup/shared/containers/bioapps.2026.03.sh
$ container.run bowtie2 -U /nobackup/proj/my_project/fastq_data/file1.fq
$ container.run samtools
$ container.run bwameth
$ container.run hisat2 --version
$ container.run python3
$ container.run R

We strongly recommend that you use this helper script and the container.run command to run software from inside the container, as it ensures that the bind directories are set up correctly and that you are using the correct container version.
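For readers curious about what the helper saves you from typing, the sketch below prints the kind of apptainer exec command line that such a helper might assemble. It is an illustration only: the function name container_cmd and the exact bind list are assumptions, and the real helper script may be implemented differently.

```shell
# Illustrative sketch only; the real helper script may differ.
# IMAGE is the container file referenced elsewhere on this page.
IMAGE="/nobackup/shared/containers/bioapps.2026.03.sif"

# Print (rather than run) an apptainer command line for the given tool.
# --bind accepts a comma-separated list of host paths to expose in the container.
container_cmd() {
    printf '%s\n' "apptainer exec --bind $HOME,/scratch,/nobackup $IMAGE $*"
}

container_cmd samtools --version
```

Printing the command instead of executing it keeps the sketch safe to try on any machine, including one without apptainer installed.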


Individual Software Help

All of the following software packages are available within the Bioapps container. Please check the Included from column to see which version of the container first included each package.

Title Included from Files Source Link Description
bamutil 2026.02+ bam https://github.com/statgen/bamUtil/
bcftools 2026.02+ bcftools https://github.com/samtools/bcftools The plugins (e.g. counts.so, contrast.so, prune.so, etc) for bcftools are installed under /opt/libexec/bcftools and will be used automatically.
bcl_convert (1) 2026.02+ bcl-convert https://emea.support.illumina.com/sequencing/sequencing_software/bcl-convert/downloads.html Note that the use of bcl-convert is subject to the following licensing restrictions:

- The software can only be used “… for the purpose of processing and analyzing data generated from an Illumina genetic sequencing instrument owned and operated solely by (the University)”

- The software is only to be used for research purposes

- The software can only be used with data generated from the Illumina instrument, and not any data generated from other sources
bowtie2 2026.02+ bowtie2, bowtie2-align-l, bowtie2-align-s, bowtie2-build-l, bowtie2-build-s, bowtie2-inspect-l, bowtie2-inspect-s https://github.com/BenLangmead/bowtie2
bwa 2026.02+ bwa, qualfa2fq.pl, xa2multi.pl https://github.com/lh3/bwa
bwa-mem2 2026.02+ bwa-mem2, bwa-mem2.avx, bwa-mem2.avx2, bwa-mem2.avx512bw, bwa-mem2.sse41, bwa-mem2.sse42 https://github.com/bwa-mem2/bwa-mem2 Calling bwa-mem2 will automatically select the best-suited variant for the CPU (e.g. avx, avx2 etc).
bwa-meth 2026.02+ bwameth.py, bwameth https://github.com/brentp/bwa-meth The file bwameth is provided as a symbolic link to bwameth.py; you do not need to call it via Python, just bwameth is enough.
samtools 2026.02+ samtools https://github.com/samtools/samtools
sambamba 2026.02+ sambamba https://github.com/biod/sambamba
seqkit 2026.02+ seqkit https://github.com/shenwei356/seqkit
methyldackel 2026.02+ MethylDackel, methyldackel https://github.com/dpryan79/MethylDackel The file methyldackel is provided as a symbolic link to MethylDackel for convenience / simplification of capitalisation.
minimap2 2026.02+ minimap2 https://github.com/lh3/minimap2
bedtools2 2026.02+ annotateBed, bamToBed, bamToFastq, bed12ToBed6, bedToBam, bedToIgv, bedpeToBam, bedtools, closestBed, clusterBed, complementBed, coverageBed, expandCols, fastaFromBed, flankBed, genomeCoverageBed, getOverlap, groupBy, intersectBed, linksBed, mapBed, maskFastaFromBed, mergeBed, multiBamCov, multiIntersectBed, nucBed, pairToBed, pairToPair, randomBed, shiftBed, shuffleBed, slopBed, sortBed, subtractBed, tagBam, unionBedGraphs, windowBed, windowMaker https://github.com/arq5x/bedtools2
bam-readcount 2026.02+ bam-readcount https://github.com/genome/bam-readcount
hisat2 2026.02+ hisat2, hisat2-align-l, hisat2-align-s, hisat2-inspect, hisat2-inspect-l, hisat2-inspect-s, hisat2-repeat, hisat2_extract_exons.py, hisat2_extract_snps_haplotypes_UCSC.py, hisat2_extract_snps_haplotypes_VCF.py, hisat2_extract_splice_sites.py, hisat2_read_statistics.py, hisat2_simulate_reads.py, extract_exons.py, extract_splice_sites.py https://cloud.biohpc.swmed.edu/index.php
stringtie 2026.02+ stringtie, prepDE.py http://ccb.jhu.edu/software/stringtie/dl
gffcompare 2026.02+ gffcompare, trmap http://ccb.jhu.edu/software/stringtie/dl
htseq-count 2026.02+ htseq-count, htseq-count-barcodes, htseq-qa https://pypi.org/project/HTSeq/ The HTSeq installer places all of the files under /usr/local/bin, but this is also added to the $PATH.
picard 2026.02+ picard.jar, shell alias picard https://github.com/broadinstitute/picard/releases/download A shell alias, alias picard="java -jar /opt/bin/picard.jar", is provided so that you can run Picard with the single command picard.
seqan-library 2026.02+ alf, bam2roi, dfi, fx_bam_coverage, fx_fastq_stats, gustaf, gustaf_mate_joining, insegt, mason_frag_sequencing, mason_genome, mason_materializer, mason_methylation, mason_simulator, mason_splicing, mason_tests, mason_variator, micro_razers, pair_align, param_chooser, rabema_build_gold_standard, rabema_do_search, rabema_evaluate, rabema_prepare_sam, razers, razers3, razers3_quality2prob, razers3_simulate_reads, rep_sep, roi_feature_projection, roi_plot_thumbnails, s4_join, s4_search, sak, sam2matrix, samcat, seqan_tcoffee, seqcons2, sgip, splazers, stellar, tree_recon, yara_indexer, yara_mapper https://github.com/seqan/seqan These are the sample tools provided with the seqan-library installation. We have not installed all of the test_ and demo_ files.
regtools 2026.02+ regtools https://github.com/griffithlab/regtools
rseqc 2026.02+ All scripts as listed here https://rseqc.sourceforge.net The RSeQC installer places all files in /usr/local/bin by default - this is added to $PATH so you should still be able to call bam2fq.py without giving the full path, for example.
Python 2026.02+ python3 Currently Python 3.12
R 2026.03+ R, Rscript Currently version 4.5.2
GCC 2026.02+ gcc-14, g++-14, gfortran-14 Please use the existing CFLAGS, CXXFLAGS and CPPFLAGS environment variables which were set during the installation of the container; this will ensure the most appropriate performance optimisation flags are retained for any additional software you compile.

Note: All binaries are compiled for the AMD Epyc CPU architecture of Comet using GCC 14 with CFLAGS="-O3 -march=znver5 -pipe", in addition to any optimisation flags already set by each application. All binaries are also stripped of debugging symbols with strip -g to reduce their on-disk and in-memory size.

(1) - bcl-convert is a vendor provided binary (proprietary to Illumina) and unlike all other listed software has not been recompiled for Comet.
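The Included from values in the table above can be compared against the version in a container file name. The sketch below is one way to do that, assuming the bioapps.YYYY.MM.sif naming used on this page; the function names image_version and version_at_least are hypothetical, and the trailing + of a table entry such as 2026.03+ should be dropped before comparing.

```shell
# Hypothetical helpers for illustration; not part of the provided scripts.

# Extract "2026.03" from a file name such as bioapps.2026.03.sif.
image_version() {
    v="${1##*/}"        # drop any leading directory
    v="${v#bioapps.}"   # drop the file name prefix
    printf '%s\n' "${v%.sif}"
}

# Succeed if image version $1 is the same as, or newer than, the
# "Included from" version $2. YYYY.MM strings are zero padded, so a
# plain text sort orders them by date.
version_at_least() {
    oldest=$(printf '%s\n%s\n' "$1" "$2" | sort | head -n 1)
    [ "$oldest" = "$2" ]
}
```

For example, R is listed as 2026.03+, so version_at_least 2026.02 2026.03 fails, indicating that a 2026.02 image would not be expected to provide R.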


Python Modules

In addition to the standalone applications listed above, the following Python modules are installed (i.e. they are available if you use python3 installed from the container with a normal import module syntax in your code):

This list only includes modules which have been explicitly installed. The Python standard library modules are still available: sqlite3, json, etc.


R Libraries

In addition to the standalone applications listed above, the following R libraries are installed (i.e. they are available to use in the R and Rscript commands started from the container, and by using the normal library(module) syntax in your code):

Library Name Available From Link
AnnotationDbi 2026.03+ https://bioconductor.org/packages/release/bioc/html/AnnotationDbi.html
BH 2026.03+ https://cran.r-project.org/web/packages/BH/index.html
Biobase 2026.03+ https://bioconductor.org/packages/release/bioc/html/Biobase.html
BiocFileCache 2026.03+ https://bioconductor.org/packages/release/bioc/html/BiocFileCache.html
BiocGenerics 2026.03+ https://bioconductor.org/packages/release/bioc/html/BiocGenerics.html
BiocManager 2026.03+ https://github.com/Bioconductor/BiocManager
BiocParallel 2026.03+ https://www.bioconductor.org/packages/release/bioc/html/BiocParallel.html
BiocVersion 2026.03+ https://www.bioconductor.org/packages/release/bioc/html/BiocVersion.html
Biostrings 2026.03+ https://www.bioconductor.org/packages/release/bioc/html/Biostrings.html
DBI 2026.03+ https://cran.r-project.org/web/packages/DBI/index.html
DESeq2 2026.03+ https://bioconductor.org/packages/release/bioc/html/DESeq2.html
DEXSeq 2026.03+ https://bioconductor.org/packages/release/bioc/html/DEXSeq.html
DelayedArray 2026.03+ https://bioconductor.org/packages/release/bioc/html/DelayedArray.html
GenomicRanges 2026.03+ https://bioconductor.org/packages/release/bioc/html/GenomicRanges.html
IRanges 2026.03+ https://bioconductor.org/packages/release/bioc/html/IRanges.html
KEGGREST 2026.03+ https://www.bioconductor.org/packages/release/bioc/html/KEGGREST.html
MatrixGenerics 2026.03+ https://bioconductor.org/packages/devel/bioc/html/MatrixGenerics.html
R6 2026.03+ https://cran.r-project.org/web/packages/R6/index.html
RColorBrewer 2026.03+ https://cran.r-project.org/web/packages/RColorBrewer/index.html
RSQLite 2026.03+ https://cran.r-project.org/web/packages/RSQLite/index.html
Rcpp 2026.03+ https://cran.r-project.org/web/packages/Rcpp/index.html
RcppArmadillo 2026.03+ https://cran.r-project.org/web/packages/RcppArmadillo/index.html
Rhtslib 2026.03+ https://bioconductor.org/packages/release/bioc/html/Rhtslib.html
Rsamtools 2026.03+ https://bioconductor.org/packages/release/bioc/html/Rsamtools.html
S4Arrays 2026.03+ https://bioconductor.org/packages/release/bioc/html/S4Arrays.html
S4Vectors 2026.03+ https://bioconductor.org/packages/release/bioc/html/S4Vectors.html
S7 2026.03+ https://cran.r-project.org/web/packages/S7/index.html
Seqinfo 2026.03+ https://bioconductor.org/packages/release/bioc/html/Seqinfo.html
SparseArray 2026.03+ https://bioconductor.org/packages/release/bioc/html/SparseArray.html
SummarizedExperiment 2026.03+ https://bioconductor.org/packages/release/bioc/html/SummarizedExperiment.html
XML 2026.03+ https://cran.r-project.org/web/packages/XML/index.html
XVector 2026.03+ https://bioconductor.org/packages/release/bioc/html/XVector.html
abind 2026.03+ https://cran.r-project.org/web/packages/abind/index.html
annotate 2026.03+ https://www.bioconductor.org/packages/release/bioc/html/annotate.html
askpass 2026.03+ https://cran.r-project.org/web/packages/askpass/index.html
biomaRt 2026.03+ https://bioconductor.org/packages/release/bioc/html/biomaRt.html
bit 2026.03+ https://cran.r-project.org/web/packages/bit/index.html
bit64 2026.03+ https://cran.r-project.org/web/packages/bit64/index.html
bitops 2026.03+ https://cran.r-project.org/web/packages/bitops/index.html
blob 2026.03+ https://cran.r-project.org/web/packages/blob/index.html
cachem 2026.03+ https://cran.r-project.org/web/packages/cachem/index.html
cli 2026.03+ https://cran.r-project.org/web/packages/cli/index.html
cpp11 2026.03+ https://cran.r-project.org/web/packages/cpp11/index.html
crayon 2026.03+ https://cran.r-project.org/web/packages/crayon/index.html
curl 2026.03+ https://cran.r-project.org/web/packages/curl/index.html
dbplyr 2026.03+ https://cran.r-project.org/web/packages/dbplyr/index.html
dplyr 2026.03+ https://cran.r-project.org/web/packages/dplyr/index.html
farver 2026.03+ https://cran.r-project.org/web/packages/farver/index.html
fastmap 2026.03+ https://cran.r-project.org/web/packages/fastmap/index.html
filelock 2026.03+ https://cran.r-project.org/web/packages/filelock/index.html
formatR 2026.03+ https://cran.r-project.org/web/packages/formatR/index.html
futile.logger 2026.03+ https://cran.r-project.org/web/packages/futile.logger/index.html
futile.options 2026.03+ https://cran.r-project.org/web/packages/futile.options/index.html
genefilter 2026.03+ https://www.bioconductor.org/packages/release/bioc/html/genefilter.html
geneplotter 2026.03+ https://www.bioconductor.org/packages/release/bioc/html/geneplotter.html
generics 2026.03+ https://cran.r-project.org/web/packages/generics/index.html
ggplot2 2026.03+ https://cran.r-project.org/web/packages/ggplot2/index.html
glue 2026.03+ https://cran.r-project.org/web/packages/glue/index.html
gtable 2026.03+ https://cran.r-project.org/web/packages/gtable/index.html
hms 2026.03+ https://cran.r-project.org/web/packages/hms/index.html
httr 2026.03+ https://cran.r-project.org/web/packages/httr/index.html
httr2 2026.03+ https://cran.r-project.org/web/packages/httr2/index.html
hwriter 2026.03+ https://cran.r-project.org/web/packages/hwriter/index.html
isoband 2026.03+ https://cran.r-project.org/web/packages/isoband/index.html
jsonlite 2026.03+ https://cran.r-project.org/web/packages/jsonlite/index.html
labeling 2026.03+ https://cran.r-project.org/web/packages/labeling/index.html
lambda.r 2026.03+ https://cran.r-project.org/web/packages/lambda.r/index.html
lifecycle 2026.03+ https://cran.r-project.org/web/packages/lifecycle/index.html
locfit 2026.03+ https://cran.r-project.org/web/packages/locfit/index.html
magrittr 2026.03+ https://cran.r-project.org/web/packages/magrittr/index.html
matrixStats 2026.03+ https://cran.r-project.org/web/packages/matrixStats/index.html
memoise 2026.03+ https://cran.r-project.org/web/packages/memoise/index.html
mime 2026.03+ https://cran.r-project.org/web/packages/mime/index.html
openssl 2026.03+ https://cran.r-project.org/web/packages/openssl/index.html
pillar 2026.03+ https://cran.r-project.org/web/packages/pillar/index.html
pkgconfig 2026.03+ https://cran.r-project.org/web/packages/pkgconfig/index.html
png 2026.03+ https://cran.r-project.org/web/packages/png/index.html
prettyunits 2026.03+ https://cran.r-project.org/web/packages/prettyunits/index.html
progress 2026.03+ https://cran.r-project.org/web/packages/progress/index.html
purrr 2026.03+ https://cran.r-project.org/web/packages/purrr/index.html
rappdirs 2026.03+ https://cran.r-project.org/web/packages/rappdirs/index.html
rlang 2026.03+ https://cran.r-project.org/web/packages/rlang/index.html
scales 2026.03+ https://cran.r-project.org/web/packages/scales/index.html
snow 2026.03+ https://cran.r-project.org/web/packages/snow/index.html
statmod 2026.03+ https://cran.r-project.org/web/packages/statmod/index.html
stringi 2026.03+ https://cran.r-project.org/web/packages/stringi/index.html
stringr 2026.03+ https://cran.r-project.org/web/packages/stringr/index.html
sys 2026.03+ https://cran.r-project.org/web/packages/sys/index.html
tibble 2026.03+ https://cran.r-project.org/web/packages/tibble/index.html
tidyr 2026.03+ https://cran.r-project.org/web/packages/tidyr/index.html
tidyselect 2026.03+ https://cran.r-project.org/web/packages/tidyselect/index.html
utf8 2026.03+ https://cran.r-project.org/web/packages/utf8/index.html
vctrs 2026.03+ https://cran.r-project.org/web/packages/vctrs/index.html
viridisLite 2026.03+ https://cran.r-project.org/web/packages/viridisLite/index.html
withr 2026.03+ https://cran.r-project.org/web/packages/withr/index.html
xml2 2026.03+ https://cran.r-project.org/web/packages/xml2/index.html
xtable 2026.03+ https://cran.r-project.org/web/packages/xtable/index.html

This list only includes the R libraries which have been explicitly installed, or brought in as dependencies by other libraries. The standard R libraries are still available: base, splines, stats, utils, etc.


Building Bioapps Container

Important

This section is only relevant to RSE HPC staff or users wanting to understand how the container image is built. If you are intending to simply use the software you do not need to read this section - turn back now!

Build script:

#!/bin/bash

IMAGE_DATE=$(date +%Y.%m)

echo "Loading modules..."
module load apptainer

echo ""
echo "Building container..."
export APPTAINER_TMPDIR=/scratch

echo ""
echo "Container will have date suffix $IMAGE_DATE"


# You must supply a copy of bcl-convert*.rpm in the
# folder below. If it is not present then the install
# of this tool will be skipped.
SOURCE_DIR=$(pwd)

BCL_RPM="bcl-convert-4.4.6-2.el8.x86_64.rpm"

echo ""
echo "Checking source files..."
if [ -s "$SOURCE_DIR/$BCL_RPM" ]
then
	echo "- Found - $SOURCE_DIR/$BCL_RPM"
else
	echo "- WARNING - $SOURCE_DIR/$BCL_RPM is MISSING"
	echo ""
	echo "Press return to continue or Control+C to exit and fix"
	read	
fi


# Quote the paths so that a source directory containing spaces still works
apptainer build --bind "$SOURCE_DIR:/mnt" "bioapps.$IMAGE_DATE.sif" bioapps.def 2>&1 | tee bioapps.log

Container definition:

Bootstrap: docker
From: ubuntu:noble

####################################################################
#
# Bio apps container
# ==================
# This is a runtime environment for a large set of bioinformatics tools.
# Please see: 
#	https://hpc.researchcomputing.ncl.ac.uk/dokuwiki/dokuwiki/doku.php?id=advanced:software:bioapps
#
# ======================================
#
# NAME : WORKING
#	LINK
#
# ======================================
# bamutil : Yes
#	https://github.com/statgen/bamUtil/
#
# bcftools :
#	https://github.com/samtools/bcftools
#
# bowtie2 : Yes
#	https://github.com/BenLangmead/bowtie2
#
# Bwa : Yes
#	https://github.com/lh3/bwa
#
# Bwa-mem2 : Yes
#	https://github.com/bwa-mem2/bwa-mem2/releases/tag/v2.3
#
# Bwa-meth : Yes
#	https://github.com/brentp/bwa-meth
#
# Samtools : Yes
#	https://github.com/samtools/samtools
#
# Sambamba : Yes
#	https://github.com/biod/sambamba
#
# Seqkit : Yes
#	https://github.com/shenwei356/seqkit
#
# methyldackel : Yes
#	https://github.com/dpryan79/MethylDackel
#
# minimap : Yes
#	https://github.com/lh3/minimap2
#
# bedtools2 : Yes
#	https://github.com/arq5x/bedtools2/archive/refs/heads/master.zip
#
# bam-readcount : Yes
#	https://github.com/genome/bam-readcount
#
# hisat2 : Yes
#	https://cloud.biohpc.swmed.edu/index.php/s/fE9QCsX3NH4QwBi/download
#
# StringTie : Yes
#	http://ccb.jhu.edu/software/stringtie/dl
#
# gffcompare : Yes
#	http://ccb.jhu.edu/software/stringtie/dl
#
# htseq-count : Yes
#	https://pypi.python.org/packages/source/H/HTSeq
#
# picard : Yes
#	https://github.com/broadinstitute/picard/releases/download
#
# seqan-library : Yes
#	https://github.com/seqan/seqan
#
# regtools : Yes
#	https://github.com/griffithlab/regtools
#
# RSeQC : Yes
#	https://rseqc.sourceforge.net/#download-rseqc
#	
####################################################################

%post
    # Prevent interactive prompts
    export DEBIAN_FRONTEND=noninteractive

####################################################################
#
# Basic system packages
#
####################################################################

    # Update & install only necessary packages
    apt-get update
	apt-get install -y apt-utils wget autoconf cmake rpm2cpio cpio build-essential man-db tar unzip git aptitude golang-go python3-pip gcc-14 g++-14 gfortran-14 openmpi-bin openmpi-common libopenmpi-dev libgomp1 vim libhts-dev libncurses-dev libbz2-dev liblz4-dev openjdk-25-jre libbigwig-dev libgsl-dev libxml2-dev libssl-dev libpng-dev liblapack-dev
	ln -s /usr/bin/python3 /usr/bin/python
	
    # Clean up APT cache to save space
    apt-get clean 

	# Any Python modules installed via pip go here
	# pip install NAME --break-system-packages
	
	# Remove any Python cache files after pip
	pip3 cache purge

#################################################################################
#
# This is all the custom stuff needed to build the various bioinformatics tools
#
#################################################################################

	# This flag needs to be set to indicate which CPU architecture we
	# are optimising for.
	AMD_ARCH=1

	if [ "$AMD_ARCH" = "1" ]
	then
		# Compiling on AMD Epyc
		export BASE_CFLAGS="-O3 -march=znver5 -pipe"
		export BASE_CFLAGS_ALT="-O3 -march=native -pipe"
		export MAKE_JOBS=8
	else
		# Compiling on generic system
		export BASE_CFLAGS="-O"
		export BASE_CFLAGS_ALT="-O"
		export MAKE_JOBS=8
	fi
	
	export CPPFLAGS=""
	export CFLAGS="$BASE_CFLAGS -I/opt/include"
	export CFLAGS_ALT="$BASE_CFLAGS_ALT -I/opt/include"
	export CXXFLAGS="$CFLAGS"
	export CC=gcc-14
	export CXX=g++-14
	export FC=gfortran-14
	export LD_LIBRARY_PATH=/opt/lib:$LD_LIBRARY_PATH
	export PATH=/opt/bin:$PATH

	###############################################################################
	# Tell R to use the newer version of GCC when it needs to compile.
	# R 'helpfully' ignores standard CC/CFLAG/etc environment variables and
	# uses its own mechanism for setting the C/C++ and optimisation flags to
	# use. Override those by writing /root/.R/Makevars instead.
	###############################################################################
	mkdir -p /root/.R/
	echo "CC=$CC" > /root/.R/Makevars
	echo "CXX=$CXX" >> /root/.R/Makevars
	echo "CFLAGS=$CFLAGS" >> /root/.R/Makevars
	echo "CXXFLAGS=$CFLAGS" >> /root/.R/Makevars
	echo "CMAKE_C_COMPILER=$CC" >> /root/.R/Makevars
	echo "CMAKE_CXX_COMPILER=$CXX" >> /root/.R/Makevars
	echo "F77=$FC" >> /root/.R/Makevars

	echo ""
	echo "Post-OS-install setup for Bio apps container"
	echo "============================================"

	# A download place for external libraries
	mkdir -p /src/zipped
	
	# Where installations go
	mkdir -p /opt/bin
	mkdir -p /opt/include
	mkdir -p /opt/lib
	mkdir -p /opt/man
	
	echo ""
	echo "0a. Install latest R"
	echo "==================="
	apt-get install -y --no-install-recommends software-properties-common dirmngr
	wget -qO- https://cloud.r-project.org/bin/linux/ubuntu/marutter_pubkey.asc | tee -a /etc/apt/trusted.gpg.d/cran_ubuntu_key.asc
	add-apt-repository "deb https://cloud.r-project.org/bin/linux/ubuntu $(lsb_release -cs)-cran40/"
	apt-get install -y --no-install-recommends r-base

	echo ""
	echo "0b. Install R modules"
	echo "====================="

	# Install BioConductor
	Rscript -e 'install.packages("BiocManager", repos="https://cloud.r-project.org")'
	Rscript -e 'BiocManager::install(version = "3.22")'
	# Install DEXSeq
	Rscript -e 'BiocManager::install("DEXSeq")'	

	echo ""
	echo "1. Download / install bwa"
	echo "========================="
	cd /src/zipped
	wget -q https://github.com/lh3/bwa/archive/refs/tags/v0.7.19.tar.gz -O bwa-v0.7.19.tar.gz
	cd /src
	tar -zxf zipped/bwa-v0.7.19.tar.gz
	cd bwa-0.7.19/
	cp Makefile Makefile.old
	
	# Strip out hardcoded CC and CFLAGS to use our own
	cat Makefile.old | grep -v "^CC=" | grep -v "^CFLAGS=" > Makefile
	
	make -j$MAKE_JOBS
	strip -g bwa
	cp -v bwa /opt/bin
	cp -v qualfa2fq.pl /opt/bin
	cp -v xa2multi.pl /opt/bin
	cp -v bwa.1 /opt/man
	
	echo ""
	echo "2. Download / install bwa-mem2"
	echo "=============================="
	cd /src/zipped
	wget -q https://github.com/bwa-mem2/bwa-mem2/releases/download/v2.3/Source_code_including_submodules.tar.gz -O bwa-mem2-v2.3.tar.gz
	cd /src
	tar -zxf zipped/bwa-mem2-v2.3.tar.gz
	cd bwa-mem2-2.3
	
	# bwa-mem2 Patch 1
	cp ext/safestringlib/safeclib/abort_handler_s.c ext/safestringlib/safeclib/abort_handler_s.c.old
	cat ext/safestringlib/safeclib/abort_handler_s.c.old | \
		sed 's/#include "safeclib_private.h"/#include <stdlib.h>\n#include "safeclib_private.h"/g' > ext/safestringlib/safeclib/abort_handler_s.c
		
	# bwa-mem2 Patch 2
	cp ext/safestringlib/safeclib/strcasecmp_s.c ext/safestringlib/safeclib/strcasecmp_s.c.old
	cat ext/safestringlib/safeclib/strcasecmp_s.c.old | \
		sed 's/#include "safeclib_private.h"/#include <ctype.h>\n#include "safeclib_private.h"/g' > ext/safestringlib/safeclib/strcasecmp_s.c
	
	# bwa-mem2 Patch 3
	cp ext/safestringlib/safeclib/strcasestr_s.c ext/safestringlib/safeclib/strcasestr_s.c.old
	cat ext/safestringlib/safeclib/strcasestr_s.c.old | \
		sed 's/#include "safeclib_private.h"/#include <ctype.h>\n#include "safeclib_private.h"/g' > ext/safestringlib/safeclib/strcasestr_s.c
	
	make -j$MAKE_JOBS
	strip -g bwa-mem2*
	cp -v bwa-mem2* /opt/bin
		
	echo ""
	echo "3a. Download / install bwa-meth - toolshed"
	echo "=========================================="
	cd /src/zipped
	wget -q https://pypi.python.org/packages/source/t/toolshed/toolshed-0.4.0.tar.gz -O toolshed-0.4.0.tar.gz
	cd /src
	tar -zxf zipped/toolshed-0.4.0.tar.gz
	cd toolshed-0.4.0
	python setup.py install
	
	echo ""
	echo "3b. Download / install bwa-meth"
	echo "=========================================="
	cd /src/zipped
	wget -q https://github.com/brentp/bwa-meth/archive/master.zip -O bwa-meth.zip
	cd /src
	unzip zipped/bwa-meth.zip
	cd bwa-meth-master
	cp -v bwameth.py /opt/bin
	ln -sv /opt/bin/bwameth.py /opt/bin/bwameth
	
	echo ""
	echo "4. Download / install samtools"
	echo "=============================="
	cd /src/zipped
	wget -q https://github.com/samtools/samtools/releases/download/1.23/samtools-1.23.tar.bz2 -O samtools-1.23.tar.bz2
	cd /src
	tar -jxf zipped/samtools-1.23.tar.bz2
	cd samtools-1.23
	./configure --prefix=/opt
	make -j$MAKE_JOBS
	make install
	strip -g /opt/bin/samtools
	
	echo ""
	echo "5a. Download / install sambamba - ldc"
	echo "====================================="
	cd /src/zipped
	wget -q https://github.com/ldc-developers/ldc/releases/download/v1.42.0-beta3/ldc2-1.42.0-beta3-linux-x86_64.tar.xz -O ldc2-1.42.0-beta3-linux-x86_64.tar.xz
	cd /src
	tar -xf zipped/ldc2-1.42.0-beta3-linux-x86_64.tar.xz
	
	# Temporarily add ldc2 to the path
	PATH=/src/ldc2-1.42.0-beta3-linux-x86_64/bin:$PATH
	LIBRARY_PATH=/src/ldc2-1.42.0-beta3-linux-x86_64/lib
	
	echo ""
	echo "5b. Download / install sambamba"
	echo "==============================="
	cd /src/zipped
	wget -q https://github.com/biod/sambamba/archive/refs/heads/master.zip -O sambamba-master.zip
	cd /src
	unzip zipped/sambamba-master.zip
	cd sambamba-master
	CC=gcc-14 make release
	strip -g bin/sambamba-1.0.1
	cp -v bin/sambamba-1.0.1 /opt/bin/sambamba
	
	echo ""
	echo "6. Download / install seqkit"
	echo "============================"
	cd /src/zipped
	wget -q https://github.com/shenwei356/seqkit/archive/refs/tags/v2.12.0.tar.gz -O seqkit-v2.12.0.tar.gz
	cd /src
	tar -zxf zipped/seqkit-v2.12.0.tar.gz
	cd seqkit-2.12.0/seqkit
	go build
	strip -g seqkit
	cp -v seqkit /opt/bin
	
	echo ""
	echo "6b. Download / install methyldackel"
	echo "==================================="
	cd /src/zipped
	wget -q https://github.com/dpryan79/MethylDackel/archive/refs/tags/0.6.1.tar.gz -O methyldackel-0.6.1.tar.gz
	cd /src
	tar -zxf zipped/methyldackel-0.6.1.tar.gz
	cd MethylDackel-0.6.1/
	make -j$MAKE_JOBS LIBBIGWIG=/lib/x86_64-linux-gnu/libBigWig.a
	strip -g MethylDackel
	cp -v MethylDackel /opt/bin
	ln -s /opt/bin/MethylDackel /opt/bin/methyldackel
	
	echo ""
	echo "7. Download / install minimap2"
	echo "=============================="
	cd /src/zipped
	wget -q https://github.com/lh3/minimap2/archive/refs/tags/v2.30.tar.gz -O minimap2-v2.30.tar.gz
	cd /src
	tar -zxf zipped/minimap2-v2.30.tar.gz
	cd minimap2-2.30
	cp Makefile Makefile.old
	# Strip out hardcoded CFLAGS to use our own
	cat Makefile.old | grep -v "^CFLAGS=" > Makefile
	make -j$MAKE_JOBS
	strip -g minimap2
	cp -v minimap2 /opt/bin
	
	echo ""
	echo "8. Download / install bedtools2"
	echo "==============================="
	cd /src/zipped
	wget -q https://github.com/arq5x/bedtools2/archive/refs/heads/master.zip -O bedtools2-master.zip
	cd /src
	unzip zipped/bedtools2-master.zip
	cd bedtools2-master
	cp Makefile Makefile.old
	# Strip out hardcoded compiler name to use our own
	cat Makefile.old | sed 's/= g++/= g++-14/g' > Makefile
	make -j$MAKE_JOBS
	strip -g bin/bedtools
	cp -v bin/* /opt/bin
	
	echo ""
	echo "9. Install bam-readcount"
	echo "========================"
	cd /src/zipped
	wget -q https://github.com/genome/bam-readcount/archive/refs/heads/master.zip -O bam-readcount-master.zip
	cd /src
	unzip zipped/bam-readcount-master.zip
	cd bam-readcount-master
	mkdir build
	cd build
	cmake ..
	# This does not like parallel builds - it ends up out of sequence...
	make
	strip -g bin/bam-readcount
	cp -v bin/bam-readcount /opt/bin
	
	echo ""
	echo "10. Install hisat2"
	echo "=================="
	cd /src/zipped
	wget -q https://cloud.biohpc.swmed.edu/index.php/s/fE9QCsX3NH4QwBi/download -O hisat2-2.2.1.zip
	cd /src
	unzip zipped/hisat2-2.2.1.zip
	cd hisat2-2.2.1
	cp Makefile Makefile.old
	# Strip out hardcoded compiler name to use our own
	cat Makefile.old | \
		sed 's/CC = /CC = gcc-14 #/g' | \
		sed 's/CPP = /CPP = g++-14 #/g' | \
		sed 's/RELEASE_FLAGS  =/RELEASE_FLAGS  = $(CFLAGS) /g' > Makefile
	make -j$MAKE_JOBS
	strip -g hisat2-align-l
	strip -g hisat2-align-s
	strip -g hisat2-build-l
	strip -g hisat2-build-s
	strip -g hisat2-inspect-l
	strip -g hisat2-inspect-s
	strip -g hisat2-repeat
	cp -v hisat2 hisat2-align* hisat2-inspect* hisat2-repeat hisat2_*.py extract_*.py /opt/bin
	
	echo ""
	echo "11. Install stringtie"
	echo "======================"
	cd /src/zipped
	wget -q https://ccb.jhu.edu/software/stringtie/dl/stringtie-3.0.3.tar.gz -O stringtie-3.0.3.tar.gz
	cd /src
	tar -zxf zipped/stringtie-3.0.3.tar.gz
	cd stringtie-3.0.3/
	make -j$MAKE_JOBS release
	strip -g stringtie
	cp -v stringtie /opt/bin
	cp -v prepDE.py3 /opt/bin/prepDE.py
	
	echo ""
	echo "12. Install gffcompare"
	echo "======================"
	cd /src/zipped
	wget -q https://ccb.jhu.edu/software/stringtie/dl/gffcompare-0.12.9.tar.gz -O gffcompare-0.12.9.tar.gz
	cd /src
	tar -zxf zipped/gffcompare-0.12.9.tar.gz
	cd gffcompare-0.12.9
	make -j$MAKE_JOBS
	strip -g gffcompare
	strip -g trmap
	cp -v gffcompare /opt/bin
	cp -v trmap /opt/bin
	
	echo ""
	echo "13. Install htseq"
	echo "================="
	pip3 install HTSeq --break-system-packages
	
	echo ""
	echo "14. Install picard"
	echo "=================="
	cd /src/zipped
	wget -q https://github.com/broadinstitute/picard/releases/download/3.4.0/picard.jar -O picard-3.4.0.jar
	cd /src
	cp -v zipped/picard-3.4.0.jar /opt/bin/picard.jar
	# We also set up a "java -jar picard.jar" helper alias
	# via an entry in the post-install %environment section
	
	# Flexbar (steps 15a to 15c) is currently disabled; nothing is built or installed
	#echo ""
	#echo "15a. Install flexbar - seqan"
	#echo "============================"
	#cd /src/zipped
	#wget -q https://github.com/seqan/seqan/archive/refs/tags/seqan-v2.5.2.tar.gz -O seqan-v2.5.2.tar.gz 
	#cd /src
	#tar -zxf zipped/seqan-v2.5.2.tar.gz
	#cd seqan-seqan-v2.5.2
	#mkdir build
	#cd build
	#cmake ..
	#make -j$MAKE_JOBS
	#cd ../build/bin/ 
	#/bin/ls | grep -v ^demo | grep -v ^test | while read B
	#do
	#	strip -g $B
	#	cp -v $B /opt/bin
	#done
	
	#echo ""
	#echo "15b. Install flexbar - Intel threading blocks"
	#echo "============================================="
	#cd /src/zipped
	#wget -q https://github.com/uxlfoundation/oneTBB/archive/refs/tags/4.4.6.tar.gz -O tbb-4.4.6.tar.gz
	#cd /src
	#tar -zxf zipped/tbb-4.4.6.tar.gz
	#cd oneTBB-4.4.6
	
	# Patch for GCC13+
	# Found here: https://github.com/bambulab/BambuStudio/pull/1882/changes/d3459cb1b9f791531fe24b0558c581117243eade
	#cp include/tbb/task.h include/tbb/task.h.old
	#cat include/tbb/task.h.old | \
	#	sed 's/task\* next_offloaded\;/tbb\:\:task\* next_offloaded\;/g' > include/tbb/task.h
		
	# Mangle CXXFLAGS to allow compiling the old code against new GCC
	#CXXFLAGS="-O3 -march=znver4 -pipe -std=c++14" make
	
	#cp -v build/linux_intel64_gcc_cc13_libc2.39_kernel6.8.0_release/*.so /opt/lib
	#cp -v build/linux_intel64_gcc_cc13_libc2.39_kernel6.8.0_release/*.so.2 /opt/lib
	#cp -v -a include/tbb /opt/include
	
	# Reset CXXFLAGS back again
	#export CXXFLAGS="$CFLAGS"
	
	#echo ""
	#echo "15c. Install flexbar"
	#echo "===================="
	#cd /src/zipped
	#wget -q https://github.com/seqan/flexbar/archive/refs/tags/v3.5.0.tar.gz -O flexbar-v3.5.0.tar.gz
	#cd /src
	#tar -zxf zipped/flexbar-v3.5.0.tar.gz 
	#cd flexbar-3.5.0
	# Copy in the seqan 'library' - which is C++ code in header files...
	#cp -a /src/seqan-seqan-v2.5.2/include .
	#cmake .
	#make -j$MAKE_JOBS
	
	echo ""
	echo "16. Install regtools"
	echo "===================="
	cd /src/zipped
	wget -q https://github.com/griffithlab/regtools/archive/refs/tags/1.0.0.tar.gz -O regtools-1.0.0.tar.gz
	cd /src
	tar -zxf zipped/regtools-1.0.0.tar.gz
	cd regtools-1.0.0
	mkdir build
	cd build
	cmake ..
	make -j$MAKE_JOBS
	strip -g regtools
	cp -v regtools /opt/bin
	
	echo ""
	echo "17. Install rseqc"
	echo "================="
	cd /src/zipped
	wget -q https://sourceforge.net/projects/rseqc/files/RSeQC-5.0.1.tar.gz/download -O RSeQC-5.0.1.tar.gz
	cd /src
	# This tar file was created with AD/Domain user owner/group info
	# ignore it when extracting...
	tar --no-same-owner -zxf zipped/RSeQC-5.0.1.tar.gz
	cd RSeQC-5.0.1/
	python setup.py install
	
	echo ""
	echo "18. Install bcftools"
	echo "====================="
	cd /src/zipped
	wget -q https://github.com/samtools/bcftools/releases/download/1.23/bcftools-1.23.tar.bz2 -O bcftools-1.23.tar.bz2
	cd /src
	tar -jxf zipped/bcftools-1.23.tar.bz2
	cd bcftools-1.23
	./configure --prefix=/opt --enable-libgsl
	make
	make install
	
	echo ""
	echo "19. Install bamutil"
	echo "==================="
	cd /src/zipped
	wget -q https://github.com/statgen/bamUtil/archive/refs/tags/v1.0.15.tar.gz -O bamutil-1.0.15.tar.gz
	cd /src
	tar -zxf zipped/bamutil-1.0.15.tar.gz
	cd bamUtil-1.0.15
	
	# GitHub no longer serves the unauthenticated git:// protocol,
	# so patch the clone URLs in Makefile.inc to use https instead.
	cp Makefile.inc Makefile.inc.old
	cat Makefile.inc.old | sed 's/git clone git/git clone https/g' > Makefile.inc
	
	CFLAGS="$BASE_CFLAGS_ALT -I/opt/include"
	make cloneLib
	make
	make install INSTALLDIR=/opt/bin
	strip -g /opt/bin/bam
	CFLAGS="$BASE_CFLAGS -I/opt/include"
	
	echo ""
	echo "20. Install bowtie2"
	echo "==================="
	cd /src/zipped
	wget -q https://github.com/BenLangmead/bowtie2/archive/refs/tags/v2.5.5.tar.gz -O bowtie2-2.5.5.tar.gz
	cd /src
	tar -zxf zipped/bowtie2-2.5.5.tar.gz
	cd bowtie2-2.5.5
	mkdir build
	cd build
	cmake ..
	make -j$MAKE_JOBS
	strip -g bowtie2-*
	cp -v bowtie2-* /opt/bin
	cd ..
	cp -v bowtie2 /opt/bin
	cp -v bowtie2-inspect /opt/bin
	cp -v bowtie2-build /opt/bin
	
	echo ""
	echo "21. Install bcl_convert"
	echo "======================="
	cd /src
	if [ -s /mnt/bcl-convert-4.4.6-2.el8.x86_64.rpm ]
	then
		mkdir bcl-convert
		cd bcl-convert
		rpm2cpio /mnt/bcl-convert-4.4.6-2.el8.x86_64.rpm | cpio -idmv
		cp usr/bin/bcl-convert /opt/bin
	else
		echo "WARNING!!!! - Unable to find bcl_convert.rpm - this will be skipped"
	fi
	
	# Remove all src packages
	echo ""
	echo "Cleaning up downloaded src tree"
	echo "=================================="
	cd
	rm -rf /src
	pip3 cache purge
	
	echo ""
	echo "7. All done"

%environment
	export PATH=/opt/bin:$PATH
	export LD_LIBRARY_PATH=/opt/lib:$LD_LIBRARY_PATH
	export CFLAGS="-O -I/opt/include"
	export CXXFLAGS="$CFLAGS"
	export CC=gcc-14
	export CXX=g++-14
	export FC=gfortran-14
	export OMPI_CC=gcc-14
	export MANPATH=/opt/man
	alias picard="java -jar /opt/bin/picard.jar"

%runscript



Run file

Source this file to make the container.run command available in your shell. Set the IMAGE_NAME variable to the path of the container image you want to use:

#!/bin/bash

module load apptainer

IMAGE_NAME=/nobackup/shared/containers/bioapps.2026.02.sif

container.run() {
	# Run a command inside the container...
	# automatically bind the /scratch and /nobackup dirs
	# pass through any additional parameters given on the command line
	# (quoted so that arguments containing spaces are passed through intact)
	apptainer exec --bind /scratch:/scratch --bind /nobackup:/nobackup "${IMAGE_NAME}" "$@"
}
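
The run file assumes that IMAGE_NAME points at an image that actually exists. A minimal sketch of a pre-flight check follows. Note that container_check is a hypothetical helper that is not part of the original run file, and the default path is simply the one used above.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original run file.
# Defaults to the image path used in the run file above.
IMAGE_NAME="${IMAGE_NAME:-/nobackup/shared/containers/bioapps.2026.02.sif}"

container_check() {
	# Succeed only if the image file exists and is readable
	if [ ! -r "${IMAGE_NAME}" ]; then
		echo "Container image not readable: ${IMAGE_NAME}" >&2
		return 1
	fi
	return 0
}
```

Once the run file has been sourced, a call such as `container_check && container.run hisat2 --version` would then only invoke Apptainer when the image is present.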

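Shell aliases are generally not expanded when a command is passed to apptainer exec, so the picard alias from the %environment section may not be usable through container.run. One possible workaround, sketched here as an assumption rather than a tested part of the container, is to call the jar file directly. The function name container_picard is hypothetical; the jar path and bind options are the ones used above.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original run file.
# Defaults to the image path used in the run file above.
IMAGE_NAME="${IMAGE_NAME:-/nobackup/shared/containers/bioapps.2026.02.sif}"

container_picard() {
	# Run Picard inside the container by calling the jar directly,
	# rather than relying on the picard alias being expanded.
	# All arguments are passed through to Picard unchanged.
	apptainer exec --bind /scratch:/scratch --bind /nobackup:/nobackup \
		"${IMAGE_NAME}" java -jar /opt/bin/picard.jar "$@"
}
```

For example, `container_picard MarkDuplicates --help` should print the help text for that Picard tool, assuming java is installed inside the container.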
