====== Bioapps Container ======

This container is intended to collect //most// of the Bioinformatics software packages commonly used on Rocket and Comet, and provide them in a single, easy to use format, without the complexity of many //module load// and //module unload// commands.

Jump straight to the Bioapps container software lists:

  * [[#individual_software_help|Main software packages included]]
  * [[#python_modules|Python modules]]
  * [[#r_libraries|R libraries]]

----

===== Why A Container? =====

With almost limitless combinations of bioinformatics tools that can be used together, it is very difficult to ensure that any given set of software modules provisioned on Comet can be used alongside any other set of modules. Multiple versions of Python, C compilers and runtimes, and dependencies mean that almost every unique set of software intended to be used together needs to be validated and tested to make sure that no modules conflict. For example, ''bwa'' may need version **X** of a runtime, but ''samtools'' could require version **Y** of the //same// runtime, and therefore it is impossible (or at least strongly inadvisable, due to unpredictable behaviour in such a scenario) to use both tools in the same pipeline at once.

As more and more modules are added, this becomes a //combinatorial explosion// of software and versions, a problem which some of our users have already experienced.

By building all of the common bioinformatics tools in one container, with //one// C compiler, //one// version of Python, and //one// set of dependencies, we can guarantee that the tools in this set will work together without conflicting with each other, and without the unknown side effects of such version conflicts.

It also means we can use one set of tools on a local workstation or HPC without changing our workflow.

----

===== Running on Comet =====

The Bioapps container is stored in the ''/nobackup/shared/containers'' directory and is accessible to __all__ users of Comet. You do //not// need to take a copy of the container file; it should be left in its original location.

You can find the container files here:

  * ''/nobackup/shared/containers/bioapps.2026.03.sif''
  * ''/nobackup/shared/containers/bioapps.2026.02.sif''

We //normally// recommend using the latest (date) version of the container.

**Container Image Versions**

We //may// reference a specific container file, such as **bioapps.2026.03.sif**, but you should always check whether this is the most recent version of the container available. Simply ''ls'' the ''/nobackup/shared/containers'' directory and you will be able to see if there are any newer versions listed.

We have provided a convenience script that automates **all** of the steps needed to run applications inside the container and to access your ''$HOME'', ''/scratch'' and ''/nobackup'' directories, reducing them to just two simple commands.

  * ''/nobackup/shared/containers/bioapps.2026.03.sh''

There is a corresponding ''.sh'' script for //each version// of the container image we make available.

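The version check described above can also be scripted. The following is a minimal sketch only, not part of the provided helper scripts: the function name ''latest_bioapps'' is hypothetical, and it assumes that every image follows the ''bioapps.YYYY.MM.sif'' naming scheme so that the shell's sorted glob expansion matches date order.

```shell
#!/bin/bash
# Hypothetical helper (an illustration, not shipped with the container):
# print the path of the newest bioapps image found in a directory.
# Assumes the bioapps.YYYY.MM.sif naming scheme, so that the sorted
# order of the glob expansion is also the date order.
latest_bioapps() {
    local dir="${1:-/nobackup/shared/containers}"
    local newest=""
    local f
    for f in "$dir"/bioapps.*.sif; do
        # If the glob matched nothing it is left unexpanded; skip that case
        if [ -e "$f" ]; then
            newest="$f"
        fi
    done
    # Fail (non-zero return) when no image was found at all
    [ -n "$newest" ] || return 1
    printf '%s\n' "$newest"
}

# Report the newest image and the helper script that shares its name
if sif="$(latest_bioapps)"; then
    echo "Newest image:  $sif"
    echo "Helper script: ${sif%.sif}.sh"
fi
```

The last line relies on the helper script having the same name as the image, with ''.sh'' in place of ''.sif'', as is the case for the files listed above.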
Just ''source'' the helper script for your chosen container version and it will take care of loading ''apptainer'', setting up your ''bind'' directories and calling the ''exec'' command for you. It gives you a single command called ''container.run'' (instead of the really long //apptainer exec// command) to //run// anything you want inside the container. For example, to run ''bowtie2'':

  $ source /nobackup/shared/containers/bioapps.2026.03.sh
  $ container.run bowtie2 -U /nobackup/proj/my_project/fastq_data/file1.fq

You can continue to use the ''container.run'' command as many times as you need in the same script or the same bash session:

  $ source /nobackup/shared/containers/bioapps.2026.03.sh
  $ container.run bowtie2 -U /nobackup/proj/my_project/fastq_data/file1.fq
  $ container.run samtools
  $ container.run bwameth
  $ container.run hisat2 --version
  $ container.run python3
  $ container.run R

We **strongly** recommend that you use this helper script and the ''container.run'' command to run software from inside the container, as it ensures that the ''bind'' directories are set up correctly and that you are using the intended container version.

----

===== Individual Software Help =====

All of the following software packages are available within the Bioapps container. Please check the //Included from// field to see which version of the container first included the software.

^ Title ^ Included from ^ Files ^ Source Link ^ Description ^
| bamutil | 2026.02+ | ''bam'' | https://github.com/statgen/bamUtil/ | |
| bcftools | 2026.02+ | ''bcftools'' | https://github.com/samtools/bcftools | The plugins (e.g. //counts.so//, //contrast.so//, //prune.so//, etc.) for ''bcftools'' are installed under ''/opt/libexec/bcftools'' and will be used automatically. |
| bcl_convert (1) | 2026.02+ | ''bcl-convert'' | https://emea.support.illumina.com/sequencing/sequencing_software/bcl-convert/downloads.html | Note that the use of ''bcl-convert'' is subject to the following licensing restrictions:\\ \\ - The software can only be used //"... for the purpose of processing and analyzing data generated from an Illumina genetic sequencing instrument owned and operated solely by (the University)"//\\ \\ - The software is only to be used for research purposes\\ \\ - The software can only be used with data generated from the Illumina instrument, and not any data generated from other sources |
| bowtie2 | 2026.02+ | ''bowtie2''\\ ''bowtie2-align-l''\\ ''bowtie2-align-s''\\ ''bowtie2-build-l''\\ ''bowtie2-build-s''\\ ''bowtie2-inspect-l''\\ ''bowtie2-inspect-s'' | https://github.com/BenLangmead/bowtie2 | |
| bwa | 2026.02+ | ''bwa''\\ ''qualfa2fq.pl''\\ ''xa2multi.pl'' | https://github.com/lh3/bwa | |
| bwa-mem2 | 2026.02+ | ''bwa-mem2''\\ ''bwa-mem2.avx''\\ ''bwa-mem2.avx2''\\ ''bwa-mem2.avx512bw''\\ ''bwa-mem2.sse41''\\ ''bwa-mem2.sse42'' | https://github.com/bwa-mem2/bwa-mem2 | Calling ''bwa-mem2'' will automatically select the most suitable version (e.g. ''avx'', ''avx2'' etc). |
| bwa-meth | 2026.02+ | ''bwameth.py''\\ ''bwameth'' | https://github.com/brentp/bwa-meth | The file ''bwameth'' is provided as a symbolic link to ''bwameth.py'' - you do not need to call it via Python, just ''bwameth'' is enough. |
| samtools | 2026.02+ | ''samtools'' | https://github.com/samtools/samtools | |
| sambamba | 2026.02+ | ''sambamba'' | https://github.com/biod/sambamba | |
| seqkit | 2026.02+ | ''seqkit'' | https://github.com/shenwei356/seqkit | |
| methyldackel | 2026.02+ | ''MethylDackel''\\ ''methyldackel'' | https://github.com/dpryan79/MethylDackel | The file ''methyldackel'' is provided as a symbolic link to ''MethylDackel'' for convenience / simplification of capitalisation. |
| minimap2 | 2026.02+ | ''minimap2'' | https://github.com/lh3/minimap2 | |
| bedtools2 | 2026.02+ | ''annotateBed''\\ ''bamToBed''\\ ''bamToFastq''\\ ''bed12ToBed6''\\ ''bedToBam''\\ ''bedToIgv''\\ ''bedpeToBam''\\ ''bedtools''\\ ''closestBed''\\ ''clusterBed''\\ ''complementBed''\\ ''coverageBed''\\ ''expandCols''\\ ''fastaFromBed''\\ ''flankBed''\\ ''genomeCoverageBed''\\ ''getOverlap''\\ ''groupBy''\\ ''intersectBed''\\ ''linksBed''\\ ''mapBed''\\ ''maskFastaFromBed''\\ ''mergeBed''\\ ''multiBamCov''\\ ''multiIntersectBed''\\ ''nucBed''\\ ''pairToBed''\\ ''pairToPair''\\ ''randomBed''\\ ''shiftBed''\\ ''shuffleBed''\\ ''slopBed''\\ ''sortBed''\\ ''subtractBed''\\ ''tagBam''\\ ''unionBedGraphs''\\ ''windowBed''\\ ''windowMaker'' | https://github.com/arq5x/bedtools2 | |
| bam-readcount | 2026.02+ | ''bam-readcount'' | https://github.com/genome/bam-readcount | |
| hisat2 | 2026.02+ | ''hisat2''\\ ''hisat2-align-l''\\ ''hisat2-align-s''\\ ''hisat2-inspect''\\ ''hisat2-inspect-l''\\ ''hisat2-inspect-s''\\ ''hisat2-repeat''\\ ''hisat2_extract_exons.py''\\ ''hisat2_extract_snps_haplotypes_UCSC.py''\\ ''hisat2_extract_snps_haplotypes_VCF.py''\\ ''hisat2_extract_splice_sites.py''\\ ''hisat2_read_statistics.py''\\ ''hisat2_simulate_reads.py''\\ ''extract_exons.py''\\ ''extract_splice_sites.py'' | https://cloud.biohpc.swmed.edu/index.php | |
| stringtie | 2026.02+ | ''stringtie''\\ ''prepDE.py'' | http://ccb.jhu.edu/software/stringtie/dl | |
| gffcompare | 2026.02+ | ''gffcompare''\\ ''trmap'' | http://ccb.jhu.edu/software/stringtie/dl | |
| htseq-count | 2026.02+ | ''htseq-count''\\ ''htseq-count-barcodes''\\ ''htseq-qa'' | https://pypi.org/project/HTSeq/ | The HTSeq installer places all of the files under ''/usr/local/bin'', but this is also added to the ''$PATH''. |
| picard | 2026.02+ | ''picard.jar''\\ shell alias ''picard'' | https://github.com/broadinstitute/picard/releases/download | The shell alias ''alias picard="java -jar /opt/bin/picard.jar"'' is provided to allow you to run this with the single command ''picard''. |
| seqan-library | 2026.02+ | ''alf''\\ ''bam2roi''\\ ''dfi''\\ ''fx_bam_coverage''\\ ''fx_fastq_stats''\\ ''gustaf''\\ ''gustaf_mate_joining''\\ ''insegt''\\ ''mason_frag_sequencing''\\ ''mason_genome''\\ ''mason_materializer''\\ ''mason_methylation''\\ ''mason_simulator''\\ ''mason_splicing''\\ ''mason_tests''\\ ''mason_variator''\\ ''micro_razers''\\ ''pair_align''\\ ''param_chooser''\\ ''rabema_build_gold_standard''\\ ''rabema_do_search''\\ ''rabema_evaluate''\\ ''rabema_prepare_sam''\\ ''razers''\\ ''razers3''\\ ''razers3_quality2prob''\\ ''razers3_simulate_reads''\\ ''rep_sep''\\ ''roi_feature_projection''\\ ''roi_plot_thumbnails''\\ ''s4_join''\\ ''s4_search''\\ ''sak''\\ ''sam2matrix''\\ ''samcat''\\ ''seqan_tcoffee''\\ ''seqcons2''\\ ''sgip''\\ ''splazers''\\ ''stellar''\\ ''tree_recon''\\ ''yara_indexer''\\ ''yara_mapper'' | https://github.com/seqan/seqan | These are the sample tools provided with the seqan-library installation. We have //not// installed all of the //test_// and //demo_// files. |
| regtools | 2026.02+ | ''regtools'' | https://github.com/griffithlab/regtools | |
| rseqc | 2026.02+ | All scripts as listed [[https://rseqc.sourceforge.net/#usage-information|here]] | https://rseqc.sourceforge.net | The RSeQC installer places all files in ''/usr/local/bin'' by default - this is added to ''$PATH'' so you should still be able to call ''bam2fq.py'' without giving the full path, for example. |
| Python | 2026.02+ | ''python3'' | | Currently Python 3.12 |
| R | 2026.03+ | ''R'', ''Rscript'' | | Currently version 4.5.2 |
| GCC | 2026.02+ | ''gcc-14'', ''g++-14'', ''gfortran-14'' | | Please use the existing ''CFLAGS'', ''CXXFLAGS'' and ''CPPFLAGS'' environment variables which were set during the installation of the container; this will ensure the most appropriate performance optimisation flags are retained for any additional software you compile. |

**Note:** All binaries are compiled for the AMD Epyc CPU architecture of Comet with GCC 14 and the ''CFLAGS=-O3 -march=znver5 -pipe'' flags, on top of any existing optimisation flags set by each application. All binaries are also stripped of debugging symbols with ''strip -g'' to reduce their on-disk and in-memory size requirements.

(1) - ''bcl-convert'' is a //vendor// provided binary (proprietary to Illumina) and, unlike all other listed software, has //not// been recompiled for Comet.

----

==== Python Modules ====

In addition to the standalone applications listed above, the following **Python** modules are installed (i.e. they are available if you use the ''python3'' installed in the container with the normal ''import module'' syntax in your code):

^ Module Name ^ Available From ^ Link ^
| HTSeq | 2026.02+ | https://htseq.readthedocs.io/en/latest/ |
| RSeQC | 2026.02+ | https://rseqc.sourceforge.net/ |
| bx_python | 2026.02+ | https://github.com/bxlab/bx-python |
| numpy | 2026.02+ | https://numpy.org/ |
| pybigwig | 2026.02+ | https://github.com/deeptools/pyBigWig |
| pysam | 2026.02+ | https://github.com/pysam-developers/pysam |
| toolshed | 2026.02+ | https://travis-ci.org/brentp/toolshed |

This list only includes modules which have been explicitly installed. The standard //Python// library modules are still available: ''sqlite3'', ''json'', etc.

----

==== R Libraries ====

In addition to the standalone applications listed above, the following **R** libraries are installed (i.e.
they are available to use in the ''R'' and ''Rscript'' commands started from the container, using the normal ''library(module)'' syntax in your code):

^ Library Name ^ Available From ^ Link ^
| AnnotationDbi | 2026.03+ | https://bioconductor.org/packages/release/bioc/html/AnnotationDbi.html |
| BH | 2026.03+ | https://cran.r-project.org/web/packages/BH/index.html |
| Biobase | 2026.03+ | https://bioconductor.org/packages/release/bioc/html/Biobase.html |
| BiocFileCache | 2026.03+ | https://bioconductor.org/packages/release/bioc/html/BiocFileCache.html |
| BiocGenerics | 2026.03+ | https://bioconductor.org/packages/release/bioc/html/BiocGenerics.html |
| BiocManager | 2026.03+ | https://github.com/Bioconductor/BiocManager |
| BiocParallel | 2026.03+ | https://www.bioconductor.org/packages/release/bioc/html/BiocParallel.html |
| BiocVersion | 2026.03+ | https://www.bioconductor.org/packages/release/bioc/html/BiocVersion.html |
| Biostrings | 2026.03+ | https://www.bioconductor.org/packages/release/bioc/html/Biostrings.html |
| DBI | 2026.03+ | https://cran.r-project.org/web/packages/DBI/index.html |
| DESeq2 | 2026.03+ | https://bioconductor.org/packages/release/bioc/html/DESeq2.html |
| DEXSeq | 2026.03+ | https://bioconductor.org/packages/release/bioc/html/DEXSeq.html |
| DelayedArray | 2026.03+ | https://bioconductor.org/packages/release/bioc/html/DelayedArray.html |
| GenomicRanges | 2026.03+ | https://bioconductor.org/packages/release/bioc/html/GenomicRanges.html |
| IRanges | 2026.03+ | https://bioconductor.org/packages/release/bioc/html/IRanges.html |
| KEGGREST | 2026.03+ | https://www.bioconductor.org/packages/release/bioc/html/KEGGREST.html |
| MatrixGenerics | 2026.03+ | https://bioconductor.org/packages/devel/bioc/html/MatrixGenerics.html |
| R6 | 2026.03+ | https://cran.r-project.org/web/packages/R6/index.html |
| RColorBrewer | 2026.03+ | https://cran.r-project.org/web/packages/RColorBrewer/index.html |
| RSQLite | 2026.03+ | https://cran.r-project.org/web/packages/RSQLite/index.html |
| Rcpp | 2026.03+ | https://cran.r-project.org/web/packages/Rcpp/index.html |
| RcppArmadillo | 2026.03+ | https://cran.r-project.org/web/packages/RcppArmadillo/index.html |
| Rhtslib | 2026.03+ | https://bioconductor.org/packages/release/bioc/html/Rhtslib.html |
| Rsamtools | 2026.03+ | https://bioconductor.org/packages/release/bioc/html/Rsamtools.html |
| S4Arrays | 2026.03+ | https://bioconductor.org/packages/release/bioc/html/S4Arrays.html |
| S4Vectors | 2026.03+ | https://bioconductor.org/packages/release/bioc/html/S4Vectors.html |
| S7 | 2026.03+ | https://cran.r-project.org/web/packages/S7/index.html |
| Seqinfo | 2026.03+ | https://bioconductor.org/packages/release/bioc/html/Seqinfo.html |
| SparseArray | 2026.03+ | https://bioconductor.org/packages/release/bioc/html/SparseArray.html |
| SummarizedExperiment | 2026.03+ | https://bioconductor.org/packages/release/bioc/html/SummarizedExperiment.html |
| XML | 2026.03+ | https://cran.r-project.org/web/packages/XML/index.html |
| XVector | 2026.03+ | https://bioconductor.org/packages/release/bioc/html/XVector.html |
| abind | 2026.03+ | https://cran.r-project.org/web/packages/abind/index.html |
| annotate | 2026.03+ | https://www.bioconductor.org/packages/release/bioc/html/annotate.html |
| askpass | 2026.03+ | https://cran.r-project.org/web/packages/askpass/index.html |
| biomaRt | 2026.03+ | https://bioconductor.org/packages/release/bioc/html/biomaRt.html |
| bit | 2026.03+ | https://cran.r-project.org/web/packages/bit/index.html |
| bit64 | 2026.03+ | https://cran.r-project.org/web/packages/bit64/index.html |
| bitops | 2026.03+ | https://cran.r-project.org/web/packages/bitops/index.html |
| blob | 2026.03+ | https://cran.r-project.org/web/packages/blob/index.html |
| cachem | 2026.03+ | https://cran.r-project.org/web/packages/cachem/index.html |
| cli | 2026.03+ | https://cran.r-project.org/web/packages/cli/index.html |
| cpp11 | 2026.03+ | https://cran.r-project.org/web/packages/cpp11/index.html |
| crayon | 2026.03+ | https://cran.r-project.org/web/packages/crayon/index.html |
| curl | 2026.03+ | https://cran.r-project.org/web/packages/curl/index.html |
| dbplyr | 2026.03+ | https://cran.r-project.org/web/packages/dbplyr/index.html |
| dplyr | 2026.03+ | https://cran.r-project.org/web/packages/dplyr/index.html |
| farver | 2026.03+ | https://cran.r-project.org/web/packages/farver/index.html |
| fastmap | 2026.03+ | https://cran.r-project.org/web/packages/fastmap/index.html |
| filelock | 2026.03+ | https://cran.r-project.org/web/packages/filelock/index.html |
| formatR | 2026.03+ | https://cran.r-project.org/web/packages/formatR/index.html |
| futile.logger | 2026.03+ | https://cran.r-project.org/web/packages/futile.logger/index.html |
| futile.options | 2026.03+ | https://cran.r-project.org/web/packages/futile.options/index.html |
| genefilter | 2026.03+ | https://www.bioconductor.org/packages/release/bioc/html/genefilter.html |
| geneplotter | 2026.03+ | https://www.bioconductor.org/packages/release/bioc/html/geneplotter.html |
| generics | 2026.03+ | https://cran.r-project.org/web/packages/generics/index.html |
| ggplot2 | 2026.03+ | https://cran.r-project.org/web/packages/ggplot2/index.html |
| glue | 2026.03+ | https://cran.r-project.org/web/packages/glue/index.html |
| gtable | 2026.03+ | https://cran.r-project.org/web/packages/gtable/index.html |
| hms | 2026.03+ | https://cran.r-project.org/web/packages/hms/index.html |
| httr | 2026.03+ | https://cran.r-project.org/web/packages/httr/index.html |
| httr2 | 2026.03+ | https://cran.r-project.org/web/packages/httr2/index.html |
| hwriter | 2026.03+ | https://cran.r-project.org/web/packages/hwriter/index.html |
| isoband | 2026.03+ | https://cran.r-project.org/web/packages/isoband/index.html |
| jsonlite | 2026.03+ | https://cran.r-project.org/web/packages/jsonlite/index.html |
| labeling | 2026.03+ | https://cran.r-project.org/web/packages/labeling/index.html |
| lambda.r | 2026.03+ | https://cran.r-project.org/web/packages/lambda.r/index.html |
| lifecycle | 2026.03+ | https://cran.r-project.org/web/packages/lifecycle/index.html |
| locfit | 2026.03+ | https://cran.r-project.org/web/packages/locfit/index.html |
| magrittr | 2026.03+ | https://cran.r-project.org/web/packages/magrittr/index.html |
| matrixStats | 2026.03+ | https://cran.r-project.org/web/packages/matrixStats/index.html |
| memoise | 2026.03+ | https://cran.r-project.org/web/packages/memoise/index.html |
| mime | 2026.03+ | https://cran.r-project.org/web/packages/mime/index.html |
| openssl | 2026.03+ | https://cran.r-project.org/web/packages/openssl/index.html |
| pillar | 2026.03+ | https://cran.r-project.org/web/packages/pillar/index.html |
| pkgconfig | 2026.03+ | https://cran.r-project.org/web/packages/pkgconfig/index.html |
| png | 2026.03+ | https://cran.r-project.org/web/packages/png/index.html |
| prettyunits | 2026.03+ | https://cran.r-project.org/web/packages/prettyunits/index.html |
| progress | 2026.03+ | https://cran.r-project.org/web/packages/progress/index.html |
| purrr | 2026.03+ | https://cran.r-project.org/web/packages/purrr/index.html |
| rappdirs | 2026.03+ | https://cran.r-project.org/web/packages/rappdirs/index.html |
| rlang | 2026.03+ | https://cran.r-project.org/web/packages/rlang/index.html |
| scales | 2026.03+ | https://cran.r-project.org/web/packages/scales/index.html |
| snow | 2026.03+ | https://cran.r-project.org/web/packages/snow/index.html |
| statmod | 2026.03+ | https://cran.r-project.org/web/packages/statmod/index.html |
| stringi | 2026.03+ | https://cran.r-project.org/web/packages/stringi/index.html |
| stringr | 2026.03+ | https://cran.r-project.org/web/packages/stringr/index.html |
| sys | 2026.03+ | https://cran.r-project.org/web/packages/sys/index.html |
| tibble | 2026.03+ | https://cran.r-project.org/web/packages/tibble/index.html |
| tidyr | 2026.03+ | https://cran.r-project.org/web/packages/tidyr/index.html |
| tidyselect | 2026.03+ | https://cran.r-project.org/web/packages/tidyselect/index.html |
| utf8 | 2026.03+ | https://cran.r-project.org/web/packages/utf8/index.html |
| vctrs | 2026.03+ | https://cran.r-project.org/web/packages/vctrs/index.html |
| viridisLite | 2026.03+ | https://cran.r-project.org/web/packages/viridisLite/index.html |
| withr | 2026.03+ | https://cran.r-project.org/web/packages/withr/index.html |
| xml2 | 2026.03+ | https://cran.r-project.org/web/packages/xml2/index.html |
| xtable | 2026.03+ | https://cran.r-project.org/web/packages/xtable/index.html |

This list only includes the R libraries which have been explicitly installed, or brought in as dependencies by other libraries. The //standard// R libraries are still available: ''base'', ''splines'', ''stats'', ''utils'', etc.

----

===== Building Bioapps Container =====

**Important** This section is only relevant to RSE HPC staff or users wanting to understand how the container image is built. If you are intending to simply //use// the software you **do not** need to read this section - turn back now!

**Build script:**

  * Note that the build script will automatically tag the container filename with ''YYYY.MM'' for a simple version naming scheme.
  * To install ''bcl-convert'' the build script must be run from a directory which has a copy of ''bcl-convert-4.4.6-2.el8.x86_64.rpm'' - this is //not// free to download - if it is not found then the installation will skip it.

  #!/bin/bash
  IMAGE_DATE=`date +%Y.%m`
  echo "Loading modules..."
  module load apptainer
  echo ""
  echo "Building container..."
  export APPTAINER_TMPDIR=/scratch
  echo ""
  echo "Container will have date suffix $IMAGE_DATE"
  # You must supply a copy of bcl-convert*.rpm in this
  # folder below. If it is not present then the install
  # of this tool will be skipped.
  SOURCE_DIR=`pwd`
  BCL_RPM="bcl-convert-4.4.6-2.el8.x86_64.rpm"
  echo ""
  echo "Checking source files..."
  if [ -s "$SOURCE_DIR/$BCL_RPM" ]
  then
      echo "- Found - $SOURCE_DIR/$BCL_RPM"
  else
      echo "- WARNING - $SOURCE_DIR/$BCL_RPM is MISSING"
      echo ""
      echo "Press return to continue or Control+C to exit and fix"
      read
  fi
  apptainer build --bind $SOURCE_DIR:/mnt bioapps.$IMAGE_DATE.sif bioapps.def 2>&1 | tee bioapps.log

**Container definition:**

  Bootstrap: docker
  From: ubuntu:noble
  ####################################################################
  #
  # Bio apps container
  # ==================
  # This is a runtime environment for a large set of bioinformatics tools.
  # Please see:
  # https://hpc.researchcomputing.ncl.ac.uk/dokuwiki/dokuwiki/doku.php?id=advanced:software:bioapps
  #
  # ======================================
  #
  # NAME : WORKING
  # LINK
  #
  # ======================================
  # bamutil : Yes
  # https://github.com/statgen/bamUtil/
  #
  # bcftools :
  # https://github.com/samtools/bcftools
  #
  # bowtie2 : Yes
  # https://github.com/BenLangmead/bowtie2
  #
  # Bwa : Yes
  # https://github.com/lh3/bwa
  #
  # Bwa-mem2 : Yes
  # https://github.com/bwa-mem2/bwa-mem2/releases/tag/v2.3
  #
  # Bwa-meth : Yes
  # https://github.com/brentp/bwa-meth
  #
  # Samtools : Yes
  # https://github.com/samtools/samtools
  #
  # Sambamba : Yes
  # https://github.com/biod/sambamba
  #
  # Seqkit : Yes
  # https://github.com/shenwei356/seqkit
  #
  # methyldackel : Yes
  # https://github.com/dpryan79/MethylDackel
  #
  # minimap : Yes
  # https://github.com/lh3/minimap2
  #
  # bedtools2 : Yes
  # https://github.com/arq5x/bedtools2/archive/refs/heads/master.zip
  #
  # bam-readcount : Yes
  # https://github.com/genome/bam-readcount
  #
  # hisat2 : Yes
  # https://cloud.biohpc.swmed.edu/index.php/s/fE9QCsX3NH4QwBi/download
  #
  # StringTie : Yes
  # http://ccb.jhu.edu/software/stringtie/dl
  #
  # gffcompare : Yes
  # http://ccb.jhu.edu/software/stringtie/dl
  #
  # htseq-count : Yes
  # https://pypi.python.org/packages/source/H/HTSeq
  #
  # picard : Yes
  # https://github.com/broadinstitute/picard/releases/download
  #
  # seqan-library : Yes
  # https://github.com/seqan/seqan
  #
  # regtools : Yes
  # https://github.com/griffithlab/regtools
  #
  # RSeQC : Yes
  # https://rseqc.sourceforge.net/#download-rseqc
  #
  ####################################################################
  %post
      # Prevent interactive prompts
      export DEBIAN_FRONTEND=noninteractive
      ####################################################################
      #
      # Basic system packages
      #
      ####################################################################
      # Update & install only necessary packages
      apt-get update
      apt-get install -y apt-utils wget autoconf cmake rpm2cpio cpio build-essential man-db tar unzip git aptitude golang-go python3-pip gcc-14 g++-14 gfortran-14 openmpi-bin openmpi-common libopenmpi-dev libgomp1 autoconf vim libhts-dev libncurses-dev libbz2-dev liblz4-dev openjdk-25-jre libbigwig-dev libgsl-dev libxml2-dev libssl-dev libpng-dev liblapack-dev
      ln -s /usr/bin/python3 /usr/bin/python
      # Clean up APT cache to save space
      apt-get clean
      # Any Python modules installed via pip go here
      # pip install NAME --break-system-packages
      # Remove any Python cache files after pip
      pip3 cache purge
      #################################################################################
      #
      # This is all the custom stuff needed to build the various bioinformatics tools
      #
      #################################################################################
      # This flag needs to be set to indicate which CPU architecture we
      # are optimising for.
AMD_ARCH=1 if [ "$AMD_ARCH" = "1" ] then # Compiling on AMD Epyc export BASE_CFLAGS="-O3 -march=znver5 -pipe" export BASE_CFLAGS_ALT="-O3 -march=native -pipe" export MAKE_JOBS=8 else # Compiling on generic system export BASE_CFLAGS="-O" export BASE_CFLAGS_ALT="-O" export MAKE_JOBS=8 fi export CPPFLAGS="" export CFLAGS="$BASE_CFLAGS -I/opt/include" export CFLAGS_ALT="$BASE_CFLAGS_ALT -I/opt/include" export CXXFLAGS="$CFLAGS" export CC=gcc-14 export CXX=g++-14 export FC=gfortran-14 export LD_LIBRARY_PATH=/opt/lib:$LD_LIBRARY_PATH export PATH=/opt/bin:$PATH ############################################################################### # Tell R to use the newer version of GCC when it needs to compile. # R 'helpfully' ignores standard CC/CFLAG/etc environment variables and # uses its own mechanism for setting the C/C++ and optimisation flags to # use. Override those by writing /root/.R/Makevars instead. ############################################################################### mkdir -p /root/.R/ echo "CC=$CC" > /root/.R/Makevars echo "CXX=$CXX" >> /root/.R/Makevars echo "CFLAGS=$CFLAGS" >> /root/.R/Makevars echo "CXXFLAGS=$CFLAGS" >> /root/.R/Makevars echo "CMAKE_C_COMPILER=$CC" >> /root/.R/Makevars echo "CMAKE_CXX_COMPILER=$CXX" >> /root/.R/Makevars echo "F77=$FC" >> /root/.R/Makevars echo "" echo "Post-OS-install setup for Bio apps container" echo "============================================" # A download place for external libraries mkdir -p /src/zipped # Where installations go mkdir -p /opt/bin mkdir -p /opt/include mkdir -p /opt/lib mkdir -p /opt/man echo "" echo "0a. 
Install latest R" echo "===================" apt-get install -y --no-install-recommends software-properties-common dirmngr wget -qO- https://cloud.r-project.org/bin/linux/ubuntu/marutter_pubkey.asc | tee -a /etc/apt/trusted.gpg.d/cran_ubuntu_key.asc add-apt-repository "deb https://cloud.r-project.org/bin/linux/ubuntu $(lsb_release -cs)-cran40/" apt-get install -y --no-install-recommends r-base echo "" echo "0b. Install R modules" echo "=====================" # Install BioConductor Rscript -e 'install.packages("BiocManager", repos="https://cloud.r-project.org")' Rscript -e 'BiocManager::install(version = "3.22")' # Install DEXSeq Rscript -e 'BiocManager::install("DEXSeq")' echo "" echo "1. Download / install bwa" echo "=========================" cd /src/zipped wget -q https://github.com/lh3/bwa/archive/refs/tags/v0.7.19.tar.gz -O bwa-v0.7.19.tar.gz cd /src tar -zxf zipped/bwa-v0.7.19.tar.gz cd bwa-0.7.19/ cp Makefile Makefile.old # Strip out hardcoded CC and CFLAGS to use our own cat Makefile.old | grep -v "^CC=" | grep -v "^CFLAGS=" > Makefile make -j$MAKE_JOBS strip -g bwa cp -v bwa /opt/bin cp -v qualfa2fq.pl /opt/bin cp -v xa2multi.pl /opt/bin cp -v bwa.1 /opt/man echo "" echo "2. 
Download / install bwa-mem2" echo "==============================" cd /src/zipped wget -q https://github.com/bwa-mem2/bwa-mem2/releases/download/v2.3/Source_code_including_submodules.tar.gz -O bwa-mem2-v2.3.tar.gz cd /src tar -zxf zipped/bwa-mem2-v2.3.tar.gz cd bwa-mem2-2.3 # bwa-mem2 Patch 1 cp ext/safestringlib/safeclib/abort_handler_s.c ext/safestringlib/safeclib/abort_handler_s.c.old cat ext/safestringlib/safeclib/abort_handler_s.c.old | \ sed 's/#include "safeclib_private.h"/#include \n#include "safeclib_private.h"/g' > ext/safestringlib/safeclib/abort_handler_s.c # bwa-mem2 Patch 2 cp ext/safestringlib/safeclib/strcasecmp_s.c ext/safestringlib/safeclib/strcasecmp_s.c.old cat ext/safestringlib/safeclib/strcasecmp_s.c.old | \ sed 's/#include "safeclib_private.h"/#include \n#include "safeclib_private.h"/g' > ext/safestringlib/safeclib/strcasecmp_s.c # bwa-mem2 Patch 3 cp ext/safestringlib/safeclib/strcasestr_s.c ext/safestringlib/safeclib/strcasestr_s.c.old cat ext/safestringlib/safeclib/strcasestr_s.c.old | \ sed 's/#include "safeclib_private.h"/#include \n#include "safeclib_private.h"/g' > ext/safestringlib/safeclib/strcasestr_s.c make -j$MAKE_JOBS strip -g bwa-mem2* cp -v bwa-mem2* /opt/bin echo "" echo "3a. Download / install bwa-meth - toolshed" echo "==========================================" cd /src/zipped wget -q https://pypi.python.org/packages/source/t/toolshed/toolshed-0.4.0.tar.gz -O toolshed-0.4.0.tar.gz cd /src tar -zxf zipped/toolshed-0.4.0.tar.gz cd toolshed-0.4.0 python setup.py install echo "" echo "3b. Download / install bwa-meth" echo "==========================================" cd /src/zipped wget -q https://github.com/brentp/bwa-meth/archive/master.zip -O bwa-meth.zip cd /src unzip zipped/bwa-meth.zip cd bwa-meth-master cp -v bwameth.py /opt/bin ln -sv /opt/bin/bwameth.py /opt/bin/bwameth echo "" echo "4. 
Download / install samtools" echo "==============================" cd /src/zipped wget -q https://github.com/samtools/samtools/releases/download/1.23/samtools-1.23.tar.bz2 -O samtools-1.23.tar.bz2 cd /src tar -jxf zipped/samtools-1.23.tar.bz2 cd samtools-1.23 ./configure --prefix=/opt make -j$MAKE_JOBS make install strip -g /opt/bin/samtools echo "" echo "5a. Download / install sambamba - ldc" echo "=====================================" cd /src/zipped wget -q https://github.com/ldc-developers/ldc/releases/download/v1.42.0-beta3/ldc2-1.42.0-beta3-linux-x86_64.tar.xz -O ldc2-1.42.0-beta3-linux-x86_64.tar.xz cd /src tar -xf zipped/ldc2-1.42.0-beta3-linux-x86_64.tar.xz # Temporarily add ldc2 to the path PATH=/src/ldc2-1.42.0-beta3-linux-x86_64/bin:$PATH LIBRARY_PATH=/src/ldc2-1.42.0-beta3-linux-x86_64/lib echo "" echo "5b. Download / install sambamba" echo "===============================" cd /src/zipped wget -q https://github.com/biod/sambamba/archive/refs/heads/master.zip -O sambamba-master.zip cd /src unzip zipped/sambamba-master.zip cd sambamba-master CC=gcc-14 make release strip -g bin/sambamba-1.0.1 cp -v bin/sambamba-1.0.1 /opt/bin/sambamba echo "" echo "6. Download / install seqkit" echo "============================" cd /src/zipped wget -q https://github.com/shenwei356/seqkit/archive/refs/tags/v2.12.0.tar.gz -O seqkit-v2.12.0.tar.gz cd /src tar -zxf zipped/seqkit-v2.12.0.tar.gz cd seqkit-2.12.0/seqkit go build strip -g seqkit cp -v seqkit /opt/bin echo "" echo "6. Download / install methyldackel" echo "==================================" cd /src/zipped wget -q https://github.com/dpryan79/MethylDackel/archive/refs/tags/0.6.1.tar.gz -O methyldackel-0.6.1.tar.gz cd /src tar -zxf zipped/methyldackel-0.6.1.tar.gz cd MethylDackel-0.6.1/ make -j$MAKE_JOBS LIBBIGWIG=/lib/x86_64-linux-gnu/libBigWig.a strip -g MethylDackel cp -v MethylDackel /opt/bin ln -s /opt/bin/MethylDackel /opt/bin/methyldackel echo "" echo "7. 
Download / install minimap2"
echo "=================================="
cd /src/zipped
wget -q https://github.com/lh3/minimap2/archive/refs/tags/v2.30.tar.gz -O minimap2-v2.30.tar.gz
cd /src
tar -zxf zipped/minimap2-v2.30.tar.gz
cd minimap2-2.30
cp Makefile Makefile.old
# Strip out hardcoded CFLAGS to use our own
cat Makefile.old | grep -v "^CFLAGS=" > Makefile
make -j$MAKE_JOBS
strip -g minimap2
cp -v minimap2 /opt/bin

echo ""
echo "8. Download / install bedtools2"
echo "==============================="
cd /src/zipped
wget -q https://github.com/arq5x/bedtools2/archive/refs/heads/master.zip -O bedtools2-master.zip
cd /src
unzip zipped/bedtools2-master.zip
cd bedtools2-master
cp Makefile Makefile.old
# Strip out hardcoded compiler name to use our own
cat Makefile.old | sed 's/= g++/= g++-14/g' > Makefile
make -j$MAKE_JOBS
strip -g bin/bedtools
cp -v bin/* /opt/bin

echo ""
echo "9. Install bam-readcount"
echo "========================"
cd /src/zipped
wget -q https://github.com/genome/bam-readcount/archive/refs/heads/master.zip -O bam-readcount-master.zip
cd /src
unzip zipped/bam-readcount-master.zip
cd bam-readcount-master
mkdir build
cd build
cmake ..
# This does not like parallel builds - it ends up out of sequence...
make
strip -g bin/bam-readcount
cp -v bin/bam-readcount /opt/bin

echo ""
echo "10. 
Install hisat2"
echo "=================="
cd /src/zipped
wget -q https://cloud.biohpc.swmed.edu/index.php/s/fE9QCsX3NH4QwBi/download -O hisat2-2.2.1.zip
cd /src
unzip zipped/hisat2-2.2.1.zip
cd hisat2-2.2.1
cp Makefile Makefile.old
# Strip out hardcoded compiler name to use our own
cat Makefile.old | \
    sed 's/CC = /CC = gcc-14 #/g' | \
    sed 's/CPP = /CPP = g++-14 #/g' | \
    sed 's/RELEASE_FLAGS =/RELEASE_FLAGS = $(CFLAGS) /g' > Makefile
make -j$MAKE_JOBS
strip -g hisat2-align-l
strip -g hisat2-align-s
strip -g hisat2-build-l
strip -g hisat2-build-s
strip -g hisat2-inspect-l
strip -g hisat2-inspect-s
strip -g hisat2-repeat
cp -v hisat2 hisat2-align* hisat2-inspect* hisat2-repeat hisat2_*.py extract_*.py /opt/bin

echo ""
echo "11. Install stringtie"
echo "======================"
cd /src/zipped
wget -q https://ccb.jhu.edu/software/stringtie/dl/stringtie-3.0.3.tar.gz -O stringtie-3.0.3.tar.gz
cd /src
tar -zxf zipped/stringtie-3.0.3.tar.gz
cd stringtie-3.0.3/
make -j$MAKE_JOBS release
strip -g stringtie
cp -v stringtie /opt/bin
cp -v prepDE.py3 /opt/bin/prepDE.py

echo ""
echo "12. Install gffcompare"
echo "======================"
cd /src/zipped
wget -q https://ccb.jhu.edu/software/stringtie/dl/gffcompare-0.12.9.tar.gz -O gffcompare-0.12.9.tar.gz
cd /src
tar -zxf zipped/gffcompare-0.12.9.tar.gz
cd gffcompare-0.12.9
make -j$MAKE_JOBS
strip -g gffcompare
strip -g trmap
cp -v gffcompare /opt/bin
cp -v trmap /opt/bin

echo ""
echo "13. Install htseq"
echo "================="
pip3 install HTSeq --break-system-packages

echo ""
echo "14. Install picard"
echo "=================="
cd /src/zipped
wget -q https://github.com/broadinstitute/picard/releases/download/3.4.0/picard.jar -O picard-3.4.0.jar
cd /src
cp -v zipped/picard-3.4.0.jar /opt/bin/picard.jar
# We also set up a "java -jar picard.jar" helper alias
# via an entry in the post-install %environment section

echo ""
echo "15a. 
Install flexbar - seqan"
echo "============================"
#cd /src/zipped
#wget -q https://github.com/seqan/seqan/archive/refs/tags/seqan-v2.5.2.tar.gz -O seqan-v2.5.2.tar.gz
#cd /src
#tar -zxf zipped/seqan-v2.5.2.tar.gz
#cd seqan-seqan-v2.5.2
#mkdir build
#cd build
#cmake ..
#make -j$MAKE_JOBS
#cd ../build/bin/
#/bin/ls | grep -v ^demo | grep -v ^test | while read B
#do
#    strip -g $B
#    cp -v $B /opt/bin
#done

#echo ""
#echo "15b. Install flexbar - Intel threading blocks"
#echo "============================================="
#cd /src/zipped
#wget -q https://github.com/uxlfoundation/oneTBB/archive/refs/tags/4.4.6.tar.gz -O tbb-4.4.6.tar.gz
#cd /src
#tar -zxf zipped/tbb-4.4.6.tar.gz
#cd oneTBB-4.4.6
# Patch for GCC13+
# Found here: https://github.com/bambulab/BambuStudio/pull/1882/changes/d3459cb1b9f791531fe24b0558c581117243eade
#cp include/tbb/task.h include/tbb/task.h.old
#cat include/tbb/task.h.old | \
#    sed 's/task\* next_offloaded\;/tbb\:\:task\* next_offloaded\;/g' > include/tbb/task.h
# Mangle CXXFLAGS to allow compiling the old code against new GCC
#CXXFLAGS="-O3 -march=znver4 -pipe -std=c++14" make
#cp -v build/linux_intel64_gcc_cc13_libc2.39_kernel6.8.0_release/*.so /opt/lib
#cp -v build/linux_intel64_gcc_cc13_libc2.39_kernel6.8.0_release/*.so.2 /opt/lib
#cp -v -a include/tbb /opt/include
# Reset CXXFLAGS back again
#export CXXFLAGS="$CFLAGS"

#echo ""
#echo "15c. Install flexbar"
#echo "===================="
#cd /src/zipped
#wget -q https://github.com/seqan/flexbar/archive/refs/tags/v3.5.0.tar.gz -O flexbar-v3.5.0.tar.gz
#cd /src
#tar -zxf zipped/flexbar-v3.5.0.tar.gz
#cd flexbar-3.5.0
# Copy in the seqan 'library' - which is C++ code in header files...
#cp -a /src/seqan-seqan-v2.5.2/include .
#cmake .
#make -j$MAKE_JOBS

echo ""
echo "16. 
Install regtools"
echo "===================="
cd /src/zipped
wget -q https://github.com/griffithlab/regtools/archive/refs/tags/1.0.0.tar.gz -O regtools-1.0.0.tar.gz
cd /src
tar -zxf zipped/regtools-1.0.0.tar.gz
cd regtools-1.0.0
mkdir build
cd build
cmake ..
make -j$MAKE_JOBS
strip -g regtools
cp -v regtools /opt/bin

echo ""
echo "17. Install rseqc"
echo "================="
cd /src/zipped
wget -q https://sourceforge.net/projects/rseqc/files/RSeQC-5.0.1.tar.gz/download -O RSeQC-5.0.1.tar.gz
cd /src
# This tar file was created with AD/Domain user owner/group info
# ignore it when extracting...
tar --no-same-owner -zxf zipped/RSeQC-5.0.1.tar.gz
cd RSeQC-5.0.1/
python setup.py install

echo ""
echo "18. Install bcftools"
echo "====================="
cd /src/zipped
wget -q https://github.com/samtools/bcftools/releases/download/1.23/bcftools-1.23.tar.bz2 -O bcftools-1.23.tar.bz2
cd /src
tar -jxf zipped/bcftools-1.23.tar.bz2
cd bcftools-1.23
./configure --prefix=/opt --enable-libgsl
make
make install

echo ""
echo "19. Install bamutil"
echo "==================="
cd /src/zipped
wget -q https://github.com/statgen/bamUtil/archive/refs/tags/v1.0.15.tar.gz -O bamutil-1.0.15.tar.gz
cd /src
tar -zxf zipped/bamutil-1.0.15.tar.gz
cd bamUtil-1.0.15
# Public git://github.com calls no longer work in 2026+
# Patch it out to https instead.
cp Makefile.inc Makefile.inc.old
cat Makefile.inc.old | sed 's/git clone git/git clone https/g' > Makefile.inc
CFLAGS="$BASE_CFLAGS_ALT -I/opt/include"
make cloneLib
make
make install INSTALLDIR=/opt/bin
strip -g /opt/bin/bam
CFLAGS="$BASE_CFLAGS -I/opt/include"

echo ""
echo "20. Install bowtie2"
echo "==================="
cd /src/zipped
wget -q https://github.com/BenLangmead/bowtie2/archive/refs/tags/v2.5.5.tar.gz -O bowtie2-2.5.5.tar.gz
cd /src
tar -zxf zipped/bowtie2-2.5.5.tar.gz
cd bowtie2-2.5.5
mkdir build
cd build
cmake ..
make -j$MAKE_JOBS
strip -g bowtie2-*
cp -v bowtie2-* /opt/bin
cd ..
cp -v bowtie2 /opt/bin
cp -v bowtie2-inspect /opt/bin
cp -v bowtie2-build /opt/bin

echo ""
echo "21. Install bcl_convert"
echo "======================="
cd /src
if [ -s /mnt/bcl-convert-4.4.6-2.el8.x86_64.rpm ]
then
    mkdir bcl-convert
    cd bcl-convert
    rpm2cpio /mnt/bcl-convert-4.4.6-2.el8.x86_64.rpm | cpio -idmv
    cp usr/bin/bcl-convert /opt/bin
else
    echo "WARNING!!!! - Unable to find bcl_convert.rpm - this will be skipped"
fi

# Remove all src packages
echo ""
echo "Cleaning up downloaded src tree"
echo "=================================="
cd
rm -rf /src
pip3 cache purge

echo ""
echo "22. All done"

%environment
    export PATH=/opt/bin:$PATH
    export LD_LIBRARY_PATH=/opt/lib:$LD_LIBRARY_PATH
    export CFLAGS="-O -I/opt/include"
    export CXXFLAGS="$CFLAGS"
    export CC=gcc-14
    export CXX=g++-14
    export FC=gfortran-14
    export OMPI_CC=gcc-14
    export MANPATH=/opt/man
    alias picard="java -jar /opt/bin/picard.jar"

%runscript

**Run file**

You should ''source'' this file in order to use the ''container.run'' command. The ''IMAGE_NAME'' variable should be set to the name of the current container image:

#!/bin/bash

module load apptainer
IMAGE_NAME=/nobackup/shared/containers/bioapps.2026.02.sif

container.run() {
    # Run a command inside the container...
    # automatically bind the /scratch and /nobackup dirs
    # pass through any additional parameters given on the command line
    # ("$@" is quoted so that arguments containing spaces stay intact)
    apptainer exec --bind /scratch:/scratch --bind /nobackup:/nobackup "${IMAGE_NAME}" "$@"
}
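
**Passing arguments through the wrapper**

The ''container.run'' function forwards whatever follows it on the command line to ''apptainer exec''. How ''$@'' is quoted inside such a wrapper decides what happens to arguments that contain spaces, for example a file called ''my sample.bam''. The following is a minimal sketch of that behaviour only. The names ''show_args'', ''run_unquoted'' and ''run_quoted'' are hypothetical stand-ins and are not part of the Bioapps scripts; ''show_args'' takes the place of the real ''apptainer exec'' call so that the sketch can run on any machine.

```shell
#!/bin/sh
# Hypothetical stand-in for "apptainer exec ... IMAGE": prints each
# argument it receives on its own line, wrapped in brackets.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# Wrapper that forwards an unquoted $@ : the shell re-splits the
# arguments on whitespace before passing them on.
run_unquoted() {
    show_args $@
}

# Wrapper that forwards a quoted "$@" : every argument is passed
# on exactly as it was given.
run_quoted() {
    show_args "$@"
}

echo "unquoted:"
run_unquoted samtools view "my sample.bam"

echo "quoted:"
run_quoted samtools view "my sample.bam"
```

The unquoted form hands on ''my'' and ''sample.bam'' as two separate arguments, while the quoted form keeps ''my sample.bam'' as one. If your file names contain spaces, it may be worth checking which form the ''.sh'' script you source uses.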