Our Research Projects

BSU Pilot

This is a historic project that previously made use of HPC facilities at Newcastle University.

Project Contacts

For further information about this project, please contact:


Project Description

Application-based processing of large datasets.


Software or Compute Methods

The software requirements were discussed previously and are reproduced below:

Samtools (along with HTSlib and BCFtools): http://www.htslib.org

BWA: https://github.com/lh3/bwa

TEBreak: https://github.com/adamewing/tebreak
- This is a Python library with a number of dependencies. On our cluster I have installed each of them as a separate Environment Module, as they will often be useful to other users in their own right.

With respect to using TEBreak, it would also be good to have the most recent version of Python 2.

TEBreak dependencies: BWA and Samtools (as above), plus:
LAST: http://last.cbrc.jp/
Minia: http://minia.genouest.org/
Exonerate: https://github.com/adamewing/exonerate.git

Those tools will let me run the core workflow I’m after, but it would also be good to do some generic variant calling on these samples, so for that:

GATK: https://software.broadinstitute.org/gatk/ (currently 3.7-0)
Picard Tools: https://broadinstitute.github.io/picard/

Both of these need Java 8, preferably the most recent release. GATK doesn’t support OpenJDK, so it needs to be the Oracle JDK.
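
Once the modules are loaded, a quick check that the requested tools are actually visible on the PATH might look something like the sketch below. It is only an illustration, not part of the original request: the executable names (for example `lastal` for LAST) and the helper names `find_on_path` and `missing_tools` are assumptions, and the list should be adjusted to match how the software is installed locally. It sticks to the standard library so that it should run under Python 2 as well as Python 3.

```python
import os

# Executable names are assumptions for illustration; adjust to the local install.
REQUIRED_TOOLS = ["samtools", "bcftools", "bwa", "lastal", "minia", "exonerate", "java"]


def find_on_path(name, path=None):
    """Return the full path of an executable called `name`, or None if absent."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        candidate = os.path.join(directory, name)
        # Must be a regular file that the current user may execute.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def missing_tools(names, path=None):
    """Return the subset of `names` that cannot be found on the search path."""
    return [name for name in names if find_on_path(name, path) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All listed tools were found on PATH.")
```

This only confirms that an executable with the given name can be found; it does not check versions, so the Java 8 and Python 2 requirements above would still need to be verified separately.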