Our Research Projects

KnockoffGWAS Framework

This active project currently uses HPC facilities at Newcastle University.

Project Contacts

For further information about this project, please contact:


Project Description

This project aims to build a user-friendly, end-to-end knockoffGWAS pipeline and apply it to large-scale Primary Biliary Cholangitis (PBC) datasets. We are developing scripts that use existing methods to generate high-quality knockoff genotypes, run knockoff-based association tests, and streamline quality control, modelling, and reporting. The pipeline is designed to make rigorous false-discovery-rate–controlled analyses accessible to researchers working with complex, high-dimensional genetic data. By applying this framework to PBC, the project seeks to identify robust, likely causal genetic loci and provide clearer insight into the biological mechanisms driving disease susceptibility.
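To illustrate how a knockoff-based test can control the false discovery rate, the sketch below applies the standard knockoff+ selection rule to a vector of per-variant statistics W, where a large positive value means the real genotype looked more important than its knockoff copy. It is a minimal Python illustration, not the project's R code; the function names, the example statistics, and the target FDR level are all assumptions made for this example.

```python
import math


def knockoff_plus_threshold(w, q):
    """Smallest threshold t whose estimated false discovery proportion is <= q.

    w : list of knockoff statistics (hypothetical values, one per variant)
    q : target false discovery rate (assumed, e.g. 0.1 or 0.2)
    Returns math.inf when no threshold meets the target.
    """
    # Candidate thresholds are the distinct non-zero magnitudes of W.
    candidates = sorted({abs(x) for x in w if x != 0})
    for t in candidates:
        # Large negative statistics estimate how many selections are false;
        # the +1 is the conservative "knockoff+" correction.
        false_estimate = 1 + sum(1 for x in w if x <= -t)
        selected = sum(1 for x in w if x >= t)
        if false_estimate / max(1, selected) <= q:
            return t
    return math.inf


def knockoff_select(w, q):
    """Indices of variants whose statistic reaches the knockoff+ threshold."""
    t = knockoff_plus_threshold(w, q)
    return [j for j, x in enumerate(w) if x >= t]


# Hypothetical statistics for ten variants.
example_w = [5.0, 4.0, 3.0, -1.0, 2.5, 0.5, -0.2, 6.0, 3.5, 2.0]
selected_at_20_percent = knockoff_select(example_w, q=0.2)
```

With these example values the threshold at q = 0.2 is 2.0, so seven variants are selected; at q = 0.1 no threshold qualifies and nothing is selected, which shows how a stricter target can return an empty result when the number of variants is small.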


Software or Compute Methods

The workflow is built primarily in R, making use of key packages such as tidyverse, fastcluster, latex2exp, grid, gridExtra, BiocManager, bigstatsr, and bigsnpr. These are integrated with a compiled C++ knockoff generator and additional genetics tools including PLINK, RaPID, and BCFtools. The pipeline is organised as a series of modular, chromosome-level jobs that can be run in parallel through the cluster’s scheduling system. Compute-intensive steps, such as genotype quality control, knockoff generation, and statistical modelling, are executed on high-memory, multi-core nodes. All intermediate outputs are written to the cluster’s high-throughput shared filesystem to support efficient job coordination and downstream analysis.
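The chromosome-level structure described above can be sketched as follows. This is a simplified Python stand-in rather than the project's actual scripts: a local thread pool takes the place of the cluster scheduler, and the stage names, function names, and worker count are hypothetical. Each chromosome runs its stages in order, while different chromosomes run independently of one another.

```python
from concurrent.futures import ThreadPoolExecutor

# Placeholder stage names; the real pipeline's steps and order may differ.
STAGES = ("qc", "knockoffs", "association")


def run_chromosome(chrom):
    """Run every stage, in order, for a single chromosome.

    Returns a list of labels recording which stages completed. A real job
    would invoke the relevant R scripts or command-line tools at each stage.
    """
    completed = []
    for stage in STAGES:
        completed.append(f"chr{chrom}:{stage}")
    return completed


def run_all(chromosomes, max_workers=4):
    """Process chromosomes in parallel and collect results per chromosome."""
    chromosomes = list(chromosomes)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the inputs.
        results = list(pool.map(run_chromosome, chromosomes))
    return dict(zip(chromosomes, results))


# Autosomes 1 to 22, one independent job each.
job_results = run_all(range(1, 23))
```

Keeping each chromosome as a self-contained unit means a failed job can be rerun without repeating the others, and the per-chromosome outputs can be combined in a single downstream step.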