This project is active and currently uses the HPC facilities at Newcastle University.
For further information about this project, please contact:
Project Description
This project uses high-performance computing to analyse transcriptomic datasets generated from human induced pluripotent stem cell-derived cardiomyocytes and cardiac fibroblasts exposed to anthracycline-associated stress. Bulk RNA sequencing data are processed and analysed to identify differential gene expression, pathway enrichment, and coordinated biological programmes related to cellular senescence, mitochondrial dysfunction, innate immune signalling, and extracellular matrix remodelling.
Computational workflows include quality control, alignment and quantification of RNA-seq data, differential expression analysis, functional enrichment using Gene Ontology and pathway databases, and integrative comparisons between cardiomyocyte and fibroblast datasets to interrogate paracrine signalling mechanisms. The project aims to define durable transcriptional responses to transient injury and their potential contribution to maladaptive myocardial remodelling.
HPC resources are required to ensure reproducible, scalable, and efficient processing of multi-sample transcriptomic datasets and downstream bioinformatic analyses.
This project will use established open-source bioinformatics software and scripting languages to process and analyse bulk RNA sequencing datasets. Core tools include standard RNA-seq quality control, alignment, and quantification software, followed by statistical analysis and visualisation in R and Python. Differential gene expression analysis will be performed using widely adopted statistical frameworks, with downstream functional enrichment using Gene Ontology and pathway databases such as KEGG and Reactome.
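The functional enrichment step can be illustrated with a one-sided hypergeometric (over-representation) test, which is one common way to score a gene set against a list of differentially expressed genes. The project description does not name its enrichment tool, so this is a minimal sketch under that assumption rather than the project's method; the function and parameter names are hypothetical.

```python
from math import comb


def enrichment_p_value(n_universe, n_in_set, n_selected, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    n_universe: genes tested in the experiment (background)
    n_in_set:   background genes annotated to the term or pathway
    n_selected: differentially expressed genes
    n_overlap:  differentially expressed genes annotated to the term
    """
    # The overlap cannot exceed either the gene set or the selected list.
    upper = min(n_in_set, n_selected)
    # Number of ways to draw the selected list from the background.
    total = comb(n_universe, n_selected)
    # Count the draws that contain at least n_overlap annotated genes.
    tail = sum(
        comb(n_in_set, i) * comb(n_universe - n_in_set, n_selected - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / total
```

In practice the resulting p-values would be corrected for multiple testing across all terms tested, and an established enrichment package would normally be used in place of a hand-written function.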
Analyses will run as batch and interactive jobs on the HPC facility, using multi-core CPU processing with moderate memory requirements for alignment, quantification, and matrix-based statistical analyses. Workflows will be managed with shell scripts and scheduled via Slurm to ensure reproducibility and efficient resource utilisation. Version-controlled scripts and modular pipelines will enable transparent re-analysis and future extension of the datasets.
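The per-sample scheduling described above could be expressed as a Slurm array job, with one array task per sample. The sketch below generates such a batch script from Python so that resource requests stay under version control. The file name `samples.txt`, the wrapper `process_sample.sh`, and the default resource values are hypothetical placeholders, not details taken from the project.

```python
from textwrap import dedent


def render_array_job(n_samples, sample_list="samples.txt", cpus=8, mem_gb=32,
                     walltime="04:00:00", job_name="rnaseq"):
    """Return the text of a Slurm array job that handles one sample per task."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    return dedent(f"""\
        #!/bin/bash
        #SBATCH --job-name={job_name}
        #SBATCH --array=1-{n_samples}
        #SBATCH --cpus-per-task={cpus}
        #SBATCH --mem={mem_gb}G
        #SBATCH --time={walltime}
        #SBATCH --output={job_name}_%A_%a.log

        set -euo pipefail

        # Pick the sample whose line number matches this array task.
        SAMPLE=$(sed -n "${{SLURM_ARRAY_TASK_ID}}p" {sample_list})

        # Hypothetical per-sample wrapper; replace with the real pipeline step.
        ./process_sample.sh "$SAMPLE" "$SLURM_CPUS_PER_TASK"
        """)


if __name__ == "__main__":
    # Print a script for 12 samples; redirect to a file and submit with sbatch.
    print(render_array_job(12))
```

Writing the output to a file and submitting it with `sbatch` would launch the tasks independently, so a failed sample can be rerun by array index without repeating the others.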
The HPC facility is required to support parallel processing of multiple samples, large intermediate files generated during transcriptomic analysis, and reproducible execution of computational workflows.