This project is active and currently makes use of HPC facilities at Newcastle University.
For further information about this project, please contact:
This PhD research proposes to address these critical gaps by employing machine learning (ML) and deep learning (DL) techniques to develop automated tools for analyzing IVCM images from LSCT patients. The overall goal is to transform complex image data into practical, actionable insights for clinical use, thereby supporting clinical decision-making, personalizing treatment plans, and ultimately improving patient outcomes.
Software & Compute:
1. Software Stack & Frameworks
The project will leverage a Python-based data science and deep learning ecosystem. Key libraries include:
- Deep Learning Frameworks: PyTorch and TensorFlow will serve as the primary engines for training and inference of segmentation models.
- Medical Imaging Toolkits: MONAI (Medical Open Network for AI) will be used for domain-specific transformations and specialized loss functions for 3D/2D medical imaging.
- Segmentation Algorithms: Implementation of StarDist for star-convex object detection, Cellpose for generalized biological cell segmentation, and custom U-Net architectures.
- Image Processing: OpenCV, scikit-image, and SimpleITK for preprocessing (normalization, noise reduction) and classical methods such as SLIC superpixels and watershed segmentation.
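To illustrate the kind of preprocessing these libraries support, the sketch below implements min-max intensity normalization and simple flip augmentations in plain NumPy. It is a minimal sketch only: the synthetic 4x4 frame, function names, and uint8 input are illustrative assumptions, and in practice OpenCV or scikit-image equivalents would be used on real IVCM frames.

```python
import numpy as np

def normalize_intensity(img: np.ndarray) -> np.ndarray:
    """Min-max normalize a grayscale frame to floats in [0, 1]."""
    img = img.astype(np.float64)
    lo, hi = img.min(), img.max()
    if hi == lo:                     # flat image: avoid divide-by-zero
        return np.zeros_like(img)
    return (img - lo) / (hi - lo)

def augment_flips(img: np.ndarray) -> list:
    """Return the original frame plus horizontal and vertical flips."""
    return [img, np.fliplr(img), np.flipud(img)]

# Toy stand-in for one IVCM frame (real frames are high-resolution).
frame = np.arange(16, dtype=np.uint8).reshape(4, 4)
norm = normalize_intensity(frame)
batch = augment_flips(norm)
```

The same normalize-then-augment pattern scales to batch processing: each transform is a pure function of the array, so it can be mapped over an entire image stack.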
2. Processing Methods
The workflow is divided into three computationally intensive stages:
- Preprocessing & Augmentation: Large-scale batch processing of high-resolution IVCM images, involving geometric transformations and intensity normalization to improve model generalization.
- Model Training (GPU Accelerated): We will utilize deep learning models to segment cell boundaries and nuclei. This requires GPU acceleration (CUDA) to handle the iterative backpropagation of gradients across large datasets (incorporating images from all 23 patients).
- Inference & Validation: The optimized models will perform automated segmentation on the full longitudinal dataset. We will conduct comparative analysis across different architectures (e.g., comparing U-Net vs. Cellpose performance) to identify the most robust biomarkers for LSCD recovery.
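One standard way to score competing architectures during the validation stage is the Dice coefficient between a predicted mask and a reference annotation. The plain-NumPy sketch below is illustrative (the toy masks and model names are invented); in a real pipeline a library metric such as MONAI's Dice implementation would be used instead.

```python
import numpy as np

def dice_score(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice coefficient between two binary segmentation masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:                   # both masks empty: treat as perfect
        return 1.0
    return 2.0 * np.logical_and(pred, truth).sum() / denom

# Toy comparison: two hypothetical models' masks vs. one annotation.
truth   = np.array([[1, 1, 0], [0, 1, 0]])
model_a = np.array([[1, 1, 0], [0, 0, 0]])  # misses one pixel
model_b = np.array([[1, 0, 1], [1, 0, 1]])  # largely wrong
scores = {"model_a": dice_score(model_a, truth),
          "model_b": dice_score(model_b, truth)}
```

Ranking architectures by a per-image metric like this, aggregated across the longitudinal dataset, is what allows a head-to-head comparison such as U-Net vs. Cellpose.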
3. Compute Requirements
- Parallelization: We will utilize Slurm array jobs to process multiple patient image sets concurrently, significantly reducing total wall-time.
- Memory & Storage: High-memory nodes (8GB+ per task) are required to handle high-bit-depth IVCM image stacks in memory during training.
- Acceleration: Access to NVIDIA GPU nodes is essential for the StarDist, Cellpose, and MONAI-based training phases, which are computationally infeasible on standard CPUs.
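A common pattern for the Slurm array jobs described above is to map each task's `SLURM_ARRAY_TASK_ID` onto one patient's image set, so that `sbatch --array=0-22` fans the 23 patients out across concurrent tasks. The sketch below assumes a hypothetical directory naming scheme (`patient_01` … `patient_23`); the actual layout would follow the project's data organization.

```python
import os

# Hypothetical layout: one image directory per patient.
PATIENT_IDS = [f"patient_{i:02d}" for i in range(1, 24)]

def task_to_patient(task_id: int) -> str:
    """Map a zero-based Slurm array index onto one patient's image set."""
    if not 0 <= task_id < len(PATIENT_IDS):
        raise ValueError(f"array index {task_id} out of range")
    return PATIENT_IDS[task_id]

# Each array task reads its own index from the environment set by Slurm
# (defaulting to 0 here so the script also runs outside the scheduler).
task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "0"))
patient = task_to_patient(task_id)
```

Because each task touches only its own patient's data, the array runs embarrassingly parallel and total wall-time shrinks roughly with the number of concurrent tasks the scheduler grants.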