HPC Support

ldsc

This software and its user guide are still under development. ldsc is not yet available on Comet.

ldsc is a command line tool for estimating heritability and genetic correlation from GWAS summary statistics. ldsc also computes LD Scores.

The original version of ldsc was available from https://github.com/bulik/ldsc, but it no longer works on modern versions of Python. Do not attempt to use the older package.

The updated version of ldsc for Python 3.9+ is based on https://github.com/CBIIT/ldsc, although this fork is also updated infrequently.


Running ldsc on Comet

The ldsc tool is closely tied to specific versions of Python (3.9 or later, but earlier than 3.12) and also requires a number of Python modules (numpy, scipy, matplotlib, etc.). Rather than ask every user to install these versions themselves, we have installed ldsc.py on Comet as a small, custom Apptainer container image.

The ldsc container is stored in the /nobackup/shared/containers directory and is accessible to all users of Comet. You do not need to take a copy of the container file; it should be left in its original location.

You can find the container files here:

  • /nobackup/shared/containers/ldsc.2026.03.sif

We normally recommend using the latest version of the container. In the case of ldsc, the version number represents the date the container image was created (normally with the version of the ldsc tool that was current on GitHub at that time).

Container Image Versions

We may reference a specific container file, such as ldsc.2026.03, but you should always check whether this is the most recent version of the container available. Simply ls the /nobackup/shared/containers directory and you will be able to see if there are any newer versions listed.
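The check described above can also be scripted. The following is a minimal sketch rather than part of the Comet tooling: the function name latest_ldsc_image is hypothetical, and it assumes that image files keep the ldsc.YYYY.MM.sif naming pattern, so that sorting by name is the same as sorting by date.

```shell
# Hypothetical helper: print the newest ldsc image found in a directory.
# Assumes images are named ldsc.YYYY.MM.sif, so that the shell's sorted
# glob expansion lists them oldest first.
latest_ldsc_image() {
    local dir="${1:-/nobackup/shared/containers}" f newest=""
    for f in "$dir"/ldsc.*.sif; do
        [ -e "$f" ] || continue   # pattern matched nothing
        newest="$f"
    done
    if [ -n "$newest" ]; then
        printf '%s\n' "$newest"
    else
        return 1
    fi
}
```

Called with no argument, the function looks in /nobackup/shared/containers. It returns a non-zero status if no matching image is found.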

We have provided a convenience script that automates all of the steps needed to run applications inside the container, including access to your $HOME, /scratch and /nobackup directories, and reduces them to two simple commands.

  • /nobackup/shared/containers/ldsc.2026.03.sh

There is a corresponding .sh script for each version of the container image we make available.

Source this file and it will load apptainer, set up your bind directories and wrap the exec call for you. It gives you a single command, container.run, which replaces the long apptainer exec command and runs anything you want inside the container.


Simple use of ldsc

All of the ldsc commands are installed in the $PATH and can be called from inside the container without giving their full path or prefixing them with python.

The following ldsc commands are available:

  • ldsc
  • munge_sumstats
  • make_annot

As an example, to run ldsc, simply call it with the container.run helper as follows:

$ source /nobackup/shared/containers/ldsc.2026.03.sh
$ container.run ldsc -h
usage: ldsc.py [-h] [--out OUT] [--bfile BFILE] [--l2] [--extract EXTRACT] [--keep KEEP] [--ld-wind-snps LD_WIND_SNPS] [--ld-wind-kb LD_WIND_KB]
               [--ld-wind-cm LD_WIND_CM] [--print-snps PRINT_SNPS] [--annot ANNOT] [--thin-annot] [--cts-bin CTS_BIN] [--cts-breaks CTS_BREAKS]
               [--cts-names CTS_NAMES] [--per-allele] [--pq-exp PQ_EXP] [--no-print-annot] [--maf MAF] [--h2 H2] [--h2-cts H2_CTS] [--rg RG]
               [--ref-ld REF_LD] [--ref-ld-chr REF_LD_CHR] [--w-ld W_LD] [--w-ld-chr W_LD_CHR] [--overlap-annot] [--print-coefficients]
               [--frqfile FRQFILE] [--frqfile-chr FRQFILE_CHR] [--no-intercept] [--intercept-h2 INTERCEPT_H2] [--intercept-gencov INTERCEPT_GENCOV]
               [--M M] [--two-step TWO_STEP] [--chisq-max CHISQ_MAX] [--ref-ld-chr-cts REF_LD_CHR_CTS] [--print-all-cts] [--print-cov]
               [--print-delete-vals] [--chunk-size CHUNK_SIZE] [--pickle] [--yes-really] [--invert-anyway] [--n-blocks N_BLOCKS] [--not-M-5-50]
               [--return-silly-things] [--no-check-alleles] [--samp-prev SAMP_PREV] [--pop-prev POP_PREV]

options:
  -h, --help            show this help message and exit
...
$
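The other commands are called in the same way. As a minimal sketch, the wrapper below runs munge_sumstats with the --sumstats and --out options that appear in the build recipe further down this page. The function name munge_to_sumstats and the LDSC_RUNNER variable are hypothetical, and real input files may need further options that are not shown here.

```shell
# Hypothetical wrapper: convert a raw summary statistics file with munge_sumstats.
# LDSC_RUNNER is an illustrative variable: it defaults to container.run,
# and can be set to "echo" to print the command instead of running it.
munge_to_sumstats() {
    local raw="$1" out_prefix="$2"
    ${LDSC_RUNNER:-container.run} munge_sumstats \
        --sumstats "$raw" \
        --out "$out_prefix"
}
```

The helper script must already have been sourced so that container.run exists. The second argument is an output prefix, not a complete file name.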


Data used by ldsc

The sample data files used in the Basic Usage Example from the developers' GitHub page have already been downloaded and are available in the shared data area of the /nobackup filesystem on Comet:

  • /nobackup/shared/data/ldsc
  • /nobackup/shared/data/ldsc/eas_ldscores/
  • /nobackup/shared/data/ldsc/sumstats/BBJ_HDLC22.sumstats.gz
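Before starting a run, it can be worth confirming that an LD Score directory is complete. The following is a minimal sketch under a stated assumption: that the directory follows the usual ldsc layout of one <chr>.l2.ldscore.gz file for each of chromosomes 1 to 22. The function name missing_ldscore_chroms is hypothetical.

```shell
# Hypothetical pre-flight check: report chromosomes (1-22) with no LD Score file.
# Assumes the layout <dir>/<chr>.l2.ldscore.gz, one file per chromosome.
missing_ldscore_chroms() {
    local dir="${1%/}" chr missing=""
    for chr in $(seq 1 22); do
        [ -e "$dir/$chr.l2.ldscore.gz" ] || missing="$missing $chr"
    done
    # Print the missing chromosome numbers, space-separated (empty if none)
    printf '%s\n' "${missing# }"
}
```

An empty result means that a file was found for every chromosome.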

If you have a data set used in ldsc that would be useful to others, please Contact us and we will arrange to have it moved to the shared data area, where you can continue to access it, share it with others, and have it excluded from the Data retention policies.


Running the ldsc example

You can follow the Basic Usage Example from the ldsc GitHub page as follows:

$ source /nobackup/shared/containers/ldsc.2026.03.sh
$ container.run ldsc \
    --h2 /nobackup/shared/data/ldsc/sumstats/BBJ_HDLC22.sumstats.gz \
    --ref-ld-chr /nobackup/shared/data/ldsc/eas_ldscores/ \
    --w-ld-chr /nobackup/shared/data/ldsc/eas_ldscores/

You should see output similar to the following:

*********************************************************************
* LD Score Regression (LDSC)
* Version 3.0.1
* (C) 2014-2019 Brendan Bulik-Sullivan and Hilary Finucane
* Broad Institute of MIT and Harvard / MIT Department of Mathematics
* GNU General Public License v3
*********************************************************************
Call: 
./ldsc.py \
--h2 /nobackup/shared/data/ldsc/sumstats/BBJ_HDLC22.sumstats.gz \
--ref-ld-chr /nobackup/shared/data/ldsc/eas_ldscores/ \
--w-ld-chr /nobackup/shared/data/ldsc/eas_ldscores/ 

Beginning analysis at Wed Mar 25 12:00:59 2026
Reading summary statistics from /nobackup/shared/data/ldsc/sumstats/BBJ_HDLC22.sumstats.gz ...
RuntimeWarning: compression has no effect when passing a non-binary object as input for file /nobackup/shared/data/ldsc/sumstats/BBJ_HDLC22.sumstats.gz
Read summary statistics for 61663 SNPs.
Reading reference panel LD Score from /nobackup/shared/data/ldsc/eas_ldscores/[1-22] ... (ldscore_fromlist)
Read reference panel LD Scores for 1208050 SNPs.
Removing partitioned LD Scores with zero variance.
Reading regression weight LD Score from /nobackup/shared/data/ldsc/eas_ldscores/[1-22] ... (ldscore_fromlist)
Read regression weight LD Scores for 1208050 SNPs.
After merging with reference panel LD, 14193 SNPs remain.
After merging with regression SNP LD, 14193 SNPs remain.
WARNING: number of SNPs less than 200k; this is almost always bad.
Using two-step estimator with cutoff at 30.
Total Observed scale h2: 0.2747 (0.1014)
Lambda GC: 1.2103
Mean Chi^2: 1.2826
Intercept: 0.9211 (0.051)
Ratio < 0 (usually indicates GC correction).
Analysis finished at Wed Mar 25 12:01:01 2026
Total time elapsed: 2.12s
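If you want to collect the heritability estimate from many runs, the relevant line can be picked out of the output with standard tools. The following is a minimal sketch that matches the "Total Observed scale h2" line exactly as it appears in the output above. The function name extract_h2 is hypothetical.

```shell
# Hypothetical helper: print the h2 estimate and the value in parentheses
# (its standard error) from ldsc output supplied on standard input.
extract_h2() {
    sed -n 's/^Total Observed scale h2: \([^ ]*\) (\([^)]*\)).*$/\1 \2/p'
}
```

Pipe the ldsc output, or a saved copy of it, into the function. It prints nothing if the line is absent.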

Note the following:

  • We used the /nobackup/shared/data/ldsc prefix for the already downloaded and processed data sets used in the example.
  • The directories given to the --ref-ld-chr and --w-ld-chr arguments must end with a trailing slash (/); a bug in ldsc means these paths are not handled correctly when the final slash is missing.
  • If the output location is not set explicitly, the ldsc log/results file is written to your $HOME.
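The last two notes can be handled by a small wrapper. The following is a minimal sketch, not a supported tool: it appends the trailing slash when it is missing and passes an explicit output prefix with the --out option listed in the ldsc usage message. The names with_trailing_slash, run_h2 and LDSC_RUNNER are hypothetical.

```shell
# Hypothetical helper: append "/" to a path unless it already ends with one.
with_trailing_slash() {
    printf '%s/\n' "${1%/}"
}

# Hypothetical wrapper around the example above.
# LDSC_RUNNER is an illustrative variable: it defaults to container.run,
# and can be set to "echo" to print the command instead of running it.
run_h2() {
    local sumstats="$1" ld_dir out_prefix="$3"
    ld_dir=$(with_trailing_slash "$2")
    ${LDSC_RUNNER:-container.run} ldsc \
        --h2 "$sumstats" \
        --ref-ld-chr "$ld_dir" \
        --w-ld-chr "$ld_dir" \
        --out "$out_prefix"
}
```

The third argument is an output prefix. Choose a location under $HOME, /scratch or /nobackup so that it is writable from inside the container.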

Accessing Data

As long as you use the container.run method to launch the applications, you will automatically be able to read and write to files in your $HOME, /scratch and /nobackup directories. This means that you can refer to the downloaded data sets under /nobackup/shared/data/ldsc if necessary.

If you run any of the applications inside the container manually, without using the container.run helper, you will need to pass the --bind argument to apptainer to ensure that all relevant directories are exposed within the container.
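A manual launch might look like the following. This is a minimal sketch and not the contents of the helper script: the function name apptainer_ldsc and the LDSC_IMAGE variable are hypothetical, the image path is the one given earlier on this page, and apptainer must already be loaded (module load apptainer).

```shell
# Hypothetical manual launcher; container.run remains the recommended route.
apptainer_ldsc() {
    local image="${LDSC_IMAGE:-/nobackup/shared/containers/ldsc.2026.03.sif}"
    # Expose $HOME, /scratch and /nobackup inside the container
    apptainer exec --bind "$HOME,/scratch,/nobackup" "$image" "$@"
}
```

For example, apptainer_ldsc ldsc -h would print the ldsc usage message.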

Do remember that the container filesystem itself cannot be changed, so you will not be able to write to or update /usr/local, /opt, /etc or any other internal folders. Keep output directories restricted to the three areas listed above.
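A script can guard against a badly chosen output location before launching a long run. The following is a minimal sketch: the function name in_writable_area is hypothetical, it expects an absolute path, and it only compares the path against the three areas named above without checking file permissions.

```shell
# Hypothetical check: succeed only if an absolute path lies under
# $HOME, /scratch or /nobackup.
in_writable_area() {
    case "$1" in
        "$HOME"/*|/scratch/*|/nobackup/*) return 0 ;;
        *) return 1 ;;
    esac
}
```

It could be used as in_writable_area "$out_prefix" || echo "choose another output location", where out_prefix is whatever variable your own script uses.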


Building ldsc for Comet

Important!

This section is only for RSE HPC admin staff, or users who wish to understand how the ldsc container was built. If you are only interested in using ldsc, stop reading now.

Build script

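#!/bin/bash
# Date suffix for the image name, e.g. 2026.03
IMAGE_DATE=$(date +%Y.%m)

echo "Loading modules..."
module load apptainer

echo ""
echo "Building container..."
# Use /scratch for temporary build files
export APPTAINER_TMPDIR=/scratch

echo ""
echo "Container will have date suffix $IMAGE_DATE"

# Bind the current directory to /mnt during the build and keep a log
SOURCE_DIR=$(pwd)
apptainer build --bind "$SOURCE_DIR:/mnt" "ldsc.$IMAGE_DATE.sif" ldsc.def 2>&1 | tee ldsc.log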

Container definition

Bootstrap: docker
From: ubuntu:jammy

####################################################################
#
# ldsc Container
# ==================
# This is a runtime environment for ldsc: https://github.com/cbiit/ldsc
# Please see: 
#	https://hpc.researchcomputing.ncl.ac.uk/dokuwiki/dokuwiki/doku.php?id=advanced:software:ldsc
#
####################################################################

%post
    # Prevent interactive prompts
    export DEBIAN_FRONTEND=noninteractive

####################################################################
#
# Basic system packages
#
####################################################################

	# Update & install only necessary packages
	apt-get update
	apt-get install -y aptitude wget unzip python3 python3-pip
	ln -sf /usr/bin/python3 /usr/bin/python
	
	# Clean up APT cache to save space
	apt-get clean 

	# Any Python modules installed via pip go here
	# pip install NAME --break-system-packages

	# Remove any Python cache files after pip
	pip cache purge

#################################################################################
#
# This is all the custom stuff needed to build ldsc
#
#################################################################################

	# Src and opt 
	mkdir -p /src/zipped
	mkdir -p /opt/bin
	mkdir -p /opt/data

	echo ""
	echo "INSTALL LDSC"
	echo "============"
	echo ""
	cd /src
	wget -q https://github.com/CBIIT/ldsc/archive/refs/heads/main.zip -O /src/zipped/ldsc.zip
	cd /opt
	unzip /src/zipped/ldsc.zip
	mv ldsc-main ldsc
	cd ldsc

	echo ""
	echo "INSTALL PYTHON MODULES"
	echo "======================"
	echo ""
	# Patch numpy version
	cp requirements.txt requirements.txt.old
	cat requirements.txt.old | grep -v ^numpy > requirements.txt
	echo "numpy==1.22.4" >> requirements.txt
	echo "matplotlib" >> requirements.txt

	# Install requirements
	pip install -r requirements.txt

	# Remove anything not needed to run
	rm -f dockerfile environment* setup.py requirements.txt	
	
	echo ""
	echo "DOWNLOAD REFERENCE DATA"
	echo "======================="
	# Download reference data
	wget -q https://ldlink.nih.gov/LDlinkRestWeb/copy_and_download/BBJ_HDLC22.txt -O /src/zipped/BBJ_HDLC22.txt
	python munge_sumstats.py --sumstats /src/zipped/BBJ_HDLC22.txt --out /opt/data/BBJ_HDLC22
	rm -f /opt/data/BBJ_HDLC22.log
	
	wget -q "https://drive.usercontent.google.com/u/0/uc?id=1BtpWx02ON33KfjyCFSdmoWYlMZWImh2f&export=download" -O /src/zipped/eas_ldscores.tar.bz2
	cd /opt/data
	tar -jxf /src/zipped/eas_ldscores.tar.bz2

	# Remove all src packages
	echo ""
	echo "FINAL CLEAN UP"
	echo "=============="
	echo ""
	cd /
	rm -rf /src
	pip cache purge

%environment
	export PATH=/opt/ldsc:$PATH

%runscript

Helper script


Developed and operated by
Research Software Engineering
Copyright © Newcastle University
Contact us @rseteam