====== BOLT-LMM ======
> The BOLT-LMM algorithm computes statistics for testing association between phenotype and genotypes using a linear mixed model
Available from:
* https://storage.googleapis.com/broad-alkesgroup-public/BOLT-LMM/BOLT-LMM_manual.html
===== Installing from source =====
The latest source code can be downloaded from https://alkesgroup.broadinstitute.org/BOLT-LMM/downloads/ and the archive **also** includes a pre-built binary. We have found it difficult to get that binary to run on Comet, as it appears to be linked against rather old versions of GLIBC.
Note that if you choose to compile BOLT-LMM from source, it **will not** link against the reference LAPACK 3.12.1. That version of LAPACK is missing several symbols (see [[https://bbs.archlinux.org/viewtopic.php?id=302494|1]] and [[https://gitlab.archlinux.org/archlinux/packaging/packages/lapack/-/issues/3|2]]), a problem which has not been fixed upstream. It __will__ compile and run against LAPACK 3.12.0.
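Since the LAPACK version matters here, it can help to check which version a given ''liblapack.so'' actually resolves to before building. The following is a minimal sketch, not part of BOLT-LMM or LAPACK: the function names are hypothetical, and it assumes the shared library follows the conventional ''liblapack.so.X.Y.Z'' file naming and that GNU ''readlink'' is available.

```shell
#!/bin/bash
# Sketch: read the LAPACK version from the shared library's file name.
# Assumption: the library is installed as liblapack.so.X.Y.Z, usually
# behind one or more symlinks. Function names are illustrative only.

lapack_version() {
    local real
    # Follow symlinks, e.g. liblapack.so -> liblapack.so.3 -> liblapack.so.3.12.0
    real=$(readlink -f "$1") || return 1
    # Print the trailing X.Y.Z; print nothing if the name has no such suffix
    basename "$real" | sed -n 's/^liblapack\.so\.\([0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\)$/\1/p'
}

lapack_usable() {
    local ver
    ver=$(lapack_version "$1")
    case "$ver" in
        "")
            echo "could not determine LAPACK version for $1" >&2
            return 2 ;;
        3.12.1)
            # Per the note above, this release is missing several symbols
            echo "LAPACK $ver: BOLT-LMM will not link against this version" >&2
            return 1 ;;
        *)
            # Only 3.12.0 is reported as working above; others are untested
            echo "LAPACK $ver"
            return 0 ;;
    esac
}
```

For example, ''lapack_usable /PATH/TO/LAPACK/LIBS/lib64/liblapack.so'' prints the detected version, or returns non-zero for 3.12.1 or an unrecognised file name.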
We have been unable to get the binary distribution of BOLT-LMM v2.5 running on Comet - it will segfault at start regardless of what versions of GLIBC, NLOPT or BOOST are loaded.
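To compile on Comet and get a working BOLT-LMM binary, edit the script below and replace the placeholder ''/PATH/TO/LAPACK/LIBS/lib64'' (in the final ''sed'' command) with the real path to the directory containing a working ''libblas.so'' and ''liblapack.so'', or ''liblapack.a'' and ''libblas.a''. Keep each ''/'' in the path escaped as ''\/'', since the path sits inside a ''sed'' expression.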
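Before running the build, a quick check that the chosen directory really contains a complete BLAS/LAPACK pair can save a failed link step. This is a minimal sketch only: ''have_lapack_pair'' is a hypothetical helper name, and it tests for the file names listed above without verifying that the libraries themselves work.

```shell
#!/bin/bash
# Sketch: confirm a directory holds a matching BLAS/LAPACK pair.
# have_lapack_pair is a hypothetical helper, not part of BOLT-LMM.

have_lapack_pair() {
    local dir=$1
    if [ -e "$dir/libblas.so" ] && [ -e "$dir/liblapack.so" ]; then
        # Both shared libraries are present
        echo "shared"
    elif [ -e "$dir/libblas.a" ] && [ -e "$dir/liblapack.a" ]; then
        # Both static archives are present
        echo "static"
    else
        echo "no complete libblas/liblapack pair in $dir" >&2
        return 1
    fi
}
```

Calling ''have_lapack_pair /PATH/TO/LAPACK/LIBS/lib64'' (with the real path substituted) prints ''shared'' or ''static'', or returns non-zero if neither pair is complete.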
#!/bin/bash
echo ""
echo "Load modules ..."
echo "================"
module load GCC/13.3.0
module load NLopt/2.10.0
module load Boost/1.85.0-GCC-13.3.0
# Download source
echo ""
echo "Download BOLT ..."
echo "================="
wget -q https://storage.googleapis.com/broad-alkesgroup-public/BOLT-LMM/downloads/BOLT-LMM_v2.5.tar.gz -O BOLT-LMM_v2.5.tar.gz
# Unpack source archive
echo ""
echo "Untar ..."
echo "========="
rm -rf BOLT-LMM_v2.5
tar -zxf BOLT-LMM_v2.5.tar.gz
# Stop here if the archive did not unpack as expected
cd BOLT-LMM_v2.5/src || exit 1
# Edit Makefile to use GCC and link in additional runtime libraries
echo ""
echo "Configure Makefile ..."
echo "======================"
cat Makefile | \
sed 's/CC = icpc/#CC = icpc/g' | \
sed 's/# CC = g++/CC = g++/g' | \
sed 's/BOOST_INSTALL_DIR =/#BOOST_INSTALL_DIR =/g' | \
sed 's/NLOPT_INSTALL_DIR =/#NLOPT_INSTALL_DIR =/g' | \
sed 's/ZLIB_STATIC_DIR =/#ZLIB_STATIC_DIR =/g' | \
sed 's/LIBSTDCXX_STATIC_DIR =/#LIBSTDCXX_STATIC_DIR =/g' | \
sed 's/GLIBC_STATIC_DIR =/#GLIBC_STATIC_DIR =/g' | \
sed 's/ZSTD_DIR =/#ZSTD_DIR =/g' | \
sed 's/^CPATHS +=/#CPATHS +=/g' | \
sed 's/^LPATHS +=/#LPATHS +=/g' | \
sed 's/LLAPACK = -llapack -lgfortran/#LLAPACK = -llapack -lgfortran/g' | \
sed 's/\$L/$L -lgfortran -L\/PATH\/TO\/LAPACK\/LIBS\/lib64 -llapack -lblas/g' > Makefile.linux
# Compile source using modified Makefile
echo ""
echo "Compile ..."
echo "==========="
make -f Makefile.linux clean
CFLAGS="-O2 -pipe -march=znver4 -msse4 -mavx2" make -f Makefile.linux -j4
There are a lot of unnecessary include and library paths in the Makefile - the script above comments all of those out so that the compiler searches //only// the system include/library paths (set by correctly loading any necessary modules) **and** your installation of LAPACK.
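Because library resolution now depends entirely on the loaded modules and your LAPACK path, it is worth confirming that the finished binary can find all of its shared libraries at run time. The sketch below filters the output of ''ldd'' for unresolved entries; ''unresolved_libs'' is a hypothetical helper name. If you linked against the static ''.a'' archives, LAPACK and BLAS will not appear in the ''ldd'' output at all.

```shell
#!/bin/bash
# Sketch: list shared libraries that the dynamic loader cannot resolve.
# Reads `ldd` output on stdin. unresolved_libs is a hypothetical helper.

unresolved_libs() {
    # ldd reports an unresolved dependency as "<name> => not found"
    awk '$2 == "=>" && $3 == "not" && $4 == "found" { print $1 }'
}
```

With the build modules loaded, ''ldd ./bolt | unresolved_libs'' should print nothing. Any library name it does print is one the loader could not locate, which usually points at a missing module or a LAPACK directory that is not on the library search path.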
===== Testing =====
module load GCC/13.3.0
module load NLopt/2.10.0
module load Boost/1.85.0-GCC-13.3.0
With the same modules loaded as for the build, running ''./bolt -h'' should show the BOLT-LMM usage information:
$ ./bolt -h
+-----------------------------+
| ___ |
| BOLT-LMM, v2.5 /_ / |
| June 21, 2025 /_/ |
| Po-Ru Loh // |
| / |
+-----------------------------+
Copyright (C) 2014-2025 Harvard University.
Distributed under the GNU GPLv3 open source license.
Compiled with USE_SSE: fast aligned memory access
Boost version: 1_85
Command line options:
./bolt -h
Typical options:
-h [ --help ] print help message with typical options
--helpFull print help message with full option list
--bfile arg prefix of PLINK .fam, .bim, .bed files
--bfilegz arg prefix of PLINK .fam.gz, .bim.gz, .bed.gz
files
--fam arg PLINK .fam file (note: file names ending in
.gz are auto-[de]compressed)
--bim arg PLINK .bim file(s); for >1, use multiple
--bim and/or {i:j}, e.g., data.chr{1:22}.bim
--bed arg PLINK .bed file(s); for >1, use multiple
--bim and/or {i:j} expansion
--geneticMapFile arg Oxford-format file for interpolating genetic
distances: tables/genetic_map_hg##.txt.gz
--remove arg file(s) listing individuals to ignore (no
header; FID IID must be first two columns)
--exclude arg file(s) listing SNPs to ignore (no header;
SNP ID must be first column)
--maxMissingPerSnp arg (=0.1) QC filter: max missing rate per SNP
--maxMissingPerIndiv arg (=0.1) QC filter: max missing rate per person
--phenoFile arg phenotype file (header required; FID IID must
be first two columns)
--phenoCol arg phenotype column header
--phenoUseFam use last (6th) column of .fam file as
phenotype
--covarFile arg covariate file (header required; FID IID must
be first two columns)
--covarCol arg categorical covariate column(s); for >1, use
multiple --covarCol and/or {i:j} expansion
--qCovarCol arg quantitative covariate column(s); for >1, use
multiple --qCovarCol and/or {i:j} expansion
--covarUseMissingIndic include samples with missing covariates in
analysis via missing indicator method
(default: ignore such samples)
--reml run variance components analysis to precisely
estimate heritability (but not compute assoc
stats)
--lmm compute assoc stats under the inf model and
with Bayesian non-inf prior (VB approx), if
power gain expected
--lmmInfOnly compute mixed model assoc stats under the
infinitesimal model
--lmmForceNonInf compute non-inf assoc stats even if BOLT-LMM
expects no power gain
--modelSnps arg file(s) listing SNPs to use in model (i.e.,
GRM) (default: use all non-excluded SNPs)
--LDscoresFile arg LD Scores for calibration of Bayesian assoc
stats: tables/LDSCORE.1000G_EUR.tab.gz
--numThreads arg (=1) number of computational threads
--statsFile arg output file for assoc stats at PLINK
genotypes
--dosageFile arg file(s) containing imputed SNP dosages to
test for association (see manual for format)
--dosageFidIidFile arg file listing FIDs and IIDs of samples in
dosageFile(s), one line per sample
--statsFileDosageSnps arg output file for assoc stats at dosage format
genotypes
--impute2FileList arg list of [chr file] pairs containing IMPUTE2
SNP probabilities to test for association
--impute2FidIidFile arg file listing FIDs and IIDs of samples in
IMPUTE2 files, one line per sample
--impute2MinMAF arg (=0) MAF threshold on IMPUTE2 genotypes; lower-MAF
SNPs will be ignored
--bgenFile arg file(s) containing Oxford BGEN-format
genotypes to test for association
--sampleFile arg file containing Oxford sample file
corresponding to BGEN file(s)
--bgenSampleFileList arg list of [bgen sample] file pairs containing
BGEN imputed variants to test for association
--bgenMinMAF arg (=0) MAF threshold on Oxford BGEN-format
genotypes; lower-MAF SNPs will be ignored
--bgenMinINFO arg (=0) INFO threshold on Oxford BGEN-format
genotypes; lower-INFO SNPs will be ignored
--bgenMinMAC arg (=1) minimum MAC threshold (in samples included in
association tests) on BGEN v1.2+ genotypes
--bgenVariantsToTest arg list of bgen variants to test (CHR POS REF
ALT)
--bgenRefFirst set effect allele (ALLELE1) to second allele
in BGEN v1.2+ genotype file
--statsFileBgenSnps arg output file for assoc stats at BGEN-format
genotypes
--statsFileImpute2Snps arg output file for assoc stats at IMPUTE2 format
genotypes
--dosage2FileList arg list of [map dosage] file pairs with 2-dosage
SNP probabilities (Ricopili/plink2 --dosage
format=2) to test for association
--statsFileDosage2Snps arg output file for assoc stats at 2-dosage
format genotypes
$
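The same check can be scripted, for example at the start of a batch job. This is a minimal sketch: ''bolt_smoke_test'' is a hypothetical helper name, and the banner string it searches for is taken from the help output shown above. It does not rely on the exit status of ''bolt -h'', only on the banner text.

```shell
#!/bin/bash
# Sketch: non-interactive smoke test of a freshly built bolt binary.
# bolt_smoke_test is a hypothetical helper, not part of BOLT-LMM.

bolt_smoke_test() {
    local bin=${1:-./bolt}
    local out
    # Capture stdout and stderr; the exit status is deliberately ignored,
    # because only the banner text is checked below
    out=$("$bin" -h 2>&1) || true
    if printf '%s\n' "$out" | grep -q 'BOLT-LMM, v2\.5'; then
        echo "OK: $bin printed the v2.5 banner"
    else
        # A segfault or an unresolved library ends up here
        echo "FAIL: $bin did not print the expected banner" >&2
        return 1
    fi
}
```

Running ''bolt_smoke_test ./bolt'' after loading the modules returns zero only if the banner appears, so a job script can stop early when the binary does not start.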
----
[[:advanced:software|Back to advanced software section]]