Single Cell RNA Intron-Exon Counting
scrinvex counts intronic, exonic, and junction-spanning reads for each unique barcode encountered in the input BAM. Each mapped read is checked against the input GTF to determine whether it lies entirely within introns, entirely within exons, or crosses at least one intron/exon junction. Reads with the same UMI are only checked against any given gene once: later reads with that UMI are not checked against any gene that the first read intersected.
$ git clone --recursive https://github.com/getzlab/scrinvex.git scrinvex
$ cd scrinvex
Unfortunately, the current version of the scrinvex code (dated May 7, 2020) has a few errors and won't compile cleanly. The header file rle.h, which can be found in these two locations:
rnaseqc/SeqLib/bwa/rle.h
rnaseqc/SeqLib/fermi-lite/rle.h
needs to be amended to remove the duplicate definition of an array data structure (rle_auxtab[8]). You have two options available to fix this:
Option 1: You can open up both of those header files in a text editor (vim, nano) and comment out line 33:
const uint8_t rle_auxtab[8];
Comment it out with two forward slashes:
//const uint8_t rle_auxtab[8];
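The same edit can also be scripted instead of made by hand. The following is a minimal sketch, assuming GNU sed (for the -i in-place flag) and that it is run from the top of the scrinvex directory. The function name comment_out_auxtab is only an illustrative choice. It matches the definition by its text rather than by line number:

```shell
# Comment out the rle_auxtab definition in the given header files.
# Assumes GNU sed; -i edits each file in place.
comment_out_auxtab() {
    # '&' re-inserts the matched line, so the result is "//const uint8_t rle_auxtab[8];"
    sed -i 's|^const uint8_t rle_auxtab\[8\];|//&|' "$@"
}

# Only act when both headers are present, i.e. when run from the scrinvex directory.
if [ -f rnaseqc/SeqLib/bwa/rle.h ] && [ -f rnaseqc/SeqLib/fermi-lite/rle.h ]; then
    comment_out_auxtab rnaseqc/SeqLib/bwa/rle.h rnaseqc/SeqLib/fermi-lite/rle.h
fi
```

Because the pattern only matches the uncommented line, running it a second time leaves the files unchanged.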
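Option 2: For more advanced users, a pair of patches (one per header file) is given below. Saved as bwa_rle_patch.diff and fermilite_rle.diff, they can be applied from the top of the scrinvex directory using patch -p0 < bwa_rle_patch.diff and patch -p0 < fermilite_rle.diff. Use -p0 rather than -p1 because the paths in the patches are already relative to that directory: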
bwa_rle_patch.diff:
--- rnaseqc/SeqLib/bwa/rle.h 2025-03-19 10:38:56.280816032 +0000
+++ rnaseqc/SeqLib/bwa/rle.h.new 2025-03-19 10:38:49.667778873 +0000
@@ -30,7 +30,7 @@
  *** 43+3 codec ***
  ******************/
 
-const uint8_t rle_auxtab[8];
+//const uint8_t rle_auxtab[8];
 
 #define RLE_MIN_SPACE 18
 #define rle_nptr(block) ((uint16_t*)(block))
fermilite_rle.diff:
--- rnaseqc/SeqLib/fermi-lite/rle.h 2025-03-19 10:42:14.703929146 +0000
+++ rnaseqc/SeqLib/fermi-lite/rle.h.new 2025-03-19 10:34:51.323440201 +0000
@@ -30,7 +30,7 @@
  *** 43+3 codec ***
  ******************/
 
-const uint8_t rle_auxtab[8];
+//const uint8_t rle_auxtab[8];
 
 #define RLE_MIN_SPACE 18
 #define rle_nptr(block) ((uint16_t*)(block))
After editing both of the rle.h header files, you can complete the installation:
$ module load GCC
$ module load Boost
$ cd scrinvex
$ make
...
...
...
...
$ ls -l scrinvex
-rwx------ 1 n12345 rocketloginaccess 4011608 Mar 19 10:38 scrinvex
$ file scrinvex
scrinvex: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, not stripped
The scrinvex binary can be copied anywhere you like and run normally.
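Since the file output above reports the binary as dynamically linked, it still depends on its shared libraries being found wherever it is run. A minimal sketch of a check, assuming the standard ldd utility is available; the function name missing_libs is only an illustrative choice:

```shell
# Print any shared libraries the binary needs but the loader cannot find.
# Returns 0 if at least one library is unresolved, non-zero otherwise.
missing_libs() {
    # $1: path to a dynamically linked executable
    ldd "$1" 2>/dev/null | grep 'not found'
}

# Only check when a built scrinvex binary is present in the current directory.
if [ -x ./scrinvex ]; then
    if missing_libs ./scrinvex; then
        echo "Unresolved libraries: load the GCC and Boost modules and try again"
    else
        echo "All shared libraries resolved"
    fi
fi
```

If any lines are printed, loading the GCC and Boost modules in the current shell would be the first thing to try.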
Note: if you do not load the Boost module, you will see an error similar to this when attempting to compile:
$ make
g++ -Wall -std=c++14 -D_GLIBCXX_USE_CXX11_ABI=1 -O3 -I. -Irnaseqc -Irnaseqc/src -Irnaseqc/SeqLib -Irnaseqc/SeqLib/htslib/ -c -o src/scrinvex.o src/scrinvex.cpp
src/scrinvex.cpp:13:10: fatal error: boost/filesystem.hpp: No such file or directory
13 | #include <boost/filesystem.hpp>
| ^~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
make: *** [src/scrinvex.o] Error 1
$
Make sure that you have loaded both GCC and Boost before running make.
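A quick pre-flight check can catch this before a long build starts. The following is a sketch only: it asks g++ to preprocess a one-line source file that includes the same header named in the error above. The function name have_boost_headers is an illustrative choice, not part of scrinvex:

```shell
# Succeeds only if g++ is on the PATH and can find boost/filesystem.hpp.
have_boost_headers() {
    command -v g++ >/dev/null 2>&1 || return 1
    # -E stops after preprocessing; it fails if the header cannot be found.
    echo '#include <boost/filesystem.hpp>' | g++ -E -x c++ - >/dev/null 2>&1
}

if have_boost_headers; then
    echo "g++ and the Boost headers were found"
else
    echo "g++ or the Boost headers are missing: try 'module load GCC' and 'module load Boost'"
fi
```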
If you do not edit the rle.h header files, you will see an error similar to this when attempting to compile:
$ make
gcc -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS bwashm.o bwase.o bwaseqio.o bwtgap.o bwtaln.o bamlite.o bwape.o kopen.o pemerge.o maxk.o bwtsw2_core.o bwtsw2_main.o bwtsw2_aux.o bwt_lite.o bwtsw2_chain.o fastmap.o bwtsw2_pair.o main.o -o bwa -L. -lbwa -lm -lz -lpthread -lrt
/mnt/storage/apps/eb/software/binutils/2.40-GCCcore-12.3.0/bin/ld: ./libbwa.a(rope.o):scrinvex/rnaseqc/SeqLib/bwa/rle.h:33: multiple definition of `rle_auxtab'; ./libbwa.a(bwtindex.o):scrinvex/rnaseqc/SeqLib/bwa/rle.h:33: first defined here
/mnt/storage/apps/eb/software/binutils/2.40-GCCcore-12.3.0/bin/ld: ./libbwa.a(rle.o):scrinvex/rnaseqc/SeqLib/bwa/rle.h:33: multiple definition of `rle_auxtab'; ./libbwa.a(bwtindex.o):scrinvex/rnaseqc/SeqLib/bwa/rle.h:33: first defined here
collect2: error: ld returned 1 exit status
make[3]: *** [bwa] Error 1
make[3]: Leaving directory `scrinvex/rnaseqc/SeqLib/bwa'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `scrinvex/rnaseqc/SeqLib'
make[1]: *** [all] Error 2
make[1]: Leaving directory `scrinvex/rnaseqc/SeqLib'
make: *** [rnaseqc/SeqLib/lib/libseqlib.a] Error 2
$
Make sure you have edited both rle.h header files using one of the two options detailed above.
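To confirm that both headers were actually changed, you can search them for the uncommented definition. A minimal sketch, assuming it is run from the top of the scrinvex directory; the function name check_rle_headers is only an illustrative choice:

```shell
# Report any header that still contains the uncommented definition.
# Returns 0 if all given files look edited, 1 otherwise.
check_rle_headers() {
    # -H prints the file name and -n the line number of each remaining definition.
    if grep -Hn '^const uint8_t rle_auxtab\[8\];' "$@"; then
        echo "The header(s) listed above still need editing" >&2
        return 1
    fi
    echo "No uncommented rle_auxtab definition found"
}

# Only run the check when the SeqLib sources are present.
if [ -d rnaseqc/SeqLib ]; then
    check_rle_headers rnaseqc/SeqLib/bwa/rle.h rnaseqc/SeqLib/fermi-lite/rle.h
fi
```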
You always need to module load Boost and module load GCC before running scrinvex so that the right GCC and Boost runtime libraries are loaded; beyond that, it appears to have no special requirements.
$ module load GCC
$ module load Boost
$ ./scrinvex --help
./scrinvex [gtf] [bam] {OPTIONS}
SCRINVEX - A Single Cell RNA-Seq QC tool
OPTIONS:
-h, --help Display this message and quit
gtf The input GTF file containing features
to check the bam against
bam The input SAM/BAM file containing reads
to process
-o[ouput], --output=[ouput] Path to output file. Default: {current
directory}/{bam filename}.scrinvex.tsv
-b[barcodes],
--barcodes=[barcodes] Path to filtered barcodes.tsv file from
cellranger. Only barcodes listed in the
file will be used. Default: All barcodes
present in bam
-q[quality], --quality=[quality] Set the lower bound on read quality for
coverage counting. Reads below this
quality are skipped. Default: 255
-s[path], --summary=[path] Produce a summary of counts by barcode
in a separate file. This includes a
count of intergenic reads. If the flag
is provided with no arguments, this
defaults to {current directory}/{bam
filename}.scrinvex.summary.tsv. You may
provide a different path as an argument
to this flag
"--" can be used to terminate flag options and force all following
arguments to be treated as positional options
$
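Putting the options from the help text together, a typical run might look like the sketch below. The input file names are hypothetical placeholders, and the run_scrinvex function and SCRINVEX variable are illustrative conventions rather than part of the tool. The options are written in the --flag=value form shown in the help output:

```shell
# Run scrinvex on one sample, restricted to a filtered barcode list,
# writing both the main table and the per-barcode summary.
run_scrinvex() {
    # $1: GTF file, $2: BAM file, $3: barcodes.tsv, $4: output prefix
    # SCRINVEX (hypothetical) can point at the binary; defaults to ./scrinvex
    "${SCRINVEX:-./scrinvex}" "$1" "$2" \
        --barcodes="$3" \
        --output="$4.scrinvex.tsv" \
        --summary="$4.scrinvex.summary.tsv"
}

# Example call with placeholder file names (load the GCC and Boost modules first):
# run_scrinvex genes.gtf sample.bam barcodes.tsv sample
```

Leaving out --quality keeps the default lower bound of 255 stated in the help text.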