Executing scAVENGERS

Running the whole pipeline

Execution

The whole pipeline is executed by running the command below.

conda activate scavengers
$scAVENGERS_DIRECTORY/scAVENGERS pipeline --configfile config.yaml -j $THREADS
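The command above relies on two shell variables that must be set beforehand. A minimal sketch of how they might be defined follows; the repository path is a hypothetical placeholder, and using every online CPU core for -j is an assumption rather than a recommendation from the scAVENGERS authors.

```shell
# Hypothetical location of the cloned scAVENGERS repository; adjust to your system.
scAVENGERS_DIRECTORY="$HOME/scAVENGERS"

# Number of online CPU cores, falling back to 1 if the count cannot be determined.
THREADS=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

echo "Using $THREADS threads with scAVENGERS at $scAVENGERS_DIRECTORY"
```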

Preparing data

An alignment file in BAM format and a line-separated list of barcodes in a text file are required to run scAVENGERS.

If you are using products from 10X Genomics, you can easily obtain the alignment file and barcodes with Cell Ranger ATAC count (https://github.com/10XGenomics/cellranger-atac). The tool aligns FASTQ files using a bwa mem-like algorithm.

  • The alignments are stored in outs/possorted_bam.bam.

  • The barcodes are stored in outs/filtered_peak_bc_matrix/barcodes.tsv.
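Before starting the pipeline, it can help to confirm that both files exist. The sketch below defines a hypothetical helper, check_cellranger_outs (not part of scAVENGERS or Cell Ranger ATAC), which takes the path to an outs directory, checks the two files listed above, and reports how many barcodes were found.

```shell
# Hypothetical helper: verify that a Cell Ranger ATAC "outs" directory
# contains the two files listed above.
check_cellranger_outs() {
    outs_dir="$1"
    bam="$outs_dir/possorted_bam.bam"
    barcodes="$outs_dir/filtered_peak_bc_matrix/barcodes.tsv"

    for f in "$bam" "$barcodes"; do
        # -s is true only for files that exist and are not empty.
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            return 1
        fi
    done

    # Count non-empty lines, i.e. one barcode per line.
    echo "barcodes: $(grep -c . "$barcodes")"
}

# Example call (the path is a placeholder):
#   check_cellranger_outs /path/to/sample/outs
```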

Configuring

A configuration file is provided as config.yaml in the scAVENGERS repository.

Running the demultiplexing module

The demultiplexing module, scAVENGERS cluster, is executed by running the command below.

conda activate scavengers
$scAVENGERS_DIRECTORY/scAVENGERS cluster -a alt.mtx -r ref.mtx -b barcodes.txt -o clusters.tsv

Preparing data

scAVENGERS cluster requires three inputs: count matrices for the alternative and reference alleles in .mtx format, and a line-separated list of barcodes in a text file.

You can acquire allele count matrices by using VarTrix (https://github.com/10XGenomics/vartrix). Below is a minimal command to get reference and alternative allele counts.

vartrix \
-b alignment.bam \
-v variants.vcf.gz \
--fasta reference.fa \
-c barcodes.txt \
--scoring-method coverage \
--out-matrix alt.mtx --ref-matrix ref.mtx
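Once the matrices are written, a quick consistency check can catch a mismatched barcode list before clustering. The sketch below assumes that the VarTrix matrices have one row per variant and one column per barcode, and that the first line not starting with % is the Matrix Market size line (rows, columns, non-zero entries). The helpers mtx_ncols and check_allele_matrices are hypothetical names, not part of scAVENGERS or VarTrix.

```shell
# Print the column count from the size line of a Matrix Market file,
# i.e. the first line that does not start with '%'.
mtx_ncols() {
    awk '!/^%/ { print $2; exit }' "$1"
}

# Hypothetical helper: succeed only if both matrices have exactly
# one column per barcode in the barcode list.
check_allele_matrices() {
    alt="$1"; ref="$2"; barcodes="$3"
    n_barcodes=$(grep -c . "$barcodes")

    for m in "$alt" "$ref"; do
        n_cols=$(mtx_ncols "$m")
        if [ "$n_cols" != "$n_barcodes" ]; then
            echo "$m has $n_cols columns but $barcodes lists $n_barcodes barcodes" >&2
            return 1
        fi
    done

    echo "ok: $n_barcodes barcodes"
}

# Example call, using the file names from the cluster command:
#   check_allele_matrices alt.mtx ref.mtx barcodes.txt
```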