Skip to content

Generate a simple consensus using % identity thresholds and minimum coverage from a SAM/BAM file, treating each read group (RG) separately.

License

Notifications You must be signed in to change notification settings

shengxingou/simple-consensus-per-read-group

 
 

Repository files navigation

usage: simple-consensus-per-read-group [-h] [-t x] [-d n] [--max-depth n] [-a]
                                       [--ignore-read-groups]
                                       [--missing-quality q]
                                       input.bam

Constructs a simple (dumb) consensus sequence for each read group
(RG) in a BAM file.

Base qualities are averaged for each site.  Percent identity
thresholding is done to call a consensus base.  If no base reaches
the threshold, an N is called.

Consensus sequences span only the reference bases between the first
and last covered base.  Multiple reference sequences (such as
chromosomes or contigs) will get separate consensus sequences.

Input must be a BAM file indexed with `samtools index`.  Output is
FastQ.

positional arguments:
  input.bam             SAM/BAM file of aligned reads

optional arguments:
  -h, --help            show this help message and exit
  -t x, --threshold x   Frequency threshold below which a base is not called
                        (default: 0.7)
  -d n, --min-depth n   Read depth below which a base is not called (default:
                        1)
  --max-depth n         Maximum number of reads considered for a single
                        consensus position (default: 8000)
  -a, --ambiguous       Call bases below threshold with ambiguity codes
                        (default: False)
  --ignore-read-groups  Ignore read groups and lump all reads into one
                        consensus (default: False)
  --missing-quality q   Quality score to use for bases missing a quality score
                        (default: 0)

version 1.2.2
Copyright 2015–2018 by Thomas Sibley <[email protected]>
Mullins Lab <https://mullinslab.microbiol.washington.edu>
Department of Microbiology
University of Washington

About

Generate a simple consensus using % identity thresholds and minimum coverage from a SAM/BAM file, treating each read group (RG) separately.

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages

  • Python 82.6%
  • Perl 14.1%
  • Makefile 3.3%