biohazard-1.0.0: bioinformatics support library

Safe HaskellNone
LanguageHaskell2010

Bio.Bam.Header

Synopsis

Documentation

newtype BamKey Source #

Exactly two characters, for the "named" fields in bam.

Constructors

BamKey Word16 

data BamSQ Source #

Constructors

BamSQ 

Instances

Eq BamSQ Source # 

Methods

(==) :: BamSQ -> BamSQ -> Bool #

(/=) :: BamSQ -> BamSQ -> Bool #

Show BamSQ Source # 

Methods

showsPrec :: Int -> BamSQ -> ShowS #

show :: BamSQ -> String #

showList :: [BamSQ] -> ShowS #

data BamSorting Source #

Possible sorting orders from bam header. Thanks to samtools, which doesn't declare sorted files properly, we have to have the stupid Unknown state, too.

newtype Refseq Source #

Reference sequence in Bam Bam enumerates the reference sequences and then sorts by index. We need to track that index if we want to reproduce the sorting order.

Constructors

Refseq 

Fields

invalidRefseq :: Refseq Source #

The invalid Refseq. Bam uses this value to encode a missing reference sequence.

isValidRefseq :: Refseq -> Bool Source #

Tests whether a reference sequence is valid. Returns true unless the the argument equals invalidRefseq.

invalidPos :: Int Source #

The invalid position. Bam uses this value to encode a missing position.

isValidPos :: Int -> Bool Source #

Tests whether a position is valid. Returns true unless the the argument equals invalidPos.

type Refs = Seq BamSQ Source #

A list of reference sequences.

noRefs :: Refs Source #

The empty list of references. Needed for BAM files that don't really store alignments.

compareNames :: Seqid -> Seqid -> Ordering Source #

Compares two sequence names the way samtools does. samtools sorts by "strnum_cmp":

  • if both strings start with a digit, parse the initial sequence of digits and compare numerically, if equal, continue behind the numbers
  • else compare the first characters (possibly NUL), if equal continue behind them
  • else both strings ended and the shorter one counts as smaller (and that part is stupid)

distinctBin :: Int -> Int -> Int Source #

Computes the "distinct bin" according to the BAM binning scheme. If an alignment starts at pos and its CIGAR implies a length of len on the reference, then it goes into bin distinctBin pos len.

data MdOp Source #

Instances

showMd :: [MdOp] -> Bytes Source #

Normalizes a series of MdOps and encodes them in the way BAM and SAM expect it.