biohazard-0.6.2: bioinformatics support library

Safe Haskell: None




newtype BamKey

Exactly two characters, identifying the "named" fields in BAM.


BamKey Word16 
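As an illustration of how two characters can fit into the single Word16 of the constructor, here is a minimal sketch. The names Key, packKey and unpackKey are hypothetical and not part of the library; the byte order (first character in the low byte) and the restriction to 8-bit characters are assumptions.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word16)

-- Hypothetical stand-in for BamKey; not the library's type.
newtype Key = Key Word16 deriving (Eq, Ord, Show)

-- Pack two 8-bit characters into one Word16.
-- Assumption: the first character occupies the low byte.
packKey :: Char -> Char -> Key
packKey a b = Key (lo .|. (hi `shiftL` 8))
  where
    lo = fromIntegral (ord a) .&. 0xff
    hi = fromIntegral (ord b) .&. 0xff

-- Recover the two characters from the packed word.
unpackKey :: Key -> (Char, Char)
unpackKey (Key w) =
  ( chr (fromIntegral (w .&. 0xff))
  , chr (fromIntegral (w `shiftR` 8)) )
```

A fixed-width key like this compares and hashes as a single machine word, which is presumably why a Word16 is used instead of a two-character string.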

data BamSQ




data BamSorting

Possible sorting orders from the BAM header. Because samtools doesn't declare sorted files properly, we also need the Unknown state.

newtype Refseq

Reference sequence in BAM. BAM enumerates the reference sequences and then sorts by index. We need to track that index if we want to reproduce the sorting order.




unRefseq :: Word32

invalidRefseq :: Refseq

The invalid Refseq. Bam uses this value to encode a missing reference sequence.

isValidRefseq :: Refseq -> Bool

Tests whether a reference sequence is valid. Returns True unless the argument equals invalidRefseq.

invalidPos :: Int

The invalid position. Bam uses this value to encode a missing position.

isValidPos :: Int -> Bool

Tests whether a position is valid. Returns True unless the argument equals invalidPos.
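The sentinel pattern used for missing reference sequences and positions can be sketched as follows. The BAM format stores -1 for a missing reference ID and for a missing position; that the library represents these as maxBound :: Word32 and -1 respectively is an assumption, and all names below are hypothetical stand-ins.

```haskell
import Data.Word (Word32)

-- Hypothetical stand-in for Refseq; not the library's type.
newtype Ref = Ref Word32 deriving (Eq, Ord, Show)

-- Assumption: the signed 32-bit value -1 from the file, viewed as a
-- Word32, is maxBound, and that is the value used as the sentinel.
invalidRef :: Ref
invalidRef = Ref maxBound

isValidRef :: Ref -> Bool
isValidRef = (/= invalidRef)

-- Assumption: a missing position is encoded as -1.
missingPos :: Int
missingPos = -1

isValidPosition :: Int -> Bool
isValidPosition = (/= missingPos)
```

With an unsigned sentinel equal to maxBound, records without a reference sequence sort after all records that have one, which matches the usual placement of unaligned reads at the end of a coordinate-sorted file.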

type Refs = Seq BamSQ

A list of reference sequences.

noRefs :: Refs

The empty list of references. Needed for BAM files that don't really store alignments.

compareNames :: Seqid -> Seqid -> Ordering

Compares two sequence names the way samtools does. samtools sorts by "strnum_cmp":

- if both strings start with a digit, parse the initial sequence of digits and compare numerically; if equal, continue behind the numbers
- else compare the first characters (possibly NUL); if equal, continue behind them
- else both strings ended and the shorter one counts as smaller (and that part is stupid)
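The comparison rules described for compareNames can be sketched on plain Strings. This is an illustration of the described rules, not the library's implementation: strnumCompare is a hypothetical name, Seqid is replaced by String, and an exhausted string plays the role of the terminating NUL.

```haskell
import Data.Char (isDigit)
import Data.Ord (comparing)

-- Compare two names: digit runs numerically, everything else
-- character by character.
strnumCompare :: String -> String -> Ordering
strnumCompare [] [] = EQ
strnumCompare [] _  = LT   -- exhausted string acts like NUL: smaller
strnumCompare _  [] = GT
strnumCompare xs@(x : xs') ys@(y : ys')
  | isDigit x && isDigit y =
      -- Both start with a digit: compare the leading numbers.
      let (dx, rx) = span isDigit xs
          (dy, ry) = span isDigit ys
      in case comparing toNumber dx dy of
           EQ -> strnumCompare rx ry   -- continue behind the numbers
           o  -> o
  | otherwise =
      case compare x y of
        EQ -> strnumCompare xs' ys'    -- continue behind the characters
        o  -> o
  where
    -- Integer avoids overflow on very long digit runs.
    toNumber :: String -> Integer
    toNumber = read
```

Under these rules "chr2" sorts before "chr10", whereas a plain lexicographic comparison would put "chr10" first.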

distinctBin :: Int -> Int -> Int

Computes the "distinct bin" according to the BAM binning scheme. If an alignment starts at pos and its CIGAR implies a length of len on the reference, then it goes into bin distinctBin pos len.
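For reference, the standard BAM binning scheme (the reg2bin calculation from the SAM/BAM format specification) can be sketched as below. The name regionToBin is hypothetical, the sketch assumes a zero-based pos and len >= 1, and it is not claimed to match the library's code line for line.

```haskell
import Data.Bits (shiftR)

-- Smallest bin that fully contains the interval [pos, pos + len).
-- Assumes a zero-based position and len >= 1.
regionToBin :: Int -> Int -> Int
regionToBin pos len = go levels
  where
    end = pos + len - 1   -- last covered position, inclusive

    -- (shift, offset of the first bin at that level), finest level
    -- first: 16 kb, 128 kb, 1 Mb, 8 Mb and 64 Mb bins.
    levels = [(14, 4681), (17, 585), (20, 73), (23, 9), (26, 1)]

    go [] = 0             -- spans several 64 Mb bins: root bin
    go ((s, off) : rest)
      | (pos `shiftR` s) == (end `shiftR` s) = off + (pos `shiftR` s)
      | otherwise                            = go rest
```

For example, a one-base alignment at position 0 falls into the first 16 kb bin (4681), while an alignment covering positions 0 through 16384 crosses a 16 kb boundary and moves up to the first 128 kb bin (585).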

showMd :: [MdOp] -> ByteString

Normalizes a series of MdOps and encodes them the way BAM and SAM expect.
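What such a normalization might involve can be sketched as follows. The SAM specification requires an MD string to start and end with a number and to have a number between any two mismatch or deletion items. The Md type and the functions below are hypothetical stand-ins (the library's MdOp constructors and its exact normalization may differ), and the sketch produces a String rather than a ByteString.

```haskell
-- Hypothetical stand-in for MdOp; not the library's type.
data Md
  = Match Int        -- run of matching bases
  | Mismatch Char    -- reference base at a mismatch
  | Deletion String  -- reference bases deleted from the read
  deriving (Eq, Show)

-- Drop empty operations and merge adjacent runs of the same kind.
normalize :: [Md] -> [Md]
normalize = foldr step []
  where
    step (Match 0) acc                   = acc
    step (Deletion "") acc               = acc
    step (Match n) (Match m : acc)       = Match (n + m) : acc
    step (Deletion a) (Deletion b : acc) = Deletion (a ++ b) : acc
    step op acc                          = op : acc

-- Encode as an MD string, inserting a 0 wherever a number is required
-- but no matching run is present.
renderMd :: [Md] -> String
renderMd = needNum . normalize
  where
    -- A number is required here.
    needNum (Match n : rest) = show n ++ afterNum rest
    needNum rest             = '0' : afterNum rest

    -- A number was just written.
    afterNum []                  = []
    afterNum (Mismatch c : rest) = c : needNum rest
    afterNum (Deletion s : rest) = '^' : s ++ needNum rest
    afterNum (Match n : rest)    = show n ++ afterNum rest
```

The explicit zeros matter after a deletion: without the 0 in "^GT0T", a parser could not tell the deleted bases "GT" from a following mismatch at "T".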