biohazard-0.6.15: bioinformatics support library

Safe HaskellNone
LanguageHaskell2010

Bio.Bam.Rec

Description

Parsers and Printers for BAM and SAM. We employ an Iteratee interface, and we strive to support everything possible in BAM. So far, the implementation of the nucleotides is somewhat lacking: we do not have support for ambiguity codes, and the "=" symbol is not understood.

Synopsis

Documentation

data BamRaw Source

Bam record in its native encoding along with virtual address.

bamRaw :: FileOffset -> Bytes -> BamRaw Source

Smart constructor. Makes sure we got a at least a full record.

data BamRec Source

internal representation of a BAM record

Constructors

BamRec 

Fields

b_qname :: Seqid
 
b_flag :: Int
 
b_rname :: Refseq
 
b_pos :: Int
 
b_mapq :: Qual
 
b_cigar :: Vector Cigar
 
b_mrnm :: Refseq
 
b_mpos :: Int
 
b_isize :: Int
 
b_seq :: Vector_Nucs_half Nucleotides
 
b_qual :: Vector Qual
 
b_exts :: Extensions
 
b_virtual_offset :: FileOffset

virtual offset for indexing purposes

getMd :: BamRec -> Maybe [MdOp] Source

data Cigar Source

Cigar line in BAM coding Bam encodes an operation and a length into a single integer, we keep those integers in an array.

Constructors

!CigOp :* !Int infix 9 

Instances

Eq Cigar Source 
Ord Cigar Source 
Show Cigar Source 
Storable Cigar Source 

data CigOp Source

Constructors

Mat 
Ins 
Del 
Nop 
SMa 
HMa 
Pad 

Instances

Bounded CigOp Source 
Enum CigOp Source 
Eq CigOp Source 
Ord CigOp Source 
Show CigOp Source 
Ix CigOp Source 

alignedLength :: Vector v Cigar => v Cigar -> Int Source

extracts the aligned length from a cigar line This gives the length of an alignment as measured on the reference, which is different from the length on the query or the length of the alignment.

newtype Nucleotides Source

A nucleotide base in an alignment. Experience says we're dealing with Ns and gaps all the type, so purity be damned, they are included as if they were real bases.

To allow Nucleotidess to be unpacked and incorporated into containers, we choose to represent them the same way as the BAM file format: as a 4 bit wide field. Gaps are encoded as 0 where they make sense, N is 15. The contained Word8 is guaranteed to be 0..15.

Constructors

Ns 

Fields

unNs :: Word8
 

data Vector_Nucs_half a Source

A vector that packs two Nucleotides into one byte, just like Bam does.

type Extensions = [(BamKey, Ext)] Source

A collection of extension fields. The key is actually only two Chars, but that proved impractical. (Hmm... we could introduce a Key type that is a 16 bit int, then give it an instance IsString... practical?)

data Ext Source

Constructors

Int Int 
Float Float 
Text Bytes 
Bin Bytes 
Char Word8 
IntArr (Vector Int) 
FloatArr (Vector Float) 

Instances

Eq Ext Source 
Ord Ext Source 
Show Ext Source 

extAsInt :: Int -> BamKey -> BamRec -> Int Source

deleteE :: BamKey -> Extensions -> Extensions Source

Deletes all occurences of some extension field.

insertE :: BamKey -> Ext -> Extensions -> Extensions Source

Blindly inserts an extension field. This can create duplicates (and there is no telling how other tools react to that).

updateE :: BamKey -> Ext -> Extensions -> Extensions Source

Deletes all occurences of an extension field, then inserts it with a new value. This is safer than insertE, but also more expensive.

adjustE :: (Ext -> Ext) -> BamKey -> Extensions -> Extensions Source

Adjusts a named extension by applying a function.

progressBam :: MonadIO m => String -> Refs -> Int -> (String -> IO ()) -> Enumeratee [BamRaw] [BamRaw] m a Source

A simple progress indicator that prints sequence id and position.

data Word32 :: *

Instances

Bounded Word32 
Enum Word32 
Eq Word32 
Integral Word32 
Data Word32 
Num Word32 
Ord Word32 
Read Word32 
Real Word32 
Show Word32 
Ix Word32 
Storable Word32 
Bits Word32 
FiniteBits Word32 
Unbox Word32 
Prim Word32 
Lift Word32 
Hashable Word32 
Random Word32 
MVector MVector Word32 
Vector Vector Word32 
data Vector Word32 = V_Word32 (Vector Word32) 
data MVector s Word32 = MV_Word32 (MVector s Word32)