biohazard-0.6.2: bioinformatics support library

Safe Haskell: None
Language: Haskell98

Bio.Bam.Reader

Documentation

data Block

One BGZF block: its virtual offset and its contents. It could also be a block of an uncompressed file, if we want to support indexing of uncompressed BAM or some silliness like that.
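As a rough illustration of what a virtual offset is: the BAM specification packs the file position of a BGZF block's start into the upper 48 bits and the position inside the decompressed block into the lower 16 bits. The sketch below shows only that packing. The names mkVirtualOffset and splitVirtualOffset are hypothetical and not part of this module, and how Block stores its offset internally is not stated here.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Word (Word64)

-- | Pack the file offset of a BGZF block's first byte and an offset into
-- the block's uncompressed contents into a single virtual offset.
-- Assumes the block offset fits in 48 bits; the inner offset is masked
-- to 16 bits. (Hypothetical helper, not part of Bio.Bam.Reader.)
mkVirtualOffset :: Word64 -> Word64 -> Word64
mkVirtualOffset blockStart within =
  (blockStart `shiftL` 16) .|. (within .&. 0xFFFF)

-- | Split a virtual offset back into (block start, offset within block).
splitVirtualOffset :: Word64 -> (Word64, Word64)
splitVirtualOffset v = (v `shiftR` 16, v .&. 0xFFFF)
```

Because the block start occupies the high bits, comparing two virtual offsets as plain integers orders them by position in the file.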

Constructors

Block 

decompressBgzf :: MonadIO m => Enumeratee ByteString ByteString m a

Decompress a BGZF stream into a stream of ByteStrings.

compressBgzf :: MonadIO m => Enumeratee BgzfChunk ByteString m a

Like compressBgzf', with sensible defaults.

decodeBam :: Monad m => (BamMeta -> Iteratee [BamRaw] m a) -> Iteratee Block m (Iteratee [BamRaw] m a)

Decode a BAM stream into raw entries. The entries can be unpacked using decodeBamEntry. This is an Enumeratee in spirit; only the BamMeta and Refs need to be passed separately.

decodeAnyBam :: MonadIO m => BamrawEnumeratee m a

Checks whether a file contains BAM in any of the common forms, then decompresses it appropriately. Plain BAM, BGZF-compressed BAM, and gzip-compressed BAM are supported.

Use decodeAnyBam (or decodeAnyBamFile) for any code that can handle BamRaw input, and decodeAnyBamOrSam (or decodeAnyBamOrSamFile) for code that needs BamRec. That way, SAM is supported automatically, and seeking is supported where possible.

type BamEnumeratee m b = Enumeratee' BamMeta ByteString [BamRec] m b

isBamOrSam :: MonadIO m => Iteratee ByteString m (BamEnumeratee m a)

isBam :: MonadIO m => Iteratee ByteString m (Maybe (BamrawEnumeratee m a))

Tests whether a data stream is a BAM file. Recognizes plain BAM, gzipped BAM, and BGZF-compressed BAM. If the stream is recognized as BAM, a decoder (a suitable Enumeratee) for it is returned. This uses iLookAhead internally, so it should not consume anything from the stream.

isPlainBam :: MonadIO m => Iteratee ByteString m (Maybe (BamrawEnumeratee m a))

Like isBam, but judging by its name recognizes only plain (uncompressed) BAM. If the stream is recognized, a decoder (a suitable Enumeratee) for it is returned. This uses iLookAhead internally, so it should not consume anything from the stream.

isGzipBam :: MonadIO m => Iteratee ByteString m (Maybe (BamrawEnumeratee m a))

Like isBam, but judging by its name recognizes only gzipped BAM. If the stream is recognized, a decoder (a suitable Enumeratee) for it is returned. This uses iLookAhead internally, so it should not consume anything from the stream.

isBgzfBam :: MonadIO m => Iteratee ByteString m (Maybe (BamrawEnumeratee m a))

Like isBam, but judging by its name recognizes only BGZF-compressed BAM. If the stream is recognized, a decoder (a suitable Enumeratee) for it is returned. This uses iLookAhead internally, so it should not consume anything from the stream.
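The kind of look-ahead test performed by isBam and its variants can be sketched on a plain list of bytes. This is an illustration under stated assumptions, not the library's implementation: the Container type and sniff are hypothetical names, the BGZF check assumes the "BC" extra subfield comes first in the gzip header (as it does in files written by common tools), and for the two compressed forms the sketch classifies only the container. Confirming that the payload is BAM would additionally require decompressing the start of the stream and checking for the BAM magic.

```haskell
import Data.List (isPrefixOf)
import Data.Word (Word8)

-- | Container formats distinguishable from the leading bytes.
-- (Hypothetical type, not part of Bio.Bam.Reader.)
data Container = PlainBam | BgzfStream | GzipStream | Unknown
  deriving (Eq, Show)

-- | Classify a stream by its first bytes without consuming it.
sniff :: [Word8] -> Container
sniff bs
  | bamMagic `isPrefixOf` bs  = PlainBam
  | isBgzf                    = BgzfStream
  | gzipMagic `isPrefixOf` bs = GzipStream
  | otherwise                 = Unknown
  where
    bamMagic  = [0x42, 0x41, 0x4D, 0x01]   -- "BAM\1"
    gzipMagic = [0x1F, 0x8B]               -- gzip ID1, ID2
    -- BGZF is gzip with deflate (CM = 8) and FEXTRA (FLG = 4) set,
    -- carrying an extra subfield "BC" of length 2 at byte offset 12.
    isBgzf = [0x1F, 0x8B, 0x08, 0x04] `isPrefixOf` bs
          && take 4 (drop 12 bs) == [0x42, 0x43, 0x02, 0x00]
```

The BGZF test has to come before the generic gzip test, since every BGZF stream also starts with the gzip magic.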

decodeSam :: Monad m => (BamMeta -> Iteratee [BamRec] m a) -> Iteratee ByteString m (Iteratee [BamRec] m a)

Iteratee-style parser for SAM files, designed to be compatible with the BAM parsers. Parses plain uncompressed SAM, nothing else. Since it is supposed to work the same way as the BAM parser, it requires the presence of the @SQ header lines. These are stripped from the header text and turned into the symbol table.
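To make the header handling concrete, the following sketch separates @SQ lines from the rest of a SAM header and turns them into a table of reference names and lengths. It works on plain Strings and uses hypothetical names (splitOn, parseSq, splitHeader). The library's own types (BamMeta, Refs) and its actual parsing code are not shown in this documentation, so this only mirrors the described behaviour.

```haskell
import Data.List (isPrefixOf, partition)

-- | Split a string on a separator character.
splitOn :: Char -> String -> [String]
splitOn sep s = case break (== sep) s of
  (field, [])       -> [field]
  (field, _ : rest) -> field : splitOn sep rest

-- | Extract (name, length) from one @SQ line. Returns Nothing if the SN or
-- LN tag is missing or LN is not a plain decimal number.
parseSq :: String -> Maybe (String, Int)
parseSq line = do
  let fields = drop 1 (splitOn '\t' line)               -- skip the "@SQ" field
      tag t  = lookup t [ (take 2 f, drop 3 f) | f <- fields ]
  name <- tag "SN"
  len  <- tag "LN"
  case reads len of
    [(n, "")] -> Just (name, n)
    _         -> Nothing

-- | Separate header lines into a reference table (built from the @SQ
-- lines, in file order) and the remaining header lines. Malformed @SQ
-- lines are silently dropped in this sketch.
splitHeader :: [String] -> ([(String, Int)], [String])
splitHeader ls = ([ r | Just r <- map parseSq sqs ], rest)
  where (sqs, rest) = partition ("@SQ\t" `isPrefixOf`) ls
```

Keeping the references in file order matters because BAM records refer to reference sequences by index rather than by name.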

decodeSam' :: Monad m => Refs -> Enumeratee ByteString [BamRec] m a

Parser for SAM that doesn't look for a header. The advantage is that it doesn't stall on a pipe that never delivers data. The disadvantage is that it never reads the header and therefore needs a list of allowed RNAMEs.

decodeAnyBamOrSam :: MonadIO m => BamEnumeratee m a

Checks if a file contains BAM in any of the common forms, then decompresses it appropriately. If the stream doesn't contain BAM at all, it is instead decoded as SAM. Since SAM is next to impossible to recognize reliably, we don't even try. Any old junk is decoded as SAM and will fail later.
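The "try BAM, otherwise assume SAM" behaviour amounts to running a list of detectors and falling back to a default when none of them matches. A pure sketch of that pattern follows. The name firstOrElse is hypothetical, and the real detectors are Iteratees running in a monad rather than pure functions, so this is a simplification.

```haskell
import Data.Maybe (fromMaybe, listToMaybe, mapMaybe)

-- | Run the detectors in order and return the first decoder offered.
-- If no detector matches, return the fallback without inspecting the
-- input any further. (Hypothetical helper, not part of Bio.Bam.Reader.)
firstOrElse :: decoder -> [input -> Maybe decoder] -> input -> decoder
firstOrElse fallback detectors input =
  fromMaybe fallback (listToMaybe (mapMaybe ($ input) detectors))
```

With a SAM decoder as the fallback, input that is not SAM is only rejected once that decoder fails to parse it, which matches the behaviour described above.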