| Safe Haskell | None |
|---|---|
| Language | Haskell2010 |
Buffer builder to assemble Bgzf blocks. The idea is to serialize stuff (BAM and BCF) into a buffer, then compress chunks from the buffer into Bgzf blocks. We use a large buffer, and we always make sure there is plenty of space in it (to avoid redundant bounds checks).
Synopsis
- data BB = BB {}
- newBuffer :: Int -> IO BB
- fillBuffer :: BB -> BgzfTokens -> IO (BB, BgzfTokens)
- expandBuffer :: Int -> BB -> IO BB
- encodeBgzf :: MonadIO m => Int -> Enumeratee (Endo BgzfTokens) ByteString m b
- data BgzfTokens
- = TkWord32 !Word32 BgzfTokens
- | TkWord16 !Word16 BgzfTokens
- | TkWord8 !Word8 BgzfTokens
- | TkFloat !Float BgzfTokens
- | TkDouble !Double BgzfTokens
- | TkString !ByteString BgzfTokens
- | TkDecimal !Int BgzfTokens
- | TkSetMark BgzfTokens
- | TkEndRecord BgzfTokens
- | TkEndRecordPart1 BgzfTokens
- | TkEndRecordPart2 BgzfTokens
- | TkEnd
- | TkBclSpecial !BclArgs BgzfTokens
- | TkLowLevel !Int (BB -> IO BB) BgzfTokens
- data BclArgs = BclArgs BclSpecialType !(Vector Word8) !Int !Int !Int !Int
- data BclSpecialType
- int_loop :: Ptr Word8 -> Int -> IO Int
- loop_bcl_special :: Ptr Word8 -> BclArgs -> IO Int
Documentation
data BB Source #

We manage a large buffer (multiple megabytes), of which we fill an initial portion. We remember the size, the used part, and two marks where we later fill in sizes for the length-prefixed BAM or BCF records. We move the buffer down when we yield a piece downstream, and when we run out of space, we simply move to a new buffer. Garbage collection should take care of the rest. An unused mark must be set to (maxBound::Int) so it doesn't interfere with flushing.
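The description above can be sketched as a record; the field names here are hypothetical (the real ones in Bio.Iteratee.Builder may differ), but the role of the maxBound sentinel becomes clear when flushing takes the minimum over the used part and both marks:

```haskell
{-# LANGUAGE BangPatterns #-}
import Foreign.ForeignPtr (ForeignPtr)
import Data.Word (Word8)

-- Hypothetical shape of the buffer record described above.
data BB = BB
  { bb_buffer :: ForeignPtr Word8  -- backing storage, multiple megabytes
  , bb_size   :: !Int              -- total capacity
  , bb_off    :: !Int              -- start of the still-active portion
  , bb_used   :: !Int              -- end of the filled portion
  , bb_mark   :: !Int              -- mark for a pending length prefix
  , bb_mark2  :: !Int              -- second mark for a pending length prefix
  }

-- An unused mark is maxBound so it never limits how much we may flush.
unusedMark :: Int
unusedMark = maxBound

-- We may only flush up to the first still-pending mark: bytes before a
-- mark may still receive a length prefix. With both marks unused, the
-- minimum is simply bb_used.
flushable :: BB -> Int
flushable bb = minimum [bb_used bb, bb_mark bb, bb_mark2 bb]
```

With both marks at `unusedMark`, `flushable` degenerates to `bb_used`, which is exactly why the sentinel must be `maxBound` and not, say, zero.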
fillBuffer :: BB -> BgzfTokens -> IO (BB, BgzfTokens) Source #
expandBuffer :: Int -> BB -> IO BB Source #
Creates a new buffer, copying the active content from an old one, with higher capacity. The size of the new buffer is twice the free space in the old buffer, but at least minsz.
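The sizing rule just stated can be written down directly; this is a sketch of the arithmetic only, with hypothetical names, reading "free space" as capacity minus the used portion:

```haskell
-- Hypothetical capacity rule for expandBuffer: twice the free space
-- of the old buffer, but never less than the requested minimum.
newCapacity :: Int  -- ^ minsz, the minimum acceptable capacity
            -> Int  -- ^ size of the old buffer
            -> Int  -- ^ used portion of the old buffer
            -> Int
newCapacity minsz size used = max minsz (2 * (size - used))
```

So a nearly full buffer still grows to at least `minsz`, while a mostly empty one doubles its remaining headroom.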
encodeBgzf :: MonadIO m => Int -> Enumeratee (Endo BgzfTokens) ByteString m b Source #
Expand a chain of tokens into a buffer, sending finished pieces downstream as soon as possible.
data BgzfTokens Source #
Things we are able to encode. Taking inspiration from binary-serialise-cbor, we define these as a lazy list-like thing and consume it in an interpreter.
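The token-list idea can be shown with a self-contained miniature: a lazy chain of constructors, walked by a small interpreter. This is not the library's code; the real fillBuffer pokes bytes into a raw buffer rather than going through a Builder, but the consumption pattern is the same:

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Builder as BD
import qualified Data.ByteString.Lazy as BL
import Data.Word (Word8, Word16, Word32)

-- A cut-down analogue of BgzfTokens: each constructor carries its
-- payload and the rest of the chain, ending in TEnd.
data Tokens
  = TWord32 !Word32 Tokens
  | TWord16 !Word16 Tokens
  | TWord8  !Word8  Tokens
  | TString !B.ByteString Tokens
  | TEnd

-- The interpreter walks the chain and emits little-endian fields,
-- as the BAM and BCF formats require.
run :: Tokens -> BL.ByteString
run = BD.toLazyByteString . go
  where
    go (TWord32 w k) = BD.word32LE w <> go k
    go (TWord16 w k) = BD.word16LE w <> go k
    go (TWord8  w k) = BD.word8    w <> go k
    go (TString s k) = BD.byteString s <> go k
    go TEnd          = mempty
```

Because the chain is lazy, a producer can hand tokens to the consumer incrementally, which is what lets encodeBgzf send finished pieces downstream as soon as possible.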
Instances

| Instance | Defined in |
|---|---|
| Nullable (Endo BgzfTokens) Source # | Bio.Iteratee.Builder |
data BclSpecialType Source #