Copyright | (c) Matthew Mosior 2022 |
---|---|
License | BSD-style |
Maintainer | mattm.github@gmail.com |
Portability | portable |
Safe Haskell | Safe-Inferred |
Language | Haskell2010 |
WARNING
This module is considered internal.
The Package Versioning Policy does not apply.
The contents of this module may change in any way whatsoever and without any warning between minor versions of this package.
Authors importing this library are expected to track development closely.
All credit goes to the author(s)/maintainer(s) of the containers library for the above warning text.
Description
Various data structures and custom data types to describe the Burrows-Wheeler Transform (BWT) and the Inverse BWT.
The implementation of the BWT relies upon sequence provided by the containers.
The internal BWTMatrix
data type relies upon the massiv package.
Synopsis
- data Suffix = Suffix {
- suffixindex :: Int
- suffixstartpos :: Int
- suffix :: Seq Char
- type SuffixArray = Seq Suffix
- type BWT = Seq Char
- type BWTMatrix = Array BN Ix1 String
- saToBWT :: SuffixArray -> Seq Char -> BWT
- createSuffixArray :: Seq Char -> SuffixArray
- sortTB :: (Ord a1, Ord a2) => (a1, a2) -> (a1, a2) -> Ordering
- type BWTSeq a = Seq Char
- type STBWTSeq s a = STRef s (BWTSeq Char)
- pushSTBWTSeq :: STBWTSeq s Char -> Char -> ST s ()
- emptySTBWTSeq :: ST s (STBWTSeq s Char)
- type STBWTCounter s a = STRef s Int
- updateSTBWTCounter :: STBWTCounter s Int -> Int -> ST s ()
- emptySTBWTCounter :: ST s (STBWTCounter s Int)
- magicInverseBWT :: Seq (Char, Int) -> ST s (BWTSeq Char)
- grabHeadChunks :: Seq (Seq Char) -> (Seq Char, Seq Char)
- createBWTMatrix :: String -> BWTMatrix
Documentation
Basic suffix data type. Used to describe
the core data inside of the SuffixArray
data type.
Suffix | |
|
Instances
Generic Suffix Source # | |
Read Suffix Source # | |
Show Suffix Source # | |
Eq Suffix Source # | |
Ord Suffix Source # | |
type Rep Suffix Source # | |
Defined in Data.BWT.Internal type Rep Suffix = D1 ('MetaData "Suffix" "Data.BWT.Internal" "text-compression-0.1.0.4-CoUFWULdPvhKvJlYK0c1My" 'False) (C1 ('MetaCons "Suffix" 'PrefixI 'True) (S1 ('MetaSel ('Just "suffixindex") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 Int) :*: (S1 ('MetaSel ('Just "suffixstartpos") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 Int) :*: S1 ('MetaSel ('Just "suffix") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 (Seq Char))))) |
type SuffixArray = Seq Suffix Source #
The SuffixArray data type. Uses sequence internally.
type BWTMatrix = Array BN Ix1 String Source #
The BWTMatrix data type. Uses a massiv array internally.
saToBWT :: SuffixArray -> Seq Char -> BWT Source #
Computes the Burrows-Wheeler Transform (BWT) using the suffix array and the original string (represented as a sequence for performance).
createSuffixArray :: Seq Char -> SuffixArray Source #
Computes the corresponding SuffixArray
of a given string. Please see suffix array
for more information.
sortTB :: (Ord a1, Ord a2) => (a1, a2) -> (a1, a2) -> Ordering Source #
Hierarchical sorting scheme that compares fst first then snd. Necessary for the setting up the BWT in order to correctly invert it using the Magic algorithm.
type STBWTSeq s a = STRef s (BWTSeq Char) Source #
Abstract data type representing a BWTSeq in the (strict) ST monad.
pushSTBWTSeq :: STBWTSeq s Char -> Char -> ST s () Source #
State function to push BWTString data into stack.
type STBWTCounter s a = STRef s Int Source #
Abstract BWTCounter and associated state type.
updateSTBWTCounter :: STBWTCounter s Int -> Int -> ST s () Source #
State function to update BWTCounter.
emptySTBWTCounter :: ST s (STBWTCounter s Int) Source #
State function to create empty STBWTCounter type.
grabHeadChunks :: Seq (Seq Char) -> (Seq Char, Seq Char) Source #
Easy way to grab the first two elements of a sequence.
createBWTMatrix :: String -> BWTMatrix Source #
Simple yet efficient implementation of converting a given string into a BWT Matrix (the BWTMatrix type is a massiv array).