Safe Haskell | None |
---|
Infernal CMs.
TODO order of nucleotides? ACGU?
TODO fastCM :: CM -> FastCM to make a data structure that is suitable for high-performance applications.
- data CMVersion
- data NodeType
- newtype NodeID = NodeID {}
- data StateType
- newtype StateID = StateID {}
- illegalState :: StateID
- data Emits
- = EmitsSingle { }
- | EmitsPair { }
- | EmitNothing
- single :: Traversal' Emits [(Char, BitScore)]
- pair :: Traversal' Emits [(Char, Char, BitScore)]
- data State = State {}
- transitions :: Lens' State [(StateID, BitScore)]
- stateType :: Lens' State StateType
- stateID :: Lens' State StateID
- nodeType :: Lens' State NodeType
- nodeID :: Lens' State NodeID
- emits :: Lens' State Emits
- data CM = CM {
- _name :: Identification Rfam
- _accession :: Accession Rfam
- _version :: CMVersion
- _trustedCutoff :: BitScore
- _gathering :: BitScore
- _noiseCutoff :: Maybe BitScore
- _nullModel :: Vector BitScore
- _nodes :: Map NodeID (NodeType, [StateID])
- _states :: Map StateID State
- _localBegin :: Map StateID BitScore
- _localEnd :: Map StateID BitScore
- _unsorted :: Map ByteString ByteString
- _hmm :: Maybe HMM3
- version :: Lens' CM CMVersion
- unsorted :: Lens' CM (Map ByteString ByteString)
- trustedCutoff :: Lens' CM BitScore
- states :: Lens' CM (Map StateID State)
- nullModel :: Lens' CM (Vector BitScore)
- noiseCutoff :: Lens' CM (Maybe BitScore)
- nodes :: Lens' CM (Map NodeID (NodeType, [StateID]))
- name :: Lens' CM (Identification Rfam)
- localEnd :: Lens' CM (Map StateID BitScore)
- localBegin :: Lens' CM (Map StateID BitScore)
- hmm :: Lens' CM (Maybe HMM3)
- gathering :: Lens' CM BitScore
- accession :: Lens' CM (Accession Rfam)
- type ID2CM = Map (Identification Rfam) CM
- type AC2CM = Map (Accession Rfam) CM
- makeLocal :: Double -> Double -> CM -> CM
- makeLocalBegin :: Double -> CM -> CM
- makeLocalEnd :: Double -> CM -> CM
Documentation
Encode the CM versions we can parse
Encode CM node types.
Encode CM state types.
State IDs
Certain states (IL,IR,ML,MR) emit a single nucleotide, one state emits a pair (MP), other states emit nothing.
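The three emission cases can be illustrated with a small standalone sketch. It does not use the library's own definitions: the record fields of Emits are elided above, so the constructors here take positional arguments, BitScore is a stand-in newtype over Double, and singleScore and pairScore are hypothetical helpers that do by hand what the single and pair traversals expose.

```haskell
-- Stand-in for the library's BitScore type (an assumption for this sketch).
newtype BitScore = BitScore Double
  deriving (Eq, Ord, Show)

-- Simplified mirror of Emits: positional fields instead of the elided records.
data Emits
  = EmitsSingle [(Char, BitScore)]        -- IL, IR, ML, MR states
  | EmitsPair [(Char, Char, BitScore)]    -- MP states
  | EmitNothing                           -- all other states
  deriving (Eq, Show)

-- Hypothetical helper: score of emitting one nucleotide, if the state
-- emits single nucleotides at all.
singleScore :: Char -> Emits -> Maybe BitScore
singleScore c (EmitsSingle xs) = lookup c xs
singleScore _ _ = Nothing

-- Hypothetical helper: score of emitting a nucleotide pair, if the state
-- emits pairs at all.
pairScore :: Char -> Char -> Emits -> Maybe BitScore
pairScore l r (EmitsPair xs) = lookup (l, r) [((a, b), s) | (a, b, s) <- xs]
pairScore _ _ _ = Nothing
```

Both helpers return Nothing for a state of the wrong kind, which is the same behaviour a Traversal' gives when it has no target.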
A single state.
This is an Infernal covariance model. We have a number of blocks:
- basic information like the name of the CM, accession number, etc.
- advanced information: nodes and their states, and the states themselves.
- unsorted information from the header / basic block
The CM data structure is not suitable for high-performance applications.
- score inequalities: trusted (lowest seed score) >= gathering (lowest full score) >= noise (random strings)
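The score inequalities can be checked mechanically. The following is a minimal sketch, not part of the library: cutoffsConsistent is a hypothetical helper that works on plain Double values rather than BitScore, and it treats a missing noise cutoff (noiseCutoff is a Maybe) as imposing no constraint.

```haskell
-- Hypothetical helper: check trusted >= gathering >= noise.
-- Arguments are plain Doubles standing in for BitScore values.
cutoffsConsistent
  :: Double        -- trusted cutoff (lowest seed score)
  -> Double        -- gathering cutoff (lowest full score)
  -> Maybe Double  -- noise cutoff, if the model has one
  -> Bool
cutoffsConsistent trusted gath noise =
  trusted >= gath && maybe True (gath >=) noise
```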
Local entries into the CM.
The localBegin lens returns a map of state IDs. It contains either just the root node (with the S state), or a set of states of type MP, ML, MR, or B.
The localEnd lens on the other hand is the set of possible early exits from the model.
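As an illustration of which states qualify as local begins, here is a small standalone sketch. The StateType constructors below are stand-ins named after the state types mentioned in this documentation (the real constructors are not shown above), states are keyed by a plain Int instead of StateID, and localBeginCandidates is a hypothetical helper.

```haskell
import qualified Data.Map.Strict as M

-- Stand-in enumeration of state types (assumed constructor names).
data StateType = S | MP | ML | MR | IL | IR | D | B | E
  deriving (Eq, Show)

-- Hypothetical helper: IDs of all states whose type allows a local begin,
-- i.e. MP, ML, MR or B, in ascending order of state ID.
localBeginCandidates :: M.Map Int StateType -> [Int]
localBeginCandidates = M.keys . M.filter (`elem` [MP, ML, MR, B])
```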
makeLocalBegin :: Double -> CM -> CM
Insert all legal local beginnings, disable root node (and root states).
The pbegin probability is the total probability for local begins. The remaining 1-pbegin is the probability to start with node 1.
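To make the split of pbegin concrete, here is a minimal sketch under stated assumptions. It works with probabilities rather than bit scores, and it assumes pbegin is shared uniformly among the internal begin states, which the text above does not specify. StateID is a stand-in newtype and localBeginProbs is a hypothetical helper, not the library's implementation.

```haskell
import qualified Data.Map.Strict as M

-- Stand-in for the library's StateID (an assumption for this sketch).
newtype StateID = StateID Int
  deriving (Eq, Ord, Show)

-- Hypothetical helper: assign begin probabilities.
-- The entry state of node 1 receives 1 - pbegin; the internal states share
-- pbegin equally (the uniform split is an assumption). With no internal
-- states, all probability mass stays on node 1.
localBeginProbs
  :: Double      -- pbegin, total probability of a local begin
  -> StateID     -- entry state of node 1
  -> [StateID]   -- internal states eligible for a local begin
  -> M.Map StateID Double
localBeginProbs pbegin node1 internal
  | null internal = M.singleton node1 1
  | otherwise     = M.insert node1 (1 - pbegin)
                      (M.fromList [(s, share) | s <- internal])
  where
    share = pbegin / fromIntegral (length internal)
```

With pbegin = 0.5 and two internal states, node 1 keeps probability 0.5 and each internal state receives 0.25, so the begin probabilities still sum to one.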
makeLocalEnd :: Double -> CM -> CM
Insert all legal local ends.