Safe Haskell | None |
---|---|
Language | Haskell98 |
- data GenoCallBlock = GenoCallBlock {
- reference_name :: !Refseq
- start_position :: !Int
- called_sites :: [GenoCallSite]
- data GenoCallSite = GenoCallSite {
- snp_stats :: !CallStats
- snp_likelihoods :: !(Vector Mini)
- ref_allele :: !Nucleotides
- indel_stats :: !CallStats
- indel_variants :: [IndelVariant]
- indel_likelihoods :: !(Vector Mini)
- compact_likelihoods :: Vector Prob -> Vector Mini
- getRefseqs :: AvroMeta -> Refs
Documentation
data GenoCallBlock
File format for genotype calls.
To write a container file, we convert the calls into a stream of self-contained objects. To cut down on redundancy, each object has a header that names the reference sequence and the start position, followed by the calls. The calls within a block have contiguous coordinates: we start a new block whenever we have to skip a position, and also when we feel the current block is getting too large.
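The splitting rule described above can be sketched as follows. This is only an illustration, not the module's implementation: the Block type, toBlocks, and the maxLen size limit are hypothetical names, sites are taken as a plain list of (reference, position, payload) triples, and the source does not say what size counts as too large.

```haskell
-- Hypothetical stand-in for GenoCallBlock: a header (reference name and
-- start position) followed by the payloads of consecutive sites.
data Block a = Block
  { blockRef   :: String
  , blockStart :: Int
  , blockSites :: [a]
  } deriving (Eq, Show)

-- Group sorted (reference, position, payload) triples into blocks.
-- A new block starts when the reference changes, when a position is
-- skipped, or when the current block already holds maxLen sites
-- (maxLen is an assumed parameter, expected to be at least 1).
toBlocks :: Int -> [(String, Int, a)] -> [Block a]
toBlocks _ [] = []
toBlocks maxLen ((ref, pos, site) : rest) =
    Block ref pos (site : run) : toBlocks maxLen rest'
  where
    (run, rest') = takeRun (pos + 1) (maxLen - 1) rest

    -- Collect sites while they stay on the same reference, continue at
    -- the expected next position, and the size budget is not used up.
    takeRun next n ((r, p, s) : xs)
      | n > 0 && r == ref && p == next =
          let (ys, zs) = takeRun (next + 1) (n - 1) xs
          in  (s : ys, zs)
    takeRun _ _ xs = ([], xs)
```

Because each block records only its start position, the coordinate of every site inside it is implied by its index, which is where the saving in redundancy comes from.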
data GenoCallSite
compact_likelihoods :: Vector Prob -> Vector Mini
Storing likelihoods: we take the natural logarithm (GL values are already on a log scale) and convert it to the minifloat 0.4.4 representation. Range and precision should be plenty.
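As a rough illustration of the encoding step, here is a minimal sketch of an unsigned 8-bit minifloat with a 4-bit exponent and a 4-bit mantissa, which is how we read "0.4.4" (no sign bit). The names toMini, fromMini and compactSketch are hypothetical, plain lists and Double stand in for Vector and Prob, and negating the logarithm so that the stored value is non-negative is an assumption; the actual rounding and bit layout used by the library may differ.

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Word (Word8)

-- Decode a byte: high nibble is the exponent e, low nibble the mantissa m.
-- e == 0 is the subnormal range (steps of 1/8); otherwise there is an
-- implicit leading bit, giving (16 + m) / 8 * 2^(e - 1).
fromMini :: Word8 -> Double
fromMini w
  | e == 0    = fromIntegral m / 8
  | otherwise = fromIntegral (16 + m) / 8 * 2 ^ (e - 1)
  where
    e = fromIntegral (w `shiftR` 4) :: Int
    m = fromIntegral (w .&. 0x0F)   :: Int

-- Encode a non-negative value, rounding to the nearest code and
-- saturating at 0xFF. Negative inputs and NaN map to 0 (an assumption).
toMini :: Double -> Word8
toMini x
  | isNaN x || x <= 0 = 0
  | x >= 63488        = 0xFF
  | otherwise         = fromIntegral (min 255 (16 * k + round v))
  where
    -- Work in units of 1/8 and halve until the significand is below 32;
    -- k counts the halvings, so the code is simply 16 * k + round v.
    (k, v) = normalise 0 (x * 8)

    normalise :: Int -> Double -> (Int, Double)
    normalise n s
      | s < 32    = (n, s)
      | otherwise = normalise (n + 1) (s / 2)

-- Assumed storage scheme: negated natural logarithm of each likelihood.
compactSketch :: [Double] -> [Word8]
compactSketch = map (toMini . negate . log)
```

Under these assumptions the representable values run from 0 to 63488, in steps of 1/8 below 2 and with a relative spacing of at most 1/16 above that.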
getRefseqs :: AvroMeta -> Refs
Reconstructs the list of reference sequences from Avro metadata.
If a type named Refseq is defined in the schema and is an enum, it defines the symbol table; otherwise an empty list is returned. If biohazard.refseq_length exists and is an array, its elements are interpreted as the lengths in order; otherwise the lengths are set to zero.