Copyright | (c) HyraxBio 2018 |
---|---|
License | BSD3 |
Maintainer | andre@hyraxbio.co.za, andre@andrevdm.com |
Safe Haskell | Safe |
Language | Haskell2010 |
Functionality for generating AB1 files from an input FASTA. These AB1s are supported by both PHRED and recall; if you are using other software, you may need to add additional required sections.
Weighted reads
The input FASTA files have "weighted" reads. The name of each read is a value between 0 and 1 that specifies the height of its peak relative to a full peak.
Single read
The simplest example is a single FASTA with a single read with a weight of 1:
    > 1
    ACTG
The chromatogram for this AB1 shows perfect traces for the input ACTG nucleotides with a full height peak.
Mixes & multiple reads
The source FASTA can have multiple reads, which results in a chromatogram with mixes:
    > 1
    ACAG
    > 0.3
    ACTG
There is an AT mix at the third nucleotide. The first read has a weight of 1 and the second a weight of 0.3. The chromatogram shows the mix, with the T peak lower than the A peak (30% of its height).
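A mix such as the AT above is conventionally written as a single IUPAC ambiguity code (W for A or T). The following is a minimal sketch of that lookup, using only the standard IUPAC nucleotide table. The name `mixCode` is hypothetical and this is not the module's own `iupac` function, whose exact behaviour is not documented here.

```haskell
import Data.List (nub, sort)

-- | Hypothetical helper: map a set of nucleotides seen at one position
-- to its IUPAC ambiguity code. Returns Nothing for anything that is not
-- a combination of A, C, G and T.
mixCode :: String -> Maybe Char
mixCode ns = lookup (sort (nub ns)) table
  where
    -- Keys are sorted, de-duplicated nucleotide sets.
    table =
      [ ("A", 'A'), ("C", 'C'), ("G", 'G'), ("T", 'T')
      , ("AG", 'R'), ("CT", 'Y'), ("CG", 'S'), ("AT", 'W')
      , ("GT", 'K'), ("AC", 'M')
      , ("CGT", 'B'), ("AGT", 'D'), ("ACT", 'H'), ("ACG", 'V')
      , ("ACGT", 'N')
      ]
```

Loaded into GHCi, `mixCode "AT"` and `mixCode "TA"` both give `Just 'W'`, because the input is sorted before the lookup.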
Summing weights
- The weight of a read specifies the intensity of the peak from 0 to 1.
- Weights for each position are added, up to a maximum of 1 per nucleotide.
- You can use `_` as a "blank" nucleotide, in which case only the nucleotides from other reads are considered at that position.
E.g.
    > 1
    ACAG
    > 0.3
    _GT
    > 0.2
    _G
See README.md for additional details and examples
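The summing rules can be sketched as follows. This is an illustration only, not the module's implementation: the name `positionWeights` is hypothetical, reads are assumed to be aligned from the first position, and a read that is shorter than the longest read is assumed to contribute nothing past its end.

```haskell
import Data.List (transpose)
import qualified Data.Map.Strict as Map

-- | Hypothetical helper: for each position, the summed weight of every
-- nucleotide seen there, capped at 1. The blank nucleotide '_' is ignored.
positionWeights :: [(Double, String)] -> [Map.Map Char Double]
positionWeights weightedReads = map column (transpose (map pad weightedReads))
  where
    maxLen = maximum (0 : map (length . snd) weightedReads)
    -- Pad short reads with blanks so that every read has the same length,
    -- then tag each nucleotide with the weight of its read.
    pad (w, s) = [(c, w) | c <- take maxLen (s ++ repeat '_')]
    -- Add up the weights per nucleotide, skipping blanks, and cap at 1.
    column cs =
      Map.map (min 1) (Map.fromListWith (+) [(c, w) | (c, w) <- cs, c /= '_'])
```

For the example above, the second position holds C at weight 1 and G at weight 0.5 (0.3 + 0.2), and the third holds A at weight 1 and T at weight 0.3.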
Synopsis
- generateAb1s :: FilePath -> FilePath -> IO ()
- generateAb1 :: (Text, [(Double, Text)]) -> ByteString
- readWeightedFasta :: ByteString -> Either Text [(Double, Text)]
- iupac :: [[Char]] -> [Char]
- unIupac :: Char -> [Char]
Documentation
generateAb1s :: FilePath -> FilePath -> IO ()
Generate a set of AB1s, one for every FASTA found in the source directory.
generateAb1 :: (Text, [(Double, Text)]) -> ByteString
Create the ByteString data for an AB1 given the data from a weighted FASTA (see readWeightedFasta).
readWeightedFasta :: ByteString -> Either Text [(Double, Text)]
Read a weighted FASTA file. See the module documentation for details on the format of the weighted FASTA.
e.g. weighted FASTA
    > 1
    ACAG
    > 0.3
    _GT
    > 0.2
    _G
The result data has the type
    (Text, [(Double, Text)])
     ^       ^       ^
     |       |       |
     |       |       +---- read
     |       +---- weight
     +---- file name
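To make the weighted FASTA format concrete, here is a simplified stand-in parser. It is a sketch under stated assumptions, not the code behind readWeightedFasta: it works on String rather than ByteString and Text, the name `parseWeighted` is hypothetical, and it assumes that each header line is `>` followed by a weight between 0 and 1, with the sequence on the lines that follow.

```haskell
import Data.Char (isSpace)
import Text.Read (readMaybe)

-- | Hypothetical stand-in for a weighted FASTA parser. Returns the
-- (weight, read) pairs in file order, or an error message.
parseWeighted :: String -> Either String [(Double, String)]
parseWeighted = go . filter (not . null) . map trim . lines
  where
    -- Strip leading and trailing whitespace.
    trim = f . f where f = reverse . dropWhile isSpace
    isHeader l = take 1 l == ">"
    go [] = Right []
    go (('>' : hdr) : rest) =
      case readMaybe (trim hdr) of
        Just w | w >= 0 && w <= 1 ->
          -- Everything up to the next header belongs to this read.
          let (seqLines, rest') = break isHeader rest
          in ((w, concat seqLines) :) <$> go rest'
        _ -> Left ("Invalid weight: " ++ hdr)
    go (l : _) = Left ("Expected a header line starting with '>', got: " ++ l)
```

Pairing the parsed list with a file name gives a value of the tuple shape shown above.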