hyraxAbi-0.2.3.1: Modules for parsing, generating and manipulating AB1 files.

Copyright: (c) HyraxBio 2018
License: BSD3
Maintainer: andre@hyraxbio.co.za, andre@andrevdm.com
Safe Haskell: Safe
Language: Haskell2010

Hyrax.Abi.Generate

Description

Functionality for generating AB1 files from an input FASTA. These AB1s are supported by both PHRED and recall. If you are using other software, you may need to add additional required sections.

Weighted reads

The input FASTA files have "weighted" reads. The name of each read is a value between 0 and 1 that specifies the height of the peak relative to a full peak.

Single read

The simplest example is a single FASTA containing a single read with a weight of 1:

> 1
ACTG

The chromatogram for this AB1 shows perfect traces for the input ACTG nucleotides with a full height peak.

Mixes & multiple reads

The source FASTA can have multiple reads, which results in a chromatogram with mixes:

> 1
ACAG
> 0.3
ACTG

There is an AT mix at the third nucleotide. The first read has a weight of 1 and the second a weight of 0.3. The chromatogram shows the mix, with the T peak at 30% of the height of the A peak.

Summing weights

  • The weight of a read specifies the intensity of the peak, from 0 to 1.
  • Weights for each position are added, up to a maximum of 1 per nucleotide.
  • You can use `_` as a "blank" nucleotide, in which case only the nucleotides from other reads are considered at that position.

E.g.

> 1
ACAG
> 0.3
_GT
> 0.2
_G
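
The summing rules above can be sketched in a few lines of Haskell. This is only an illustration under stated assumptions, not the library's implementation: the names `Peaks` and `sumWeights` are hypothetical, reads are plain `String`s, and short reads are assumed to be padded with `_` on the right.

```haskell
import Data.List (transpose)

-- | Peak heights for A, C, G and T at one position (hypothetical type).
type Peaks = [(Char, Double)]

-- | Sum the weights of all reads per position and nucleotide,
-- capping each nucleotide at 1. Assumption: a read shorter than the
-- longest read is treated as if it were padded with '_'.
sumWeights :: [(Double, String)] -> [Peaks]
sumWeights rs = map peaksAt columns
  where
    width = maximum (0 : map (length . snd) rs)
    -- Pair every nucleotide with the weight of the read it came from.
    padded = [ [ (w, n) | n <- take width (r ++ repeat '_') ] | (w, r) <- rs ]
    -- One column per position, holding one entry per read.
    columns = transpose padded
    -- '_' never matches A, C, G or T, so blanks contribute nothing.
    peaksAt col =
      [ (n, min 1 (sum [ w | (w, n') <- col, n' == n ])) | n <- "ACGT" ]
```

Under this sketch, the example above gives C a height of 1 and G a height of 0.5 (0.3 + 0.2) at the second position, and A a height of 1 and T a height of 0.3 at the third.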

See README.md for additional details and examples.

Documentation

generateAb1s :: FilePath -> FilePath -> IO ()

Generate a set of AB1s, one for every FASTA found in the source directory.

generateAb1 :: (Text, [(Double, Text)]) -> ByteString

Create the ByteString data for an AB1 given the data from a weighted FASTA (see readWeightedFasta)

readWeightedFasta :: ByteString -> Either Text [(Double, Text)]

Read a weighted FASTA file. See the module documentation for details on the format of the weighted FASTA.

Example weighted FASTA:

> 1
ACAG
> 0.3
_GT
> 0.2
_G

Paired with a file name, the parsed reads form the input to generateAb1, which has the type

                      (Text, [(Double, Text)])
                       ^       ^       ^
                       |       |       |
file name -------------+       |       +---- read 
                               | 
                               +---- weight
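
To make the expected format concrete, here is a minimal sketch of a weighted FASTA parser. It is not the library's parser: `parseWeightedFasta` is a hypothetical name, it works on `String` rather than `ByteString`/`Text`, and it assumes that every header is `>` followed by a number between 0 and 1 and that a read may span several lines.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- | Parse a weighted FASTA into (weight, read) pairs (hypothetical helper).
-- Returns Left with a message when a header or weight is malformed.
parseWeightedFasta :: String -> Either String [(Double, String)]
parseWeightedFasta = go . filter (not . null) . map trim . lines
  where
    trim = dropWhileEnd isSpace . dropWhile isSpace
    isHeader l = take 1 l == ">"
    go [] = Right []
    go (('>' : name) : rest) =
      case reads name of
        -- The read name must be a complete number in the range 0..1.
        [(w, "")] | w >= 0 && w <= 1 ->
          let (body, rest') = break isHeader rest
          in ((w, concat body) :) <$> go rest'
        _ -> Left ("invalid weight: " ++ name)
    go (l : _) = Left ("expected a '>' header, got: " ++ l)
```

With this sketch, the example FASTA above parses to `Right [(1.0,"ACAG"),(0.3,"_GT"),(0.2,"_G")]`.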

iupac :: [[Char]] -> [Char]

Given a set of nucleotides, get the IUPAC ambiguity code.

unIupac :: Char -> [Char]

Convert an IUPAC ambiguity code to the set of nucleotides it represents.
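
For reference, the standard IUPAC nucleotide ambiguity codes can be expressed as a lookup table. The sketch below only illustrates the mapping and is not the code behind iupac and unIupac: `iupacTable`, `toCode` and `fromCode` are hypothetical names, and the handling of unknown input (`Nothing` or an empty list) is an assumption.

```haskell
import Data.List (nub, sort)

-- | Standard IUPAC codes, keyed by the sorted set of nucleotides.
iupacTable :: [(String, Char)]
iupacTable =
  [ ("A", 'A'), ("C", 'C'), ("G", 'G'), ("T", 'T')
  , ("AG", 'R'), ("CT", 'Y'), ("CG", 'S'), ("AT", 'W')
  , ("GT", 'K'), ("AC", 'M')
  , ("CGT", 'B'), ("AGT", 'D'), ("ACT", 'H'), ("ACG", 'V')
  , ("ACGT", 'N')
  ]

-- | Ambiguity code for a set of nucleotides, in any order and with
-- duplicates allowed. Nothing if the set is empty or not made of ACGT.
toCode :: String -> Maybe Char
toCode ns = lookup (sort (nub ns)) iupacTable

-- | Nucleotides represented by a code; empty for an unknown code.
fromCode :: Char -> String
fromCode c = concat [ ns | (ns, c') <- iupacTable, c' == c ]
```

For the AT mix described in the module documentation, `toCode "TA"` gives `Just 'W'`, and `fromCode 'W'` gives `"AT"`.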