Safe Haskell | None |
---|---|
Language | Haskell2010 |
Quality filters adapted from a prehistoric pipeline.
Synopsis
- filterPairs :: Monad m => (BamRec -> [BamRec]) -> (Maybe BamRec -> Maybe BamRec -> [BamRec]) -> Stream (Of BamRec) m r -> Stream (Of BamRec) m r
- type QualFilter = BamRec -> BamRec
- complexSimple :: Double -> QualFilter
- complexEntropy :: Double -> QualFilter
- qualityAverage :: Int -> QualFilter
- qualityMinimum :: Int -> Qual -> QualFilter
- qualityFromOldIllumina :: BamRec -> BamRec
- qualityFromNewIllumina :: BamRec -> BamRec
Documentation
filterPairs :: Monad m => (BamRec -> [BamRec]) -> (Maybe BamRec -> Maybe BamRec -> [BamRec]) -> Stream (Of BamRec) m r -> Stream (Of BamRec) m r
A filter/transformation applied to pairs of reads. The first argument is a function applied to single reads, the second a function applied to pairs. The pair function can receive incomplete pairs, too (one side is Nothing), if mates have been separated or filtered asymmetrically. The input must be grouped by read name; this fails spectacularly if it isn't.
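The grouping requirement is easier to see on plain lists. The following is a minimal sketch, not the library's implementation: MiniRec and filterPairsList are hypothetical stand-ins for BamRec and filterPairs, and the sketch assumes that records sharing a name are adjacent in the input.

```haskell
import Data.Function (on)
import Data.List (groupBy)
import Data.Maybe (listToMaybe)

-- Hypothetical stand-in for BamRec; only the fields this sketch needs.
data MiniRec = MiniRec
  { mrName   :: String
  , mrPaired :: Bool
  , mrFirst  :: Bool   -- True for the first mate, False for the second
  } deriving (Eq, Show)

-- List-based analogue of filterPairs (assumption: same-named records are adjacent).
filterPairsList
  :: (MiniRec -> [MiniRec])
  -> (Maybe MiniRec -> Maybe MiniRec -> [MiniRec])
  -> [MiniRec] -> [MiniRec]
filterPairsList single pair = concatMap step . groupBy ((==) `on` mrName)
  where
    step grp =
      let unpaired = [r | r <- grp, not (mrPaired r)]
          firsts   = [r | r <- grp, mrPaired r, mrFirst r]
          seconds  = [r | r <- grp, mrPaired r, not (mrFirst r)]
          -- A missing mate shows up as Nothing on that side.
          mates
            | null firsts && null seconds = []
            | otherwise = pair (listToMaybe firsts) (listToMaybe seconds)
      in concatMap single unpaired ++ mates
```

If the input were not grouped by name, groupBy would split the two mates of a pair into separate groups, and the pair function would see each of them as an incomplete pair.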
type QualFilter = BamRec -> BamRec
A quality filter is simply a transformation on BamRecs. By convention, quality filters should set flagFailsQC; a further step can then remove the failed reads. Filtering individual reads tends to result in mate pairs with inconsistent flags, which in turn results in lone mates and all sorts of trouble with programs that expect non-broken BAM files. It is therefore recommended to use filterPairs with suitable functions to do the post-processing.
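The mark-then-remove convention could look roughly like this. It is a sketch on a hypothetical miniature record (Rec, tooShort, cleanSingle and cleanPair are illustrative names, not part of this module); recFails stands in for the flagFailsQC bit.

```haskell
-- Hypothetical miniature record; the real BamRec has many more fields.
data Rec = Rec
  { recName  :: String
  , recSeq   :: String
  , recFails :: Bool    -- stands in for the flagFailsQC bit
  } deriving (Eq, Show)

type Filter = Rec -> Rec   -- same shape as QualFilter

-- Mark (do not drop) reads shorter than a minimum length.
tooShort :: Int -> Filter
tooShort n r
  | length (recSeq r) < n = r { recFails = True }
  | otherwise             = r

-- Single-read cleanup: drop reads that were marked as failed.
cleanSingle :: Rec -> [Rec]
cleanSingle r = [r | not (recFails r)]

-- Pair-level cleanup: keep a pair only if both mates are present and pass.
cleanPair :: Maybe Rec -> Maybe Rec -> [Rec]
cleanPair (Just a) (Just b)
  | not (recFails a) && not (recFails b) = [a, b]
cleanPair _ _ = []
```

Dropping the whole pair when either mate fails is one possible policy; it keeps the output free of lone mates at the cost of discarding some passing reads.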
complexSimple :: Double -> QualFilter
Simple complexity filter aka "Nancy Filter". A read is considered not-sufficiently-complex if the most common base accounts for greater than the cutoff fraction of all non-N bases.
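The criterion can be sketched on a plain String. The helper names below are hypothetical, and two details are assumptions of the sketch rather than documented behaviour: only A, C, G and T are counted as candidates for the most common base, and a read with no non-N bases is not flagged.

```haskell
import Data.Char (toUpper)

-- Fraction of non-N bases taken up by the most common base.
-- Nothing when the read has no non-N bases.
topBaseFraction :: String -> Maybe Double
topBaseFraction s
  | null bases = Nothing
  | otherwise  = Just (fromIntegral top / fromIntegral (length bases))
  where
    bases = filter (/= 'N') (map toUpper s)
    top   = maximum [length (filter (== b) bases) | b <- "ACGT"]

-- True if the read would be considered not-sufficiently-complex.
isLowComplexity :: Double -> String -> Bool
isLowComplexity cutoff = maybe False (> cutoff) . topBaseFraction
```

With a cutoff of 0.5, a read like AAAAAACG (6 of 8 bases are A) would be flagged, while ACGTACGT would not.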
complexEntropy :: Double -> QualFilter
Filter on order-zero empirical entropy. Entropy per base must be greater than the cutoff.
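Order-zero empirical entropy treats each base as an independent draw and uses the observed base frequencies as probabilities. A minimal sketch follows; entropyPerBase and passesEntropy are hypothetical names, and measuring in bits (log base 2) and counting only A, C, G and T are assumptions, since the unit of the cutoff is not stated here.

```haskell
-- Order-zero empirical entropy per base, in bits (log base 2 is an assumption).
-- Only A, C, G and T are counted; a read with none of them gets entropy 0.
entropyPerBase :: String -> Double
entropyPerBase s
  | total == 0 = 0
  | otherwise  = negate (sum [p * logBase 2 p | c <- counts, c > 0
                                              , let p = fromIntegral c / fromIntegral total])
  where
    counts = [length (filter (== b) s) | b <- "ACGT"]
    total  = sum counts

-- A read passes if its entropy per base is greater than the cutoff.
passesEntropy :: Double -> String -> Bool
passesEntropy cutoff s = entropyPerBase s > cutoff
```

Under these assumptions a homopolymer has entropy 0 and a read with all four bases in equal proportion has the maximum of 2 bits per base.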
qualityAverage :: Int -> QualFilter
Filter on average quality. Reads without qualities pass.
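A sketch of the rule on a plain list of scores. The names are hypothetical, and two points are assumptions: the average is the arithmetic mean of the scores themselves (not of the error probabilities), and a read whose average equals the cutoff passes.

```haskell
-- Arithmetic mean of the quality scores; Nothing when no qualities are present.
averageQuality :: [Int] -> Maybe Double
averageQuality [] = Nothing
averageQuality qs = Just (fromIntegral (sum qs) / fromIntegral (length qs))

-- Passes if qualities are absent or the average reaches the cutoff
-- (using >= rather than > is an assumption of this sketch).
passesAverage :: Int -> [Int] -> Bool
passesAverage cutoff = maybe True (>= fromIntegral cutoff) . averageQuality
```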
qualityMinimum :: Int -> Qual -> QualFilter
Filter on minimum quality. In qualityMinimum n q, a read passes if it has no more than n bases with quality less than q. Reads without qualities pass.
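The counting rule, sketched on a plain list of scores (passesMinimum is a hypothetical name, and Int stands in for Qual):

```haskell
-- A read passes if at most n of its bases have quality below q.
-- An empty quality list has zero such bases, so it passes for any n >= 0.
passesMinimum :: Int -> Int -> [Int] -> Bool
passesMinimum n q quals = length (filter (< q) quals) <= n
```

For example, with n = 1 and q = 20, the scores [30, 10, 25] pass (one base below 20), while [10, 10, 25] do not.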
qualityFromOldIllumina :: BamRec -> BamRec
Convert quality scores from old Illumina scale (different formula and offset 64 in FastQ).
qualityFromNewIllumina :: BamRec -> BamRec
Convert quality scores from new Illumina scale (standard formula but offset 64 in FastQ).
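The two conversions differ only in the formula. The sketch below works on plain Int scores and uses hypothetical names. It assumes the scores were decoded with the standard offset 33 from a file that actually used offset 64, so each stored value is 31 too high; whether the functions above expect exactly that input is not stated here. The old scale is the Solexa odds-based score, which maps to Phred as Q = 10 * log10 (10^(Qs/10) + 1).

```haskell
-- Undo the offset mismatch: a score decoded with offset 33 from an
-- offset-64 file is 31 too high (assumption about how the record was read).
fixOffset :: Int -> Int
fixOffset raw = raw - 31

-- Solexa-scaled score to Phred-scaled score, rounded to the nearest integer.
solexaToPhred :: Int -> Int
solexaToPhred qs = round (10 * logBase 10 (10 ** (fromIntegral qs / 10) + 1 :: Double))

-- New Illumina scale: standard formula, so only the offset needs fixing.
fromNewIllumina :: [Int] -> [Int]
fromNewIllumina = map fixOffset

-- Old Illumina scale: fix the offset, then change the formula.
fromOldIllumina :: [Int] -> [Int]
fromOldIllumina = map (solexaToPhred . fixOffset)
```

The two scales agree closely at high qualities and diverge at low ones: a Solexa score of 40 maps to Phred 40, while a Solexa score of 0 maps to Phred 3.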