biohazard-0.6.5: bioinformatics support library

Safe Haskell: Safe
Language: Haskell98

Bio.Util.Numeric

Documentation

wilson :: Double -> Int -> Int -> (Double, Double, Double)

Random useful stuff I didn't know where to put.

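Calculates the Wilson score interval. If (l,m,h) = wilson c x n, then m is the observed binomial proportion and (l,h) is its c-confidence interval for x positive examples out of n observations. c is typically something like 0.05, which suggests that c is the error level rather than the coverage (0.05 would then presumably correspond to a 95% interval).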
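As an illustration of the interval itself, here is a minimal sketch of the standard Wilson score formula. The name wilsonZ is hypothetical and not part of this module; to stay self-contained it takes the normal quantile z directly (for example 1.96 for a two-sided 95% interval) instead of deriving it from c, which the library presumably does via invnormcdf.

```haskell
-- Hypothetical helper, not part of Bio.Util.Numeric: the standard Wilson
-- score interval, parameterised by the normal quantile z instead of c.
wilsonZ :: Double -> Int -> Int -> (Double, Double, Double)
wilsonZ z x n = ((centre - half) / denom, p, (centre + half) / denom)
  where
    nn     = fromIntegral n
    p      = fromIntegral x / nn          -- observed proportion x/n
    denom  = 1 + z * z / nn               -- common denominator 1 + z^2/n
    centre = p + z * z / (2 * nn)         -- shifted centre before scaling
    half   = z * sqrt (p * (1 - p) / nn + z * z / (4 * nn * nn))
```

For 5 positives out of 10 observations, wilsonZ 1.96 5 10 gives roughly (0.237, 0.5, 0.763).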

invnormcdf :: (Ord a, Floating a) => a -> a

choose :: Integral a => a -> a -> a

Binomial coefficient: n choose k == n! / ((n-k)! k!)
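The factorial formula can overflow or waste work for large n, so one common way to evaluate it is the multiplicative form, sketched below. chooseSketch is a hypothetical name, and nothing here claims to match how choose is implemented in the library.

```haskell
-- Hypothetical sketch: binomial coefficient via the multiplicative formula.
-- After step i the accumulator holds C(n - k' + i, i), so every division
-- is exact.
chooseSketch :: Integer -> Integer -> Integer
chooseSketch n k
  | k < 0 || k > n = 0
  | otherwise      = foldl step 1 [1 .. k']
  where
    k' = min k (n - k)                       -- use the symmetry C(n,k) = C(n,n-k)
    step acc i = acc * (n - k' + i) `div` i
```

For example, chooseSketch 5 2 evaluates to 10.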

estimateComplexity :: (Integral a, Floating b, Ord b) => a -> a -> Maybe b

Tries to estimate the complexity of a whole from a sample. Suppose we sampled total things, and among those, singles occurred only once. How many different things are there?

Let the total number be m. The copy number follows a Poisson distribution with parameter lambda. Let z := e^{lambda}, then we have:

    P( 0 )  = e^{-lambda}        = 1/z
    P( 1 )  = lambda e^{-lambda} = ln z / z
    P(>=1)  = 1 - e^{-lambda}    = 1 - 1/z

    singles = m ln z / z
    total   = m (1 - 1/z)

    D := total / singles = (1 - 1/z) * z / ln z
    f := z - 1 - D ln z = 0

To get z, we solve using Newton iteration and then substitute to get m:

    df/dz = 1 - D/z
    z' := z - z (z - 1 - D ln z) / (z - D)
    m = singles * z / ln z

It converges as long as the initial z is large enough, and 10 D (the starting value, called zz in the source) appears to work well.
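The iteration can be sketched as follows. This is an illustration of the derivation, not the library's implementation: the name estimateSketch, the fixed count of 50 Newton steps, and the conditions under which it returns Nothing are all assumptions made here.

```haskell
-- Hypothetical sketch of the Newton iteration described in the text.
estimateSketch :: Int -> Int -> Maybe Double
estimateSketch total singles
  -- Assumption: no estimate when there are no singles (D undefined) or
  -- when everything is a single (D = 1, so m would be unbounded).
  | singles <= 0 || singles >= total = Nothing
  | otherwise = Just (fromIntegral singles * z / log z)   -- m = singles * z / ln z
  where
    d = fromIntegral total / fromIntegral singles :: Double
    -- One Newton step for f(z) = z - 1 - D ln z.
    step zi = zi - zi * (zi - 1 - d * log zi) / (zi - d)
    -- Start at 10 D; 50 steps is an arbitrary, generous bound.
    z = iterate step (10 * d) !! 50
```

With total = 632 and singles = 368, which is approximately what lambda = 1 and m = 1000 would produce, the sketch returns an estimate close to 1000.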

showNum :: Show a => a -> String

log1p :: (Floating a, Ord a) => a -> a

Computes log (1+x) to a relative precision of 10^-8 even for very small x. Stolen from http://www.johndcook.com/cpp_log_one_plus_x.html
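The problem being solved is that 1 + x rounds away most digits of a tiny x before log is applied. A minimal sketch of one common remedy is to switch to a two-term Taylor expansion for small x. The name log1pSketch and the threshold 1e-4 are assumptions made here; the threshold is chosen so that the truncation error of about x^2/3 stays below 10^-8.

```haskell
-- Hypothetical sketch: log (1 + x) that stays accurate for tiny x.
log1pSketch :: Double -> Double
log1pSketch x
  | abs x > 1e-4 = log (1 + x)          -- direct evaluation is fine here
  | otherwise    = (1 - 0.5 * x) * x    -- Taylor series: x - x^2/2
```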

expm1 :: (Floating a, Ord a) => a -> a

Computes exp x - 1 to a relative precision of 10^-10 even for very small x. Stolen from http://www.johndcook.com/cpp_expm1.html
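Here the cancellation happens in the subtraction: exp x is very close to 1 for tiny x. The sketch below uses a two-term Taylor expansion in that range; expm1Sketch and the threshold 1e-5 are assumptions made for illustration, with the threshold chosen so that the truncation error of about x^2/6 stays below 10^-10.

```haskell
-- Hypothetical sketch: exp x - 1 that stays accurate for tiny x.
expm1Sketch :: Double -> Double
expm1Sketch x
  | abs x < 1e-5 = x + 0.5 * x * x      -- Taylor series: x + x^2/2
  | otherwise    = exp x - 1            -- direct evaluation is fine here
```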

(<#>) :: (Floating a, Ord a) => a -> a -> a infixl 5

Computes log (exp x + exp y) without leaving the log domain and hence without losing precision.
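A minimal sketch of the usual technique: factor out the larger argument so that exp is only ever applied to a non-positive number. logAddExp is a hypothetical stand-in for the operator, written against Double only, and it does not treat the case where both arguments are negative infinity.

```haskell
-- Hypothetical sketch: log (exp x + exp y) without overflow.
logAddExp :: Double -> Double -> Double
logAddExp x y
  | x >= y    = x + log (1 + exp (y - x))   -- exponent y - x <= 0
  | otherwise = y + log (1 + exp (x - y))   -- exponent x - y <  0
```

For instance, logAddExp 1000 1000 yields 1000 + log 2, whereas the naive log (exp 1000 + exp 1000) overflows to Infinity.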

lsum :: (Floating a, Ord a) => [a] -> a

Computes \( \log \left( \sum_i e^{x_i} \right) \) sensibly. The list must be sorted in descending(!) order.
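One reason a descending order is convenient is that the head of the list is then the maximum, so it can be factored out and every remaining exponent is non-positive. The sketch below relies on exactly that; lsumSketch is a hypothetical name, the result for an empty list is an assumption, and whether the library works this way is not confirmed by this page.

```haskell
-- Hypothetical sketch: log-sum-exp over a list sorted in descending order.
lsumSketch :: [Double] -> Double
lsumSketch []       = -1 / 0   -- assumption: log of an empty sum is -Infinity
lsumSketch (x : xs) = x + log (1 + sum [exp (y - x) | y <- xs])
  -- x is the largest element, so each y - x is <= 0 and exp cannot overflow
```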

llerp :: (Floating a, Ord a) => a -> a -> a -> a

Computes \( \log \left( c e^x + (1-c) e^y \right) \).
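In other words, this is a linear interpolation between e^x and e^y with weight c, reported in the log domain. A minimal sketch, assuming 0 <= c <= 1 and finite arguments, factors out the larger of x and y to avoid overflow; llerpSketch is a hypothetical name and not the library's code.

```haskell
-- Hypothetical sketch: log (c * exp x + (1 - c) * exp y) without overflow.
llerpSketch :: Double -> Double -> Double -> Double
llerpSketch c x y = m + log (c * exp (x - m) + (1 - c) * exp (y - m))
  where
    m = max x y   -- both exponents are <= 0 after subtracting m
```

For example, llerpSketch 0.25 (log 4) (log 8) equals log 7, since 0.25 * 4 + 0.75 * 8 = 7.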

sigmoid2 :: (Num a, Fractional a, Floating a) => a -> a

Kind-of sigmoid function that maps the reals to the interval [0,1). Good to compute a probability without introducing boundary conditions.

isigmoid2 :: (Num a, Fractional a, Floating a) => a -> a

Inverse of sigmoid2.