Safe Haskell | Safe |
---|---|
Language | Haskell98 |

- wilson :: Double -> Int -> Int -> (Double, Double, Double)
- invnormcdf :: (Ord a, Floating a) => a -> a
- choose :: Integral a => a -> a -> a
- estimateComplexity :: (Integral a, Floating b, Ord b) => a -> a -> Maybe b
- showNum :: Show a => a -> String
- showOOM :: Double -> String
- log1p :: (Floating a, Ord a) => a -> a
- expm1 :: (Floating a, Ord a) => a -> a
- (<#>) :: (Floating a, Ord a) => a -> a -> a
- lsum :: (Floating a, Ord a) => [a] -> a
- llerp :: (Floating a, Ord a) => a -> a -> a -> a
- sigmoid2 :: (Num a, Fractional a, Floating a) => a -> a
- isigmoid2 :: (Num a, Fractional a, Floating a) => a -> a

# Documentation

wilson :: Double -> Int -> Int -> (Double, Double, Double)

Random useful stuff I didn't know where to put.

Calculates the Wilson score interval. If `(l,m,h) = wilson c x n`, then `m` is the binary proportion and `(l,h)` its `c`-confidence interval for `x` positive examples out of `n` observations. `c` is typically something like 0.05.
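As an illustration of the interval described above, here is a minimal sketch of the textbook Wilson score formula. It is not the library's code: `wilsonWithZ` is a hypothetical helper that takes the normal quantile `z` directly instead of `c`, and it assumes `n > 0`. If `c` is read as a two-sided significance level, `c = 0.05` corresponds to `z` of roughly 1.96.

```haskell
-- Hypothetical helper, not the library's `wilson`: the caller passes the
-- normal quantile z (about 1.96 for a 95% interval) instead of c.
-- Assumes n > 0.
wilsonWithZ :: Double -> Int -> Int -> (Double, Double, Double)
wilsonWithZ z x n = ((mid - half) / denom, p, (mid + half) / denom)
  where
    nn    = fromIntegral n
    p     = fromIntegral x / nn                 -- observed proportion
    mid   = p + z * z / (2 * nn)                -- centre before normalisation
    half  = z * sqrt (p * (1 - p) / nn + z * z / (4 * nn * nn))
    denom = 1 + z * z / nn                      -- common normalising factor
```

For example, `wilsonWithZ 1.96 50 100` yields a middle value of `0.5` with bounds placed symmetrically around it.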

invnormcdf :: (Ord a, Floating a) => a -> a

estimateComplexity :: (Integral a, Floating b, Ord b) => a -> a -> Maybe b

Try to estimate the complexity of a whole from a sample. Suppose we sampled `total` things and among those `singles` occurred only once. How many different things are there?

Let the total number be `m`. The copy number follows a Poisson distribution with parameter `lambda`. Let `z := e^{lambda}`, then we have:

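\[
\begin{aligned}
P(0) &= e^{-\lambda} = 1/z \\
P(1) &= \lambda e^{-\lambda} = \ln z / z \\
P(\ge 1) &= 1 - e^{-\lambda} = 1 - 1/z
\end{aligned}
\]

\[
\begin{aligned}
\mathit{singles} &= m \ln z / z \\
\mathit{total} &= m (1 - 1/z)
\end{aligned}
\]

\[
\begin{aligned}
D &:= \mathit{total} / \mathit{singles} = (1 - 1/z) \, z / \ln z \\
f &:= z - 1 - D \ln z = 0
\end{aligned}
\]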

To get `z`, we solve using Newton iteration and then substitute to get `m`:

\[
\begin{aligned}
df/dz &= 1 - D/z \\
z' &:= z - z \, (z - 1 - D \ln z) / (z - D) \\
m &= \mathit{singles} \cdot z / \ln z
\end{aligned}
\]

It converges as long as the initial `z` is large enough, and `10D` (in the line for `zz` below) appears to work well.
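The iteration can be sketched as follows. This is an illustration only, not the library's source: the name `estimateComplexitySketch`, the concrete types, the stopping tolerance, the iteration cap, and the choice to return `Nothing` when `singles <= 0` or `total <= singles` are all assumptions.

```haskell
-- Hypothetical sketch of the Newton iteration described above.
-- Assumptions: Int/Double types, relative tolerance 1e-12, at most 200
-- steps, and Nothing whenever D = total/singles would not exceed 1.
estimateComplexitySketch :: Int -> Int -> Maybe Double
estimateComplexitySketch total singles
    | singles <= 0 || total <= singles = Nothing
    | otherwise                        = go (200 :: Int) (10 * d)
  where
    s = fromIntegral singles :: Double
    d = fromIntegral total / s                  -- D := total / singles

    go 0 _ = Nothing                            -- gave up without converging
    go k z
        | abs (z' - z) <= 1e-12 * z = Just (s * z' / log z')   -- m = singles * z / ln z
        | otherwise                 = go (k - 1) z'
      where
        -- Newton step for f(z) = z - 1 - D ln z
        z' = z - z * (z - 1 - d * log z) / (z - d)
```

With `total = 1000` and `singles = 500` the sketch settles on an `m` of a little under 1400, which is consistent with `total = m (1 - 1/z)`.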

log1p :: (Floating a, Ord a) => a -> a

Computes `log (1+x)` to a relative precision of `10^-8` even for very small `x`. Stolen from http://www.johndcook.com/cpp_log_one_plus_x.html
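One common way to reach this precision is to switch to a short Taylor expansion for small arguments. The sketch below shows that approach; the name `log1pSketch` and the `1e-4` threshold are assumptions and may differ from what the library does.

```haskell
-- Hypothetical sketch: for small |x| the direct formula loses digits
-- because 1 + x rounds, so use the first two Taylor terms x - x^2/2.
-- Assumes x > -1.
log1pSketch :: Double -> Double
log1pSketch x
    | abs x > 1e-4 = log (1 + x)          -- direct formula is accurate enough here
    | otherwise    = (1 - 0.5 * x) * x    -- x - x^2/2, relative error about x^2/3
```

For instance, `log (1 + 1e-20)` evaluates to `0` in `Double` arithmetic because `1 + 1e-20` rounds to `1`, whereas the sketch keeps the leading term.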

expm1 :: (Floating a, Ord a) => a -> a

Computes `exp x - 1` to a relative precision of `10^-10` even for very small `x`. Stolen from http://www.johndcook.com/cpp_expm1.html
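The same idea applies here: near zero, `exp x - 1` cancels catastrophically, so a truncated series is used instead. A minimal sketch, assuming a `1e-5` threshold and the hypothetical name `expm1Sketch`:

```haskell
-- Hypothetical sketch: for small |x| use the first two Taylor terms
-- x + x^2/2 instead of subtracting 1 from a value very close to 1.
expm1Sketch :: Double -> Double
expm1Sketch x
    | abs x < 1e-5 = x + 0.5 * x * x      -- relative error about x^2/6
    | otherwise    = exp x - 1            -- no serious cancellation here
```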

(<#>) :: (Floating a, Ord a) => a -> a -> a infixl 5 Source

Computes `log (exp x + exp y)` without leaving the log domain and hence without losing precision.
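A standard way to do this is to factor out the larger argument so that the remaining exponential never overflows. The sketch below uses the hypothetical name `logAdd` and plain `log (1 + ...)`; the library's operator may be implemented differently, for example on top of `log1p`.

```haskell
-- Hypothetical sketch of log (exp x + exp y): pull out the larger of the
-- two arguments, so only exp of a non-positive number is ever computed.
-- Not handled: both arguments being -Infinity.
logAdd :: Double -> Double -> Double
logAdd x y
    | x >= y    = x + log (1 + exp (y - x))
    | otherwise = y + log (1 + exp (x - y))
```

With both arguments at `-1000`, the naive `log (exp x + exp y)` underflows to `-Infinity`, while `logAdd` stays finite.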

lsum :: (Floating a, Ord a) => [a] -> a

Computes \( \log \left( \sum_i e^{x_i} \right) \) sensibly. The list must be sorted in descending(!) order.
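To show why the ordering matters, here is a minimal log-sum-exp sketch that treats the head of the list as the maximum and factors it out. The name `lsumSketch` and the result for the empty list are assumptions, not taken from the library.

```haskell
-- Hypothetical sketch: with the list sorted in descending order the head
-- is the maximum, so every exponent (y - x) is non-positive.
lsumSketch :: [Double] -> Double
lsumSketch []       = -1 / 0                     -- log 0; an assumption for the empty list
lsumSketch (x : xs) = x + log (1 + sum [exp (y - x) | y <- xs])
```

For example, `lsumSketch (map log [3, 2, 1])` agrees with `log 6` up to rounding.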

llerp :: (Floating a, Ord a) => a -> a -> a -> a

Computes \( \log \left( c e^x + (1-c) e^y \right) \).
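This interpolation can be done entirely in the log domain by moving the weights into the exponents. The sketch below is one way to do that; `llerpSketch` is a hypothetical name, and handling `c <= 0` and `c >= 1` as special cases is an assumption.

```haskell
-- Hypothetical sketch: log (c e^x + (1-c) e^y) rewritten as the log-sum-exp
-- of (log c + x) and (log (1-c) + y).
llerpSketch :: Double -> Double -> Double -> Double
llerpSketch c x y
    | c <= 0    = y                              -- all weight on y (assumption)
    | c >= 1    = x                              -- all weight on x (assumption)
    | a >= b    = a + log (1 + exp (b - a))
    | otherwise = b + log (1 + exp (a - b))
  where
    a = log c + x
    b = log (1 - c) + y
```

As a check, `llerpSketch 0.25 (log 4) (log 8)` corresponds to `log (0.25 * 4 + 0.75 * 8)`, that is `log 7`.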

sigmoid2 :: (Num a, Fractional a, Floating a) => a -> a

Kind-of sigmoid function that maps the reals to the interval `[0,1)`. Good to compute a probability without introducing boundary conditions.
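The documentation does not state the formula, so the following is only one function with the stated range, chosen here as an assumption: the square of `tanh (x/2)`. The names `sigmoidSketch` and `isigmoidSketch` are hypothetical, and the library's `sigmoid2` and `isigmoid2` may be defined differently.

```haskell
-- Hypothetical example of a map from the reals into [0,1): tanh (x/2)^2.
-- It is 0 at x = 0 and approaches 1 as |x| grows.
sigmoidSketch :: Double -> Double
sigmoidSketch x = t * t
  where t = tanh (x / 2)

-- Inverse on [0,1), returning the non-negative preimage:
-- tanh (x/2) = sqrt y  gives  x = log ((1 + sqrt y) / (1 - sqrt y)).
isigmoidSketch :: Double -> Double
isigmoidSketch y = log ((1 + s) / (1 - s))
  where s = sqrt y
```

Because any real input yields a value in the half-open interval, an optimiser can work on the unconstrained `x` while the model sees a valid probability. In `Double` arithmetic the output does round to `1` for very large `|x|`.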