distribution-1.1.1.0: Finite discrete probability distributions.

Safe HaskellSafe
LanguageHaskell98

Data.Distribution.Core

Contents

Description

This modules defines types and functions for manipulating finite discrete probability distributions.

Synopsis

Probability

type Probability = Rational Source #

Probability. Should be between 0 and 1.

Distribution

data Distribution a Source #

Distribution over values of type a.

Due to their internal representations, Distribution can not have Functor or Monad instances. However, select is the equivalent of fmap for distributions and always and andThen are respectively the equivalent of return and >>=.

Instances

Bounded a => Bounded (Distribution a) Source #

Lifts the bounds to the distributions that return them with probability one.

Note that the degenerate distributions of size 0 will be less than the distribution minBound.

Appart from that, all other distributions d have the property that minBound <= d <= maxBound if this property holds on the values of the distribution.

Eq a => Eq (Distribution a) Source # 
(Ord a, Floating a) => Floating (Distribution a) Source # 
(Ord a, Fractional a) => Fractional (Distribution a) Source # 
(Ord a, Num a) => Num (Distribution a) Source #

Literals are interpreted as distributions that always return the given value.

>>> 42 == always 42
True

Binary operations on distributions are defined to be the binary operation on each pair of elements.

For this reason, (+) and (*) are not related in the same way as they are on natural numbers.

For instance, it is not always the case that: 3 * d == d + d + d

>>> let d = uniform [0, 1]
>>> 3 * d
fromList [(0,1 % 2),(3,1 % 2)]
>>> d + d + d
fromList [(0,1 % 8),(1,3 % 8),(2,3 % 8),(3,1 % 8)]

For this particular behavior, see the times function.

Ord a => Ord (Distribution a) Source #

A distribution d1 is less than some other distribution d2 if the smallest value that has a different probability in d1 and d2 is more probable in d1.

By convention, empty distributions are less than everything except themselves.

Show a => Show (Distribution a) Source # 
(Ord a, Monoid a) => Monoid (Distribution a) Source # 

toMap :: Distribution a -> Map a Probability Source #

Converts the distribution to a mapping from values to their probability. Values with probability 0 are not included in the resulting mapping.

toList :: Distribution a -> [(a, Probability)] Source #

Converts the distribution to a list of increasing values whose probability is greater than 0. To each value is associated its probability.

Properties

size :: Distribution a -> Int Source #

 Returns the number of elements with non-zero probability in the distribution.

support :: Distribution a -> Set a Source #

Values in the distribution with non-zero probability.

Creation

fromList :: (Ord a, Real p) => [(a, p)] -> Distribution a Source #

Distribution that assigns to each value from the given (value, weight) pairs a probability proportional to weight.

>>> fromList [('A', 1), ('B', 2), ('C', 1)]
fromList [('A',1 % 4),('B',1 % 2),('C',1 % 4)]

Values may appear multiple times in the list. In this case, their total weight is the sum of the different associated weights. Values whose total weight is zero or negative are ignored.

always :: a -> Distribution a Source #

Distribution that assigns to x the probability of 1.

>>> always 0
fromList [(0,1 % 1)]
>>> always 42
fromList [(42,1 % 1)]

uniform :: Ord a => [a] -> Distribution a Source #

Uniform distribution over the values. The probability of each element is proportional to its number of appearance in the list.

>>> uniform [1 .. 6]
fromList [(1,1 % 6),(2,1 % 6),(3,1 % 6),(4,1 % 6),(5,1 % 6),(6,1 % 6)]

withProbability :: Real p => p -> Distribution Bool Source #

True with given probability and False with complementary probability.

Transformation

select :: Ord b => (a -> b) -> Distribution a -> Distribution b Source #

Applies a function to the values in the distribution.

>>> select abs $ uniform [-1, 0, 1]
fromList [(0,1 % 3),(1,2 % 3)]

assuming :: (a -> Bool) -> Distribution a -> Distribution a Source #

Returns a new distribution conditioning on the predicate holding on the value.

>>> assuming (> 2) $ uniform [1 .. 6]
fromList [(3,1 % 4),(4,1 % 4),(5,1 % 4),(6,1 % 4)]

Note that the resulting distribution will be invalid if the predicate does not hold on any of the values.

>>> assuming (> 7) $ uniform [1 .. 6]
fromList []

observing :: (a -> Distribution Bool) -> Distribution a -> Distribution a Source #

Returns a new distribution using the Bayesian update rule.

Using this example: https://en.wikipedia.org/wiki/Bayesian_inference#Probability_of_a_hypothesis

data CookieBowl = Bowl1 | Bowl2 deriving (Eq,Ord)
data CookieType = Plain | ChocolateChip deriving (Eq,Ord)

assumption :: Distribution CookieBowl
assumption = uniform [Bowl1,Bowl2]

update :: Cookie -> Distribution CookieBowl -> Distribution CookieBowl
update c = observing f where
  f b = case b of
    -- Bowl #1 contains 10 chocolate chip cookies and 30 plain cookies
    Bowl1 -> fromList [(c == ChocolateChip,10),(c == Plain,30)]
    -- Bowl #2 contains 20 of each flavour of cookie
    Bowl2 -> fromList [(c == ChocolateChip,20),(c == Plain,20)]

The "update" function in this example can be used to update the probability distribution of which bowl you have based on observing a random cookie inside the bowl.

Combination

combineWith :: Ord b => (a -> a -> b) -> Distribution a -> Distribution a -> Distribution b Source #

Sequences

Independant experiments

trials :: Int -> Distribution Bool -> Distribution Int Source #

Binomial distribution. Assigns to each number of successes its probability.

>>> trials 2 $ uniform [True, False]
fromList [(0,1 % 4),(1,1 % 2),(2,1 % 4)]

times :: (Num a, Ord a) => Int -> Distribution a -> Distribution a Source #

Takes n samples from the distribution and returns the distribution of their sum.

>>> times 2 $ uniform [1 .. 3]
fromList [(2,1 % 9),(3,2 % 9),(4,1 % 3),(5,2 % 9),(6,1 % 9)]

This function makes use of the more efficient trials functions for input distributions of size 2.

>>> size $ times 10000 $ uniform [1, 10]
10001

iid :: Ord a => (a -> a -> a) -> Int -> Distribution a -> Distribution a Source #

Dependant experiments

andThen :: Ord b => Distribution a -> (a -> Distribution b) -> Distribution b infixl 7 Source #

Computes for each value in the distribution a new distribution, and then combines those distributions, giving each the weight of the original value.

>>> uniform [1 .. 3] `andThen` (\ n -> uniform [1 .. n])
fromList [(1,11 % 18),(2,5 % 18),(3,1 % 9)]

See the Experiment data type in the Monadic module for a more "natural" monadic interface.

Utilities

isValid :: Distribution a -> Bool Source #

 Determines if a distribution is valid.

A distribution is valid if and only if its domain is non-empty. Invalid distributions may arise from the use of assuming for instance.