generic-random-0.1.0.0: Generic random generators

Safe HaskellNone
LanguageHaskell2010

Data.Random.Generics

Contents

Description

Generic Boltzmann samplers.

Here, the words "sampler" and "generator" are used interchangeably.

Given an algebraic datatype:

data A = A1 B C | A2 D

a Boltzmann sampler is recursively defined by choosing a constructor with some fixed distribution, and independently generating values for the corresponding fields with the same method.

A key component is the aforementioned distribution, defined for every type such that the resulting generator produces a finite value in the end. These distributions are obtained from a precomputed object called oracle, which we will not describe further here.

Oracles depend on the target size of the generated data (except for singular samplers), and can be fairly expensive to compute repeatedly, hence some of the functions below attempt to avoid (re)computing too many of them even when the required size changes.

When these functions are specialized, oracles are memoized and will be reused for different sizes.

Synopsis

Documentation

type Size' = Int Source

The size of a value is its number of constructors.

Here, however, the Size' type is interpreted differently to make better use of QuickCheck's size parameter provided by the sized combinator, so that we generate non-trivial data even at very small size values.

For infinite types, with objects of unbounded sizes > minSize, given a parameter delta :: Size', the produced values have an average size close to minSize + delta.

For example, values of type Either () [Bool] have at least two constructors, so

  generator delta :: Gen (Either () [Bool])

will target sizes close to 2 + delta; the offset becomes less noticeable as delta grows to infinity.

For finite types with sizes in [minSize, maxSize], the target expected size is obtained by clamping a Size' to [0, 99] and applying an affine mapping.

Main functions

Suffixes

S
Singular sampler.

This works with recursive tree-like structures, as opposed to (lists of) structures with bounded size. More precisely, the generating function of the given type should have a finite radius of convergence, with a singularity of a certain kind (see Duchon et al., reference in the README), so that the oracle can be evaluated at that point.

This has the advantage of using the same oracle for all size parameters, which simply specify a target size interval.

P
Generator of pointed values.

It usually has a flatter distribution of sizes than a simple Boltzmann sampler, making it an efficient alternative to rejection sampling.

It also works on more types, particularly lists and finite types, but relies on multiple oracles.

R
Rejection sampling.

These generators filter out values whose sizes are not within some interval. In the first two sections, that interval is implicit: [(1-epsilon)*size', (1+epsilon)*size'], for epsilon = 0.1.

The generator restarts as soon as it has produced more constructors than the upper bound, this strategy is called ceiled rejection sampling.

Pointing

The pointing of a type t is a derived type whose values are essentially values of type t, with one of their constructors being "pointed". Alternatively, we may turn every constructor into variants that indicate the position of points.

  -- Original type
  data Tree = Node Tree Tree | Leaf
  -- Pointing of Tree
  data Tree'
    = Tree' Tree -- Point at the root
    | Node'0 Tree' Tree -- Point to the left
    | Node'1 Tree Tree' -- Point to the right

Pointed values are easily mapped back to the original type by erasing the point. Pointing makes larger values occur much more frequently, while preserving the uniformness of the distribution conditionally to a fixed size.

generatorSR :: (Data a, MonadRandomLike m) => Size' -> m a Source

  generatorSR :: Int -> Gen a
  asMonadRandom . generatorSR :: MonadRandom m => Int -> m a

Singular ceiled rejection sampler.

generatorP :: (Data a, MonadRandomLike m) => Size' -> m a Source

  generatorP :: Int -> Gen a
  asMonadRandom . generatorP :: MonadRandom m => Int -> m a

Generator of pointed values.

generatorPR :: (Data a, MonadRandomLike m) => Size' -> m a Source

Pointed generator with rejection.

generatorR :: (Data a, MonadRandomLike m) => Size' -> m a Source

Generator with rejection and dynamic average size.

Fixed size

The ' suffix indicates functions which do not do any precomputation before passing the size parameter.

This means that oracles are computed from scratch for every size value, which may incur a significant overhead.

generatorP' :: (Data a, MonadRandomLike m) => Size' -> m a Source

Pointed generator.

generatorPR' :: (Data a, MonadRandomLike m) => Size' -> m a Source

Pointed generator with rejection.

generatorR' :: (Data a, MonadRandomLike m) => Size' -> m a Source

Ceiled rejection sampler with given average size.

generator' :: (Data a, MonadRandomLike m) => Size' -> m a Source

Basic boltzmann sampler with no optimization.

Generators with aliases

Boltzmann samplers can normally be defined only for types a such that:

  • they are instances of Data;
  • the set of types of subterms of values of type a is finite;
  • and all of these types have at least one finite value (i.e., values with finitely many constructors).

Examples of misbehaving types are:

  • a -> b -- Not Data
  • data E a = L a | R (E [a]) -- Contains a, [a], [[a]], [[[a]]], etc.
  • data I = C I -- No finite value

Alias

The Alias type works around these limitations (AliasR for rejection samplers). This existential wrapper around a user-defined function f :: a -> m b makes generic-random view occurences of the type b as a when processing a recursive system of types, possibly stopping some infinite unrolling of type definitions. When a value of type b needs to be generated, it generates an a which is passed to f.

  let
    as = [aliasR $ \() -> return (L []) :: Gen (E [[Int]])]
  in
    generatorSRWith as asGen :: Size -> Gen (E Int)

Another use case is to plug in user-defined generators where the default is not satisfactory, for example, to get positive Ints:

  let
    as = [alias $ \() -> choose (0, 100) :: Gen Int)]
  in
    generatorPWith as asGen :: Size -> Gen [Int]

Fixed size

Other generators

Used in the implementation of the generators above. These also allow to apply pointing more than once.

Suffixes

M
Sized generators are memoized for some sparsely chosen values of sizes. Subsequently supplied sizes are approximated by the closest larger value. This strategy avoids recomputing too many oracles. Aside from singular samplers, all other generators above not marked by ' use this.
_
If the size parameter is Nothing, produces the singular generator (associated with the suffix S); otherwise the generator produces values with average size equal to the given value.

type Points = Int Source

Number of pointing iterations.

generatorM :: (Data a, MonadRandomLike m) => [Alias m] -> Points -> Size' -> m a Source

generatorMR :: (Data a, MonadRandomLike m) => [AliasR m] -> Points -> Size' -> (Size', Size') -> m a Source

generator_ :: (Data a, MonadRandomLike m) => [Alias m] -> Points -> Maybe Size' -> m a Source

Boltzmann sampler without rejection.

generatorR_ :: (Data a, MonadRandomLike m) => [AliasR m] -> Points -> Maybe Size' -> (Size', Size') -> m a Source

Boltzmann sampler with rejection.

Auxiliary definitions

Type classes

class Monad m => MonadRandomLike m where Source

MonadRandomLike m defines basic components to build generators, allowing the implementation to remain abstract over both the Gen type and MonadRandom instances.

For the latter, the wrapper AMonadRandom is provided to avoid overlapping instances.

Minimal complete definition

doubleR, integerR, int, double, char

Methods

incr :: m () Source

Called for every constructor. Counter for ceiled rejection sampling.

doubleR :: Double -> m Double Source

doubleR upperBound: generates values in [0, upperBound].

integerR :: Integer -> m Integer Source

integerR upperBound: generates values in [0, upperBound-1].

int :: m Int Source

Default Int generator.

double :: m Double Source

Default Double generator.

char :: m Char Source

Default Char generator.

Alias

alias :: (Monad m, Data a, Data b) => (a -> m b) -> Alias m Source

Main constructor for Alias.

aliasR :: (Monad m, Data a, Data b) => (a -> m b) -> AliasR m Source

Main constructor for AliasR.

coerceAlias :: Coercible m n => Alias m -> Alias n Source

coerceAlias :: Alias m -> Alias (AMonadRandom m)

coerceAliases :: Coercible m n => [Alias m] -> [Alias n] Source

coerceAliases :: [Alias m] -> [Alias (AMonadRandom m)]

data Alias m where Source

Constructors

Alias :: (Data a, Data b) => !(m a -> m b) -> Alias m 

Instances

Show (Alias m) Source

Dummy instance for debugging.

type AliasR m = Alias (RejectT m) Source