Safe Haskell	None
Language	Haskell2010

Test.DejaFu

Contents

Testing
Testing with different settings
- Memory Models
- Schedule Bounding
Results
Predicates

Description

Deterministic testing for concurrent computations.

As an example, consider this program, which has two locks and a shared variable. Two threads are spawned, which claim the locks, update the shared variable, and release the locks. The main thread waits for them both to terminate, and returns the final result.

example1 :: MonadConc m => m Int
example1 = do
  a <- newEmptyMVar
  b <- newEmptyMVar

  c <- newMVar 0

  let lock m = putMVar m ()
  let unlock = takeMVar

  j1 <- spawn $ lock a >> lock b >> modifyMVar_ c (return . succ) >> unlock b >> unlock a
  j2 <- spawn $ lock b >> lock a >> modifyMVar_ c (return . pred) >> unlock a >> unlock b

  takeMVar j1
  takeMVar j2

  takeMVar c

The correct result is 0, as it starts out as 0 and is incremented and decremented by threads 1 and 2, respectively. However, note the order of acquisition of the locks in the two threads. If thread 2 pre-empts thread 1 between the acquisition of the locks (or if thread 1 pre-empts thread 2), a deadlock situation will arise, as thread 1 will have lock a and be waiting on b, and thread 2 will have b and be waiting on a.

Here is what Deja Fu has to say about it:

> autocheck example1
[fail] Never Deadlocks (checked: 2)
        [deadlock] S0---------S1--P2---S1-
[pass] No Exceptions (checked: 11)
[fail] Consistent Result (checked: 10)
        0 S0---------S1---------------S0--S2---------------S0----
        [deadlock] S0---------S1--P2---S1-
False

It identifies the deadlock, and also the possible results the computation can produce, and displays a simplified trace leading to each failing outcome. It also returns False as there are test failures. The automatic testing functionality is good enough if you only want to check your computation is deterministic, but if you have more specific requirements (or have some expected and tolerated level of nondeterminism), you can write tests yourself using the dejafu* functions.

Warning: If your computation under test does IO, the IO will be executed lots of times! Be sure that it is deterministic enough not to invalidate your test results. Mocking may be useful where possible.

Synopsis

Testing

Testing in Deja Fu is similar to unit testing, the programmer produces a self-contained monadic action to execute under different schedules, and supplies a list of predicates to apply to the list of results produced.

If you simply wish to check that something is deterministic, see the autocheck and autocheckIO functions.

These functions use a Total Store Order (TSO) memory model for unsynchronised actions, see "Testing under Alternative Memory Models" for some explanation of this.

autocheck Source

Arguments

:: (Eq a, Show a)
=> (forall t. ConcST t a)	The computation to test
-> IO Bool

Automatically test a computation. In particular, look for deadlocks, uncaught exceptions, and multiple return values.

This uses the Conc monad for testing, which is an instance of MonadConc. If you need to test something which also uses MonadIO, use autocheckIO.

dejafu Source

Arguments

:: Show a
=> (forall t. ConcST t a)	The computation to test
-> (String, Predicate a)	The predicate (with a name) to check
-> IO Bool

Check a predicate and print the result to stdout, return True if it passes.

dejafus Source

Arguments

:: Show a
=> (forall t. ConcST t a)	The computation to test
-> [(String, Predicate a)]	The list of predicates (with names) to check
-> IO Bool

Variant of dejafu which takes a collection of predicates to test, returning True if all pass.

autocheckIO :: (Eq a, Show a) => ConcIO a -> IO Bool Source

Variant of autocheck for computations which do IO.

dejafuIO :: Show a => ConcIO a -> (String, Predicate a) -> IO Bool Source

Variant of dejafu for computations which do IO.

dejafusIO :: Show a => ConcIO a -> [(String, Predicate a)] -> IO Bool Source

Variant of dejafus for computations which do IO.

Testing with different settings

autocheck' Source

Arguments

:: (Eq a, Show a)
=> MemType	The memory model to use for non-synchronised `CRef` operations.
-> (forall t. ConcST t a)	The computation to test
-> IO Bool

Variant of autocheck which tests a computation under a given memory model.

autocheckIO' :: (Eq a, Show a) => MemType -> ConcIO a -> IO Bool Source

Variant of autocheck' for computations which do IO.

dejafu' Source

Arguments

:: Show a
=> MemType	The memory model to use for non-synchronised `CRef` operations.
-> Bounds	The schedule bounds
-> (forall t. ConcST t a)	The computation to test
-> (String, Predicate a)	The predicate (with a name) to check
-> IO Bool

Variant of dejafu' which takes a memory model and schedule bounds.

Schedule bounding is used to filter the large number of possible schedules, and can be iteratively increased for further coverage guarantees. Empirical studies (/Concurrency Testing Using Schedule Bounding: an Empirical Study/, P. Thompson, A. Donaldson, and A. Betts) have found that many concurrency bugs can be exhibited with as few as two threads and two pre-emptions, which is part of what dejafus uses.

Warning: Using largers bounds will almost certainly significantly increase the time taken to test!

dejafus' Source

Arguments

:: Show a
=> MemType	The memory model to use for non-synchronised `CRef` operations.
-> Bounds	The schedule bounds.
-> (forall t. ConcST t a)	The computation to test
-> [(String, Predicate a)]	The list of predicates (with names) to check
-> IO Bool

Variant of dejafus which takes a memory model and schedule bounds.

dejafuIO' :: Show a => MemType -> Bounds -> ConcIO a -> (String, Predicate a) -> IO Bool Source

Variant of dejafu' for computations which do IO.

dejafusIO' :: Show a => MemType -> Bounds -> ConcIO a -> [(String, Predicate a)] -> IO Bool Source

Variant of dejafus' for computations which do IO.

Memory Models

Threads running under modern multicore processors do not behave as a simple interleaving of the individual thread actions. Processors do all sorts of complex things to increase speed, such as buffering writes. For concurrent programs which make use of non-synchronised functions (such as readCRef coupled with writeCRef) different memory models may yield different results.

As an example, consider this program (modified from the Data.IORef documentation). Two CRefs are created, and two threads spawned to write to and read from both. Each thread returns the value it observes.

example2 :: MonadConc m => m (Bool, Bool)
example2 = do
  r1 <- newCRef False
  r2 <- newCRef False

  x <- spawn $ writeCRef r1 True >> readCRef r2
  y <- spawn $ writeCRef r2 True >> readCRef r1

  (,) <$> readMVar x <*> readMVar y

Under a sequentially consistent memory model the possible results are (True, True), (True, False), and (False, True). Under total or partial store order, (False, False) is also a possible result, even though there is no interleaving of the threads which can lead to this.

We can see this by testing with different memory models:

> autocheck' SequentialConsistency example2
[pass] Never Deadlocks (checked: 6)
[pass] No Exceptions (checked: 6)
[fail] Consistent Result (checked: 5)
        (False,True) S0-------S1-----S0--S2-----S0---
        (True,False) S0-------S1-P2-----S1----S0----
        (True,True) S0-------S1--P2-----S1---S0----
        (False,True) S0-------S1---P2-----S1--S0----
        (True,False) S0-------S2-----S1-----S0----
        ...
False

> autocheck' TotalStoreOrder example2
[pass] Never Deadlocks (checked: 303)
[pass] No Exceptions (checked: 303)
[fail] Consistent Result (checked: 302)
        (False,True) S0-------S1-----C-S0--S2-----C-S0---
        (True,False) S0-------S1-P2-----C-S1----S0----
        (True,True) S0-------S1-P2--C-S1----C-S0--S2---S0---
        (False,True) S0-------S1-P2--P1--C-C-S1--S0--S2---S0---
        (False,False) S0-------S1-P2--P1----S2---C-C-S0----
        ...
False

Traces for non-sequentially-consistent memory models show where writes to CRefs are committed, which makes a write visible to all threads rather than just the one which performed the write. Only writeCRef is broken up into separate write and commit steps, atomicModifyCRef is still atomic and imposes a memory barrier.

data MemType Source

The memory model to use for non-synchronised CRef operations.

Constructors

SequentialConsistency	The most intuitive model: a program behaves as a simple interleaving of the actions in different threads. When a `CRef` is written to, that write is immediately visible to all threads.
TotalStoreOrder	Each thread has a write buffer. A thread sees its writes immediately, but other threads will only see writes when they are committed, which may happen later. Writes are committed in the same order that they are created.
PartialStoreOrder	Each `CRef` has a write buffer. A thread sees its writes immediately, but other threads will only see writes when they are committed, which may happen later. Writes to different `CRef`s are not necessarily committed in the same order that they are created.

Instances

Bounded MemType Source
Enum MemType Source
Eq MemType Source
Ord MemType Source
Read MemType Source
Show MemType Source
NFData MemType Source

defaultMemType :: MemType Source

The default memory model: TotalStoreOrder

Schedule Bounding

Schedule bounding is an optimisation which only considers schedules within some bound. This sacrifices completeness outside of the bound, but can drastically reduce the number of schedules to test, and is in fact necessary for non-terminating programs.

The standard testing mechanism uses a combination of pre-emption bounding, fair bounding, and length bounding. Pre-emption + fair bounding is useful for programs which use loop/yield control flows but are otherwise terminating. Length bounding makes it possible to test potentially non-terminating programs.

data Bounds Source

Constructors

Bounds
Fields boundPreemp :: Maybe PreemptionBound boundFair :: Maybe FairBound boundLength :: Maybe LengthBound

defaultBounds :: Bounds Source

All bounds enabled, using their default values.

noBounds :: Bounds Source

No bounds enabled. This forces the scheduler to just use partial-order reduction and sleep sets to prune the search space. This will ONLY work if your computation always terminated!

newtype PreemptionBound :: *

Constructors

PreemptionBound Int

Instances

Enum PreemptionBound
Eq PreemptionBound
Integral PreemptionBound
Num PreemptionBound
Ord PreemptionBound
Read PreemptionBound
Real PreemptionBound
Show PreemptionBound
NFData PreemptionBound

defaultPreemptionBound :: PreemptionBound

A sensible default preemption bound: 2.

See Concurrency Testing Using Schedule Bounding: an Empirical Study, P. Thomson, A. F. Donaldson, A. Betts for justification.

newtype FairBound :: *

Constructors

FairBound Int

Instances

Enum FairBound
Eq FairBound
Integral FairBound
Num FairBound
Ord FairBound
Read FairBound
Real FairBound
Show FairBound
NFData FairBound

defaultFairBound :: FairBound

A sensible default fair bound: 5.

This comes from playing around myself, but there is probably a better default.

newtype LengthBound :: *

Constructors

LengthBound Int

Instances

Enum LengthBound
Eq LengthBound
Integral LengthBound
Num LengthBound
Ord LengthBound
Read LengthBound
Real LengthBound
Show LengthBound
NFData LengthBound

defaultLengthBound :: LengthBound

A sensible default length bound: 250.

Based on the assumption that anything which executes for much longer (or even this long) will take ages to test.

Results

The results of a test can be pretty-printed to the console, as with the above functions, or used in their original, much richer, form for debugging purposes. These functions provide full access to this data type which, most usefully, contains a detailed trace of execution, showing what each thread did at each point.

data Result a Source

The results of a test, including the number of cases checked to determine the final boolean outcome.

Constructors

Result
Fields _pass :: Bool Whether the test passed or not. _casesChecked :: Int The number of cases checked. _failures :: [(Either Failure a, Trace ThreadId ThreadAction Lookahead)] The failing cases, if any. _failureMsg :: String A message to display on failure, if nonempty

Instances

Functor Result Source
Foldable Result Source
Show a => Show (Result a) Source
NFData a => NFData (Result a) Source

data Failure Source

An indication of how a concurrent computation failed.

Constructors

InternalError	Will be raised if the scheduler does something bad. This should never arise unless you write your own, faulty, scheduler! If it does, please file a bug report.
Abort	The scheduler chose to abort execution. This will be produced if, for example, all possible decisions exceed the specified bounds (there have been too many pre-emptions, the computation has executed for too long, or there have been too many yields).
Deadlock	The computation became blocked indefinitely on `MVar`s.
STMDeadlock	The computation became blocked indefinitely on `TVar`s.
UncaughtException	An uncaught exception bubbled to the top of the computation.

Instances

Bounded Failure Source
Enum Failure Source
Eq Failure Source
Ord Failure Source
Read Failure Source
Show Failure Source
NFData Failure Source

runTest Source

Arguments

:: Predicate a	The predicate to check
-> (forall t. ConcST t a)	The computation to test
-> Result a

Run a predicate over all executions within the default schedule bounds.

runTest' Source

Arguments

:: MemType	The memory model to use for non-synchronised `CRef` operations.
-> Bounds	The schedule bounds.
-> Predicate a	The predicate to check
-> (forall t. ConcST t a)	The computation to test
-> Result a

Variant of runTest which takes a memory model and schedule bounds.

runTestIO :: Predicate a -> ConcIO a -> IO (Result a) Source

Variant of runTest for computations which do IO.

runTestIO' :: MemType -> Bounds -> Predicate a -> ConcIO a -> IO (Result a) Source

Variant of runTest' for computations which do IO.

Predicates

Predicates evaluate a list of results of execution and decide whether some test case has passed or failed. They can be lazy and make use of short-circuit evaluation to avoid needing to examine the entire list of results, and can check any property which can be defined for the return type of your monadic action.

A collection of common predicates are provided, along with the helper functions alwaysTrue, alwaysTrue2 and somewhereTrue to lfit predicates over a single result to over a collection of results.

type Predicate a = [(Either Failure a, Trace ThreadId ThreadAction Lookahead)] -> Result a Source

A Predicate is a function which collapses a list of results into a Result.

representative :: Eq a => Predicate a -> Predicate a Source

Reduce the list of failures in a Predicate to one representative trace for each unique result.

This may throw away "duplicate" failures which have a unique cause but happen to manifest in the same way. However, it is convenient for filtering out true duplicates.

abortsNever :: Predicate a Source

Check that a computation never aborts.

abortsAlways :: Predicate a Source

Check that a computation always aborts.

abortsSometimes :: Predicate a Source

Check that a computation aborts at least once.

deadlocksNever :: Predicate a Source

Check that a computation never deadlocks.

deadlocksAlways :: Predicate a Source

Check that a computation always deadlocks.

deadlocksSometimes :: Predicate a Source

Check that a computation deadlocks at least once.

exceptionsNever :: Predicate a Source

Check that a computation never fails with an uncaught exception.

exceptionsAlways :: Predicate a Source

Check that a computation always fails with an uncaught exception.

exceptionsSometimes :: Predicate a Source

Check that a computation fails with an uncaught exception at least once.

alwaysSame :: Eq a => Predicate a Source

Check that the result of a computation is always the same. In particular this means either: (a) it always fails in the same way, or (b) it never fails and the values returned are all equal.

notAlwaysSame :: Eq a => Predicate a Source

Check that the result of a computation is not always the same.

alwaysTrue :: (Either Failure a -> Bool) -> Predicate a Source

Check that the result of a unary boolean predicate is always true.

alwaysTrue2 :: (Either Failure a -> Either Failure a -> Bool) -> Predicate a Source

Check that the result of a binary boolean predicate is true between all pairs of results. Only properties which are transitive and symmetric should be used here.

If the predicate fails, both (result,trace) tuples will be added to the failures list.

somewhereTrue :: (Either Failure a -> Bool) -> Predicate a Source

Check that the result of a unary boolean predicate is true at least once.

gives :: (Eq a, Show a) => [Either Failure a] -> Predicate a Source

Predicate for when there is a known set of results where every result must be exhibited at least once.

gives' :: (Eq a, Show a) => [a] -> Predicate a Source

Variant of gives that doesn't allow for expected failures.