# safe-coupling Library for relational verification of probabilistic algorithms. Supports two proving methods: - Upper bound _Kantorovich distance_ between two distributions - Establish a _boolean relation_ on samples from two distributions (this is stronger) Includes two larger examples of verification: - Stability of stochastic gradient descent (src/SGD) using Kantorovich distance - Convergence of temporal difference learning (src/TD0) using boolean relations ## A smaller example (src/Bins/Bins.hs) This function recursively counts how many times the ball hit the bin after n attempted throws: bins :: Double -> Nat -> PrM Nat bins _ 0 = ppure 0 bins p n = liftA2 (+) (bernoulli p) (bins p (n - 1)) Throws succeed with probability _p_ which is simulated by `bernoulli p`. The function returns a distribution over natural numbers. When comparing results of two throwers with respective chances of success _p_ and _q > p_, we expect the second thrower to score notably better with the increase of _n_. Formally, we can show that Kantorovich distance between `bins p n` and `bins q n` is upper bounded by _(q - p)·n_. ## Proof (src/Bins/Theorem.hs) The proof uses four definitions from the library: * In the first case, no throws were made. Axiom `pureDist` allows deriving Kantorovich distance between pure expressions. In our case, _0_ and _0_. * In the second case, axiom `liftA2Dist` derives Kantorovich distance between the inductive cases. Numeric arguments specify the expected bound in format _a·x + b·y + c_ where _x_ and _y_ are bounds for the second and third arguments of `liftA2` respectively. As the last argument, the axiom requires proof of linearity of plus. It is empty since it can be automatically constructed by an SMT-solver. * Axiom `bernoulliDist` upper bounds the distance between calls to `bernoulli` with _q - p_ — this is our _x_. The second upper bound _y_ is provided by a recursive call to our theorem. * A function `distInt` is used to measure the distance between arguments of `liftA2`. In this case, all of them provide integer values. A pre-defined distance between _n_ and _m_ is _|n - m|_ but this allows customization. ``` {-@ binsDist :: p:Prob -> {q:Prob|p <= q} -> n:Nat -> {dist (kant distInt) (bins p n) (bins q n) <= n * (q - p)} / [n] @-} binsDist :: Prob -> Prob -> Nat -> () binsDist p q 0 = pureDist distInt 0 0 binsDist p q n = liftA2Dist d d d 1 (q - p) 1 ((n - 1) * (q - p)) 0 (+) (bernoulli p) (bins p (n - 1)) (+) (bernoulli q) (bins q (n - 1)) (bernoulliDist d p q) (binsDist p q (n - 1)) (\_ _ _ _ -> ()) where d = distInt ``` This concludes the mechanized proof of the boundary _(q-p)·n_. ## Installation 1. Install stack https://docs.haskellstack.org/en/stable/install_and_upgrade/ 2. Compile the library and case studies $ cd safe-coupling $ stack install --fast ... Registering library for safe-coupling-0.1.0.0.. 3. Run unit tests on executable case studies $ stack test ... test/Spec.hs Spec Bins mockbins 1 it: OK mockbins 2 it: OK bins 1 it: OK bins 2 it: OK (0.02s) exp dist mockbins: OK (0.12s) SGD sgd: OK TD0 td0 base: OK td0 simple: OK All 8 tests passed (0.12s) safe-coupling> Test suite safe-coupling-test passed Completed 2 action(s). In case of errors try $ stack clean