--- layout: post title: "Refinements 101 (contd.)" date: 2013-01-27 16:12 author: Ranjit Jhala published: true comments: true external-url: categories: basic demo: refinements101reax.hs --- Hopefully, the [previous][ref101] article gave you a basic idea about what refinement types look like. Several folks had interesting questions, that are worth discussing in a separate post, since they throw a lot of light on the strengths (or weaknesses, depending on your point of view!) of LiquidHaskell. \begin{code} module Refinements101Reax where \end{code} How to relate outputs and inputs -------------------------------- Recall the function `divide` \begin{code} {-@ divide :: Int -> {v: Int | v /= 0 } -> Int @-} divide :: Int -> Int -> Int divide n 0 = error "divide by zero" divide n d = n `div` d \end{code} and `abz` was the absolute value function \begin{code} abz :: Int -> Int abz n | 0 < n = n | otherwise = 0 - n \end{code} [nanothief](http://www.reddit.com/user/nanothief) [remarked][qreddit101] that LiquidHaskell was unable to verify the safety of the following call to `divide` (i.e. was unable to show that `x` was non-zero at the callsite). \begin{code} {-@ f :: Int -> Int @-} f x | abz x == 0 = 3 | otherwise = 3 `divide` x \end{code} Nanothief correctly argues that the code is clearly safe as *"`abz x == 0` being false implies `x /= 0`"*. Indeed, the code *is safe*, however, the reason that LiquidHaskell rejected it has nothing to do with its inability to *"track the constraints of values based on tests using new values derived from that value"* as Nanothief surmised, but instead, because LiquidHaskell supports **modular verification** where the *only* thing known about `abz` at a *use site* is whatever is specified in its *type*. \begin{code}Concretely speaking, the type abz :: Int -> {v: Int | 0 <= v } \end{code} is too anemic to verify `f` above, as it tells us nothing about the *relationship* between the output and input -- looking at it, we have now way of telling that when the *output* (of `abz`) is non-zero, the *input* must also have been non-zero. \begin{code}Instead, we can write a *stronger* type which does capture this information, for example abz :: x:Int -> {v:Int | v = (if (x > 0) then x else (0 - x))} \end{code} \begin{code} where v = (if p then e1 else e2) \end{code} \begin{code} is an abbreviation for the formula (p => v == e1) && ((not p) => v = e2) \end{code} With this specification for `abz`, LiquidHaskell is able to reason that when `abz x` is non-zero, `x` is also non-zero. Of course, `abz` is trivial enough that we can very precisely capture its *exact* semantics in the refinement type, but thats is rarely the case. Nevertheless, even here, you could write a somewhat *weaker* specification, that still had enough juice to allow the verification of the `divide` call in `f`. In particular, we might write \begin{code} {-@ abz :: x:Int -> {v:Int | ((0 <= v) && ((v = 0) <=> (x = 0))) } @-} \end{code} which states the output is `0` *if and only if* the input is `0`. LiquidHaskell will happily verify that `abz` implements this specification, and will use it to verify the safety of `f` above. (BTW, follow the link above to *demo this code* yourself.) How to tell a Fib ----------------- [Chris Done](https://twitter.com/christopherdone) [asked][qblog101] why LiquidHaskell refused to verify the following definition of `fib`. \begin{code} {-@ fib :: n:Int -> { b:Int | (n >= 0 && b >= n) } @-} fib :: Int -> Int fib 0 = 0 fib 1 = 1 fib n = fib (n-1) + fib (n-2) \end{code} Indeed, the both the *specification* and the *implementation* look pretty legit, so what gives? It turns out that there are *two* different reasons why. **Reason 1: Assumptions vs. Guarantees** What we really want to say over here is that the *input* `n` is non-negative. However, putting the refinement `n >= 0` in the *output* constraint means that it becomes something that LiquidHaskell checks that the function `fib` **guarantees** (or **ensures**). That is, the type states that we can pass `fib` *any* value `n` (including *negative* values) and yet, `fib` must return values `b` such that `b >= n` *and* `n >= 0`. The latter requirement is a rather tall order when an arbitrary `n` is passed in as input. `fib` can make no such guarantees since it was *given* the value `n` as a parameter. The only way `n` could be non-negative was that if the caller had sent in a non-negative value. Thus, we want to put the *burden of proof* on the right entity here, namely the caller. To assign the burden of proof appropriately, we place the non-negative refinement on the *input type* \begin{code} {-@ fib' :: n:{v:Int | v >= 0} -> {b:Int | (n >= 0 && b >= n) } @-} fib' :: Int -> Int fib' 0 = 0 fib' 1 = 1 fib' n = fib (n-1) + fib (n-2) \end{code} where now at *calls to* `fib'` LiquidHaskell will check that the argument is non-negative, and furthermore, when checking `fib'` LiquidHaskell will **assume** that the parameter `n` is indeed non-negative. So now the constraint `n >= 0` on the output is somewhat redundant, and the non-negative `n` guarantee holds trivially. **Reason 2: The Specification is a Fib** If you run the above in the demo, you will see that LiquidHaskell still doth protest loudly, and frankly, one might start getting a little frustrated at the stubbornness and petulance of the checker. \begin{code} However, if you stare at the implementation, you will see that it in fact, *does not* meet the specification, as fib' 2 == fib' 1 + fib' 0 == 0 + 1 == 1 \end{code} LiquidHaskell is reluctant to prove things that are false. Rather than anthropomorphize frivolously, lets see why it is unhappy. \begin{code}First, recall the somewhat simplified specification fib' :: n:Int -> { b:Int | (b >= n) } \end{code} As we saw in the discussion about `abz`, at each recursive callsite the *only information* LiquidHaskell uses about the returned value, is that described in the *output type* for that function call. \begin{code}Thus, LiquidHaskell reasons that the expression: fib' (n-1) + fib' (n-2) \end{code} \begin{code}has the type {b: Int | exists b1, b2. b == b1 + b2 && b1 >= n-1 && b2 >= n-2 } \end{code} where the `b1` and `b2` denote the values returned by the recursive calls --- we get the above by plugging the parameters `n-1` and `n-2` in for the parameter `n` in output type for `fib'`. \begin{code}The SMT solver simplifies the above to {b: Int | b >= 2n - 3} \end{code} \begin{code} Finally, to check the output guarantee is met, LiquidHaskell asks the SMT solver to prove that (b >= 2n - 2) => (b >= n) \end{code} The SMT solver will refuse, of course, since the above implication is *not valid* (e.g. when `n` is `2`) Thus, via SMT, LiquidHaskell proclaims that the function `fib'` does not implement the advertised type and hence marks it *unsafe*. Fixing The Code --------------- How then, do we get Chris' specification to work out? It seems like it *should* hold (except for that pesky case where `n=2`. Indeed, let's rig the code, so that all the base cases return `1`. \begin{code} {-@ fibOK :: n:Int -> {b:Int | ((b >= n) && (b >= 1))} @-} fibOK :: Int -> Int fibOK 0 = 1 fibOK 1 = 1 fibOK n = fibOK (n-1) + fibOK (n-2) \end{code} Here' we specify that not only is the output greater than the input, it is **also** greater than `1`. \begin{code} Now in the recursive case, LiquidHaskell reasons that the value being output is {b: Int | exists b1, b2. b == b1 + b2 && b1 >= n-1 && b1 >= 1 && b2 >= n-2 && b2 >= 1 } \end{code} \begin{code}which reduces to {b: Int | b = 2n - 3 && n >= 2 } \end{code} \begin{code}which, the SMT solver is happy to verify, is indeed a subtype of (i.e. implies the refinement of) the specified output {b: Int | b >= n && b >= 1 } \end{code} Conclusion ---------- There are several things to take away. 1. We need to distinguish between *assumptions* and *guarantees* when writing specifications for functions. 2. For *modularity*, LiquidHaskell, like every type system, uses only the (refinement) *type* of each function at each use site, and not the function's *body*. 3. Some seemingly intuitive specifications often aren't; in future work it would be useful to actually [generate][mlton] [tests][concolic] as [counterexamples][icse04] that illustrate when a specification [fails][dsd]. [qblog101]: /blog/2013/01/01/refinement-types-101.lhs/#comment-772807850 [qreddit101]: http://www.reddit.com/r/haskell/comments/16w3hp/liquidhaskell_refinement_types_in_haskell_via_smt/c809160 [ref101]: /blog/2013/01/01/refinement-types-101.lhs/ [concolic]: http://en.wikipedia.org/wiki/Concolic_testing [icse04]: http://goto.ucsd.edu/~rjhala/papers/generating_tests_from_counterexamples.html [dsd]: http://dl.acm.org/citation.cfm?doid=1348250.1348254 [mlton]: http://www.cs.purdue.edu/homes/zhu103/pubs/vmcai13.pdf