crucible-llvm-0.6: Support for translating and executing LLVM code in Crucible
Copyright(c) Galois Inc 2022
LicenseBSD3
Stabilityprovisional
Safe HaskellSafe-Inferred
LanguageHaskell2010

Lang.Crucible.LLVM.SimpleLoopInvariant

Description

This module provides an execution feature that can be installed into the Crucible simulator which facilitates reasoning about certain kinds of loops by using loop invariants instead of requiring that loops symbolically terminate. In order for this feature to work, the loop in question needs to be single-entry/single-exit, and needs to have a constant memory footprint on each loop iteration (except that memory regions backed by SMT arrays are treated as a whole, so loops can write into different regions of an SMT-array memory region on different iterations). In addition, loop-involved memory writes must be sufficiently concrete that we can determine their region values, and writes to the same region value must have concrete distances from each other, so we can determine if/when they alias.

To set up a loop invariant for a loop, you must specify which CFG the loop is in, indicate which loop (of potentially several) in the CFG is the one of interest, and give a function that is used to construct the statement of the loop invariant. When given a CFG, the execution feature computes a weak topological ordering to find the loops in the program; the number given by the user selects which of these to install the invariant for.

At runtime, we will interrupt execution when the loop head is reached; at this point we will record the values of the memory and the incoming local variables. Then, we will begin a series of "hypothetical" executions of the loop body and track how the memory and local variables are modified by the loop body. On each iteration where we find a difference, we replace the local or memory region with a fresh "join variable" which represents the unknown value of a loop-carried dependency. We continue this process until we reach a fixpoint; then we will have captured all the locations that are potentially of interest for the loop invariant.

Once we have found all the loop-carried dependencies, we assert that the loop invariant holds on the initial values upon entry to the loop. Then, we set up another execution starting from the loop head where we first assume the loop invariant over the join variables invented earlier, and begin execution again. In this mode, when we reach the loop head once more, we assert the loop invariant on the computed values and abort execution along that path. Paths exiting the loop continue as normal.

Provided the user suppiles an appropriate loop invarant function and can discharge all the generated proof obligations, this procedure should result in a sound proof of partial correctness for the function in question.

This whole procedure has some relatively fragile elements that are worth calling out. First, specifying which loop you want to reason about may require some trial-and-error, the WTO ordering might not directly correspond to what is seen in the source code. The most reliable way to select the right loop is to ensure there is only one loop of interest in a given function, and use loop index 0. The other fragility has to do with the discovery of loop-carried dependencies. The number and order of values that are supplied to the loop invariant depend on the internal details of the compiler and simulator, so the user may have to spend some time and effort to discover what the values appearing in the invariant correspond to. This process may well be quite sensitive to changes in the source code.

Limitiations: currently, this feature is restricted to 64-bit code.

Synopsis

Documentation

data InvariantEntry sym tp Source #

When live loop-carried dependencies are discovered as we traverse a loop body, new "widening" variables are introduced to stand in for those locations. When we introduce such a varible, we capture what value the variable had when we entered the loop (the "header" value); this is essentially the initial value of the variable. We also compute what value the variable should take on its next iteration assuming the loop doesn't exit and executes along its backedge. This "body" value will be computed in terms of the the set of all discovered live variables so far. We know we have reached fixpoint when we don't need to introduce any more fresh widening variables, and the body values for each variable are stable across iterations.

Constructors

InvariantEntry 

Fields

Instances

Instances details
OrdF (InvariantEntry sym) => TestEquality (InvariantEntry sym :: BaseType -> Type) Source # 
Instance details

Defined in Lang.Crucible.LLVM.SimpleLoopInvariant

Methods

testEquality :: forall (a :: k) (b :: k). InvariantEntry sym a -> InvariantEntry sym b -> Maybe (a :~: b) #

OrdF (SymExpr sym) => OrdF (InvariantEntry sym :: BaseType -> Type) Source # 
Instance details

Defined in Lang.Crucible.LLVM.SimpleLoopInvariant

Methods

compareF :: forall (x :: k) (y :: k). InvariantEntry sym x -> InvariantEntry sym y -> OrderingF x y #

leqF :: forall (x :: k) (y :: k). InvariantEntry sym x -> InvariantEntry sym y -> Bool #

ltF :: forall (x :: k) (y :: k). InvariantEntry sym x -> InvariantEntry sym y -> Bool #

geqF :: forall (x :: k) (y :: k). InvariantEntry sym x -> InvariantEntry sym y -> Bool #

gtF :: forall (x :: k) (y :: k). InvariantEntry sym x -> InvariantEntry sym y -> Bool #

simpleLoopInvariant Source #

Arguments

:: forall sym ext p rtp blocks init ret. (IsSymInterface sym, IsSyntaxExtension ext, HasLLVMAnn sym, ?memOpts :: MemOptions) 
=> sym 
-> Integer

which of the discovered loop heads to install a loop invariant onto

-> CFG ext blocks init ret

The function we want to verify

-> GlobalVar Mem

global variable representing memory

-> (InvariantPhase -> [Some (SymExpr sym)] -> MapF (SymExpr sym) (InvariantEntry sym) -> IO (Pred sym)) 
-> IO (ExecutionFeature p sym ext rtp) 

This execution feature is designed to allow a limited form of verification for programs with unbounded looping structures.

It is currently highly experimental and has many limititations. Most notably, it only really works properly for functions consisting of a single, non-nested loop with a single exit point. Moreover, the loop must have an indexing variable that counts up from a starting point by a fixed stride amount.

Currently, these assumptions about the loop structure are not checked.

The basic use case here is for verifying functions that loop through an array of data of symbolic length. This is done by providing a "fixpoint function" which describes how the live values in the loop at an arbitrary iteration are used to compute the final values of those variables before execution leaves the loop. The number and order of these variables depends on internal details of the representation, so is relatively fragile.