This project defines a concrete syntax for a certain subset of the
registerized Crucible CFGs.
Some features are intentionally omitted, because they require
compile-time additions to Crucible in the form of type class
instances. In particular, there is no syntax for:
* Recursive types
* Extensions
* Concrete types
How to use
General syntax
The basic syntax is based on a simplified variant of Lisp
S-expressions, without support for dotted pairs or special syntax for
quote or quasiquote. A syntactic form is either an atom or matching
opening and closing parentheses with a whitespace-delimited sequence
of syntactic forms between them.
The atoms are as follows:
* Identifiers are either keywords or Crucible atom names. Every
identifier that is not a language keyword is a Crucible atom
name. Identifiers consist of a letter-like character followed by
zero or more digits or letter-like characters. Letter-like
characters are those considered letters by Unicode, or any of the
characters <, >, =, +, -, *, /, !, _, \, or ?.
The keywords are documented below, under each special form.
* Function names consist of an @ character followed by an identifier.
* Register names consist of a $ character followed by an identifier.
* Numbers consist of an optional '+' or '-' followed by an unsigned
number and an optional denominator. Unsigned numbers are either
decimal literals, octal literals, or hexadecimal literals, using
the typical syntax with a 0-prefix. A denominator is a '/'
character followed by an unsigned number.
* Boolean literals are #t or #T and #f or #F.
* String literals are delimited by double-quotes, and support
escaping with \.
Line comments are preceded by ;, and block comments are delimited by
#| and |#.
Functions
A program consists of a sequence of function definitions. A function
definition is a form that begins with the keyword "defun", followed by
a function name, argument list, return type, and body. A function name
is a function name atom. An argument list is a form that contains zero
or more argument specs. An argument spec is a two-element form, where
the first is a Crucible atom name, and the second is a form that
denotes a type. A return type is a form that denotes a type.
A function body consists of an optional list of registers, a start
block, and zero or more further blocks. A list of registers is a form
that begins with the keyword "registers" and is followed by zero or
more register specifications. A register specification is a form
containing an atom name and a type.
Blocks consist of the defblock keyword followed by a block body. Block
bodies are zero or more ordinary statements followed by a terminating
statement. The first block must begin with "start" instead of
"defblock". In the future, the restriction that the start block comes
first may be relaxed.
Forward declarations
A forward declaration is a form that begins with the keyword "declare",
followed by a function name, argument list, and return type. A forward
declaration is like a function but without the function body. Forward
declarations are useful in situations where you do not know the definition
of a function ahead of time, but you will know it at some point after parsing
the program. It is the responsibility of the client to ensure that forward
declarations are resolved to Crucible definitions before being invoked.
Global variables
A global variable is a form that begins with the keyword "defglobal", followed
by an identifier prefixed with two dollar signs (e.g., $$global-name) as well
as a type. A global variable is a mutable reference that scopes over all of the
functions defined in the program. The value of a global variable can be set
with the "set-global!" form.
A program can reference a global variable defined externally by using an extern
declaration. An extern declaration is exactly like a "defglobal" declaration,
but using the "extern" keyword instead of "defglobal". The difference between
an extern and a normal global variable is that the value of an extern may
already have been set by the time that the .cbl file which declares the extern
uses it. It is the responsibility of the client to ensure that externs are
inserted into the Crucible symbolic global state before being accessed.
Types
si ::= 'Unicode' | 'Char16' | 'Char8'
fi ::= 'Half' | 'Float' | 'Double' | 'Quad'
| 'X86_80' | 'DoubleDouble'
t ::= 'Any' | 'Unit' | 'Bool' | 'Nat' | 'Integer' | 'Real'
| 'ComplexReal' | 'Char' | '(' 'String' si ')'
| '(' 'FP' fi ')' | '(' 'BitVector' n ')'
| '(' '->' t ... t ')' | '(' 'Maybe' t ')'
| '(' 'Sequence' t ')' | '(' 'Vector' t ')' | '(' 'Ref' t ')'
| '(' 'Struct' t ... t ')' | '(' 'Variant' t ... t ')'
Expressions
Registers
Blocks
Statements