The Cryptographic Protocol Shapes Analyzer (CPSA) attempts to enumerate all essentially different executions possible for a cryptographic protocol. We call them the shapes of the protocol. Naturally occurring protocols have only finitely many, indeed very few shapes. Authentication and secrecy properties are easy to determine from them, as are attacks and anomalies.
For each input problem, the CPSA program is given some initial behavior, and it discovers what shapes are compatible with it. Normally, the initial behavior is from the point of view of one participant. The analysis reveals what the other participants must have done, given the participant's view. The search is based on a high-level algorithm that was claimed to be complete, i.e. every shape can in fact be found in a finite number of steps. Further theoretical work showed classes of executions that are not found by the algorithm, however it also showed that every omitted execution requires an unnatural interpretation of a protocol's roles. Hence the algorithm is complete relative to natural role semantics.
The analyzer is designed to work well with other tools. S-expressions are
used for both input and output. The analyzer reads all the problems in its
input, writes out the solution to each problem, and then exits. This release
contains seven tools. The cpsagraph
program
provides a visualization of answers using Scalable Vector Graphics (SVG). The cpsashapes
program removes intermediate results
from analyzer runs making the shapes easy to identify.
The cpsadiff
program compares CPSA output
files S-expression by S-expression, and prints the first skeleton
that differs.
The cpsalogic
program logical
formula that can be used to ensure security goals are achieved.
The cpsaannotations
program uses protocol
annotations to annotate shapes and generate protocol soundness obligations. The
cpsaparameters
program detects some
specification errors by performing a data flow analysis on protocol roles. The
cpsapp
program pretty prints its input using a
CPSA specific algorithm.
The input syntax is essentially the same as the output syntax. A Lisp aware editor will pretty print input, and the output is pretty printed. We use Emacs' Scheme mode to prepare input. This document describes the syntax, but provides little assistance for its interpretation. See the CPSA Tutorial document for this information.
The typical pattern of usage is to enter the set of problems to be analyzed
into a file, in this example, prob.scm
, analyze the problems, and if
something interesting is produced, visualize the answers.
$ cpsa +RTS -M512m -RTS -o prob.txt prob.scm $ cpsagraph -o prob.xhtml prob.txt $ firefox -remote "openFile(`pwd`/prob.xhtml)"
Often a summary of the analysis is more enlightening.
$ cpsashapes -o prob_shapes.txt prob.txt $ cpsagraph -o prob_shapes.xhtml prob_shapes.txt $ firefox -remote "openFile(`pwd`/prob_shapes.xhtml)"
The distribution comes with the
file cpsa.mk
for inclusion into your
makefile. A sample makefile follows. If you cut-and-paste from a
browser window, be sure to convert the leading spaces in the last line
into a tab character.
CPSAFLAGS = +RTS -M512m -RTS SRCS := $(wildcard *.scm) $(wildcard *.lsp) include cpsa.mk all: $(SRCS:%.scm=%_shapes.xhtml) $(SRCS:%.scm=%.xhtml) \ $(SRCS:%.lsp=%_shapes.xhtml) $(SRCS:%.lsp=%.xhtml) clean: -rm *.txt *.xhtml
The CPSA program
is Emacs friendly.
If you run the above makefile
via "M-x
compile", the results will be displayed in a buffer in
Compilation Mode. The command "C-x `" will visit
the locus of the next error message or match (next-error
)
in your CPSA input file.
For platforms without GNU make
, the Haskell program Make.hs
can be loaded into a Haskell interpreter
and perform a similar function. Users are expected to copy the makefile or the
Haskell program into the directory containing a set of analysis problems, and
then modify the file so it specifies the command line flags appropriate for the
problems in the directory.
The syntax is extensible. Association lists in which the key is always a
symbol are allowed at various points in the grammar. Key-value pairs with
unrecognized keys are ignored, and are available for use by other tools. On
output, unrecognized key-value pairs are preserved when printing protocols, but
elided when printing skeletons, with the exception of the comment
key.
The input is an option herald form followed by a sequence of protocol definitions and problem statements. A herald form allows options normally specified on the command line to be specified within an input file. A problem statement describes the initial behavior as a skeleton. The grammar for a protocol follows.
PROTOCOL ::= (defprotocol ID ALG ROLE+ PROT-ALIST) ID ::= SYMBOL ALG ::= SYMBOL ROLE ::= (defrole ID VARS TRACE ROLE-ALIST) VARS ::= (vars DECL*) DECL ::= (ID+ SORT) TRACE ::= (trace EVENT+) EVENT ::= (send TERM) | (recv TERM) ROLE-ALIST ::= (non-orig POS-TERM*) ROLE-ALIST | (pen-non-orig POS-TERM*) ROLE-ALIST | (uniq-orig TERM*) ROLE-ALIST | ... | (priority NODE-INT*) SKEL-ALIST POS-TERM ::= TERM | (INT TERM) PROT-ALIST ::= ...
The protocol ID
is a symbol that names the protocol, and the role
ID
is a symbol that names the role. The ALG
symbol identifies
the algebra used by the protocol. For the Basic Cryptographic Algebra, the
symbol is basic
. The var
form contains symbols that
declare the sort of the variables used in this role. The set of sort
symbols is algebra specific. The protocol association list has no
predefined keys, while the role association list has three. The value
associated with both of these keys must be atoms in the
algebra. Each non-orig
term must not be carried by any event
in the role's trace, but each of its variables must occur in some
term. A role non-origination assumption of the form (3 a)
asserts that atom a
will not be mapped into an instance
unless its height is greater than three. For each pen-non-orig
term, each of its variables must occur in some term, but unlike a
non-origination assumption, the term may be carried.
Each uniq-orig
term must originate in the role's trace. Every
non-atomic variable must be acquired by the role's trace.
The structure of sorts and terms in the Basic Cryptographic Algebra follows.
SORT ::= text | data | name | skey | akey | mesg TERM ::= ID | STRING | (cat TERM+) | (enc TERM+ TERM) | (hash TERM+) | (pubk ID) | (privk ID) | (invk ID) | (ltk ID ID) | (pubk STRING ID) | (privk STRING ID)
The form (cat
a b c d e) is expands to (cat
a (cat
b (cat
c (cat
d e)))).
A term in the Basic Crypto Algebra is an atom if it is variable of a sort
other than mesg
, or it if formed from one of the following operations:
pubk
, privk
, invk
, and ltk
.
SKELETON ::= (defskeleton ID VARS STRAND+ SKEL-ALIST) STRAND ::= (defstrand ID INT MAPLET*) | (deflistener TERM) MAPLET ::= (TERM TERM) SKEL-ALIST ::= (non-orig TERM*) SKEL-ALIST | (pen-non-orig TERM*) SKEL-ALIST | (uniq-orig TERM*) SKEL-ALIST | (precedes NODE-PAIR*) SKEL-ALIST | (priority NODE-INT*) SKEL-ALIST | ... NODE-PAIR ::= (NODE NODE) NODE ::= (INT INT)
The ID
in the skeleton form names a protocol. It refers to the last
protocol definition of that name which precedes the skeleton form. The
ID
in the strand form names a role. The integer in the strand form
gives the height of the strand. The sequence of pairs of terms in the strand
form specify an environment used to construct the messages in a strand from its
role's trace. The first term is interpreted using the role's variables and the
second term uses the skeleton's variables. The environment used to produce the
strand's trace is derived by matching the second term using the first term as a
pattern.
The first integer in a node identifies a strand, and the second one specifies the position of the node in the strand. Zero-based indexing is used. Zero identifies the first strand, and the first node in a strand has position zero.
A variable may occur in more then one role within a protocol. The
reader performs a renaming so as to ensure these occurrences do not
overlap. Furthermore, the maplets used to specify a strand need not
specify how to map every role variable. The reader inserts missing
mappings, and renames every skeleton variable that also occurs in a
role of its protocol. The sort of every skeleton variable that occurs
in the non-orig
, pen-non-orig
, or uniq-orig
list or in a maplet must be declared.
On output, key-value pairs are added to all skeleton's association list.
Every skeleton in the output is labeled with a unique identifier with
(label INT)
A skeleton has (parent INT)
if it is a member of
the cohort of the identified parent. A skeleton has (seen INT+)
when
members of its cohort are isomorphic to previously seen skeletons. A skeleton
lists its unrealized nodes with (unrealized NODE*)
. The traces
associated with each strand is given by the (traces ...)
form.
Finally, the operation used to derive this skeleton from its parent is recorded
with (operation TEST KIND TERM NODE TERM*)
, where TEST
is the
authentication test encryption-test
or nonce
, KIND
is (added-strand ID INT)
, (contracted MAPLET*)
, or
(added-listener TERM)
, TERM
is the critical term,
NODE
in the test node, and the remaining terms specify the escape set.
When the operation kind is augmenting, the instance's role name and height are
provided. For kind listening, a term is provided. For kind contracting, the
substitution is provided. When a substitution refers to a variable not in the
skeleton, its name is unpredictable. For generalization, the operation is
recorded as (operation generalization METHOD)
, where METHOD
is one of deleted NODE
, weakened NODE-PAIR
, separated
TERM
and forgot TERM
. Generalization is used to find shapes from
realized skeletons. For shape collapsing, the operation is recorded as
(operation collapsed INT INT)
, where the two INT
s identify
the strands merged. Shape collapsing is used to find related shapes.
After reading the input, cpsa
expands macros before in analyzing
the results. A macro definition is a top-level form.
(defmacro (NAME ARG*) BODY)
The cpsa
program expands all calls to macros in non-macro defining
S-expressions using the macros that have been defined previously. A macro
definition can be used to expand a call if the first element of a list matches
the name of the macro, and the length of the remaining elements in the list
matches the length of the macro's argument list. When two macros definitions
are applicable, the last definition takes precedence. The cpsa
program
omits macro definitions from its output.
A source file can be included within another with the top-level include form,
(include FILE)
where FILE
is a string. The cpsa
program
includes files while it does macro expansion.
$ cpsa -h Usage: cpsa [OPTIONS] [FILE] -o FILE --output=FILE output FILE -l INT --limit=INT step count limit (default 2000) -b INT --bound=INT strand count bound (default 8) -m INT --margin=INT set output margin (default 72) -e --expand expand macros only; don't analyze -n --noisochk disable isomorphism checks -c --check-nonces check nonces first -t --try-old-strands try old strands first -r --reverse-nodes try younger nodes first -a STRING --algebra=STRING algebra (default basic) -s --show-algebras show algebras -h --help show help message -v --version show version number
This program will abort if too many steps are taken. A skeleton is printed
for each step taken by the program. The step count limit option is used to
override the default step count limit. It will also abort when it detects a
skeleton with too many strands. The strand count bound option is used to
override the default strand count bound. Another way to limit resources used by
the program is to limit the amount of memory it may use. The command-line
option +RTS -M512m -RTS
limits memory usage to 512m.
The herald form can be used to specify options within a file. The
grammar for a herald form follows, where ALIST
is an
association list with symbols as keys.
HERALD ::= (herald TITLE ALIST) TITLE ::= SYMBOL | STRING
The title is used by cpsagraph
when generating XHTML.
In the following example, the herald form specifies a strand bound
of 12 in a way that is equivalent to the command line
option --bound=12
.
(herald "Needham-Schroeder Public-Key Protocol" (bound 12))
The herald form is only interpreted by the cpsa
program,
and is treated as a comment by all other programs. In particular,
specifying a margin
option effects cpsa
only.
For long running problems, SMP parallelism is available. For example, on a
quad-core machine, we would probably use +RTS -N4 -RTS
.
In addition to the options provided on the command line or a herald
form, one can influence the order in which test nodes are sought.
When a role includes (reverse-search)
in its association
list as a comment, the nodes in its instances will be searched in
reverse order.
When run with isomorphism checks disabled (--noisochk
), CPSA
searches for realized skeletons, not shapes. It attempts to speed up analysis
by not identifying duplicate skeletons or generalizing realized skeletons,
however in many cases, run times increase. It is used when normal analysis
takes too much time on the chance that avoiding the checks makes more progress.
In no isomorphism checking mode output, every realized skeleton is labeled a
shape even when it is not. This allows the extraction of every realized
skeletons from the output using the cpsashapes
program.
An error message that begins with "No test for unrealized node" identifies a severe error that should be reported as a bug.
In its analysis, CPSA will attempt to solve unrealized nodes until the skeleton is realized, and then generalize until a shape is obtained. When there are multiple unrealized nodes, by default, CPSA will chose the topmost, rightmost unrealized node among those unrealized nodes with maximum priority. The default priority of a node is 5, and higher numbers represent a higher priority. When a node has non-positive priority, it is left unsolved.
In a role, in order to specify a priority for an event other than
the default, add a comment of the
form (priority (i p))
,
where i is the index of the event in the trace,
and p is the priority.
Priorities are inherited
from roles, but may be overridden in a skeleton with a
priority
comment of the form
(priority (n p))
,
where n is a node.
The cpsagraph
program produces a graphical rendering of the output
or input of an analyzer using SVG. It is viewable only with a
standards-compliant web browser such as FireFox.
When applied to the output of CPSA, the program groups together all the skeletons related to each problem statement in the input. The program assembles the related skeletons into a directed acyclic graph using the parent identifier and renders it as a tree using SVG. Each vertex in the tree is the label added to the skeleton by the CPSA program. Click the label in the tree to view the skeleton. Hover over a node in a skeleton drawing to see the term associated with it. Hover over a role to see the mapping from role variables to skeleton variables.
A node ordering edge in a skeleton drawing indicates that the message
transmission at the source of the edge happened before message reception at the
destination edge. The edge is solid if the transmitting term is syntactically
identical to the receiving term, otherwise the edge is dashed. Thus in an
algebra with a commutative operation *
, the graphing program sometimes
draws a dashed line between (send (* a b))
and (recv (* b
a))
.
In the tree drawing, the label of a shape is blue. For dead skeletons, the label is red, unless it has been seen before, in which case it is orange. Otherwise the label is green for skeletons seen before. Seen skeleton labels are rendered in an italic font.
By the default, cpsagraph
generates a view of CPSA S-expressions as
a compound document that contains SVG within XHTML. This view integrates
graphics with the input text.
In compact mode, cpsagraph
generates an SVG document. The tree is
displayed immediately to the left of skeleton drawings. When there is more than
one tree, the left-hand-side of the drawing contains a vertical listing of the
trees. Compact mode output should never be used as a replacement for studying
the text version of the output. The text version contains strictly more
information, and should be displayed next to its graphical rendition.
In LaTeX mode, cpsagraph
generates LaTeX source. XY-pic is used for
drawings of skeletons. The margin specified in cpsa.mk
produces good results.
$ cpsagraph -h Usage: cpsagraph [OPTIONS] [FILE] -o FILE --output=FILE output FILE -x --expanded use expanded format (default) -z --zoom enable diagram scaling -t --treeless use treeless expanded format -c --compact use compact format -l --latex use LaTeX format -m INT --margin=INT set output margin (default 72) -i --infix output uses infix notation -h --help show help message -v --version show version number
The cpsadiff
program compares CPSA output files
S-expression by S-expression, and prints the first skeleton that
differs.
$ cpsadiff -h Usage: cpsadiff [OPTIONS] OLD-FILE NEW-FILE -o FILE --output=FILE output FILE -m INT --margin=INT set output margin (default 72) -h --help show help message -v --version show version number
The cpsashapes
program extracts the original problems and the
shapes from the output of a CPSA run. The shapes are linked to their problem so
the output can be graphed.
$ cpsashapes -h Usage: cpsashapes [OPTIONS] [FILE] -o FILE --output=FILE output FILE -m INT --margin=INT set output margin (default 72) -h --help show help message -v --version show version number
The cpsalogic
program extracts a formula in the language
of order-sorted first-order logic for each problem and its shapes
from a CPSA run. The formula is called a shape analysis
sentence. The formula is satisfied in all realized skeletons when
CPSA finds all the shapes for the problem.
$ cpsalogic -h Usage: cpsalogic [OPTIONS] [FILE] -o FILE --output=FILE output FILE -m INT --margin=INT set output margin (default 72) -a STRING --algebra=STRING algebra (default basic) -s --show-algebras show algebras -h --help show help message -v --version show version number
The cpsaannotations
program uses protocol annotations to annotate
shapes and generate protocol soundness obligations.
$ cpsaannotations -h Usage: cpsashapes [OPTIONS] [FILE] -o FILE --output=FILE output FILE -m INT --margin=INT set output margin (default 72) -a STRING --algebra=STRING algebra (default basic) -s --show-algebras show algebras -h --help show help message -v --version show version number
To be analyzed, each role in a protocol must include an annotations
form. The TERM
is a role atom that when instantiated, is the name of a
principal in the shape. What follows is sequences of pairs. The integer gives
the position of the event in the trace that is annotated by the formula.
Zero-based indexing is used.
The language of formulas is order-sorted first-order logic extended with a modal says operator. Formula terms may include function symbols that are not part of a protocol's message signature.
ANNOTATION ::= (annotations TERM (INT FORMULA)*) FORMULA ::= (ID FTERM*) | (not FORMULA) | (and FORMULA*) | (or FORMULA*) | (implies FORMULA* FORMULA) | (iff FORMULA FORMULA) | (says TERM FORMULA) | (forall (DECL*) FORMULA) | (exists (DECL*) FORMULA) FTERM ::= TERM | (ID FTERM*)
Use (and)
for truth and (or)
for falsehood.
On output, each shape contains an annotations
form and a
obligations
form. The annotations form presents every non-trivial
formula derived from the protocol. The obligations form presents every
non-trivial soundness obligation.
The parameters of a role are the atoms that are not acquired by the role's trace, but must be available before a complete execution of the trace is possible. This program computes sets of sets of parameters consistent with the role. If the expected parameter set is not a member, a specification error is indicated.
$ cpsaparameters -h Usage: cpsaparameters [OPTIONS] [FILE] -o FILE --output=FILE output FILE -m INT --margin=INT set output margin (default 72) -a STRING --algebra=STRING algebra (default basic) -s --show-algebras show algebras -h --help show help message -v --version show version number
On output, each role contains a parameters
form with a list of sets
of atoms.
(defrole resp (vars (b a name) (n2 n1 text)) (trace (recv (enc n1 a (pubk b))) (send (enc n1 n2 (pubk a))) (recv (enc n2 (pubk b)))) (parameters (n2 n1 a (pubk b) (pubk a)) (n2 (privk b) (pubk a))))
In this example, the second parameter set is what is expected for a responder role in this version of the Needham-Schroeder protocol.
Macro expansion is not performed by this program. Use the -e
option
with cpsa
to preprocess input that contains macro definitions.
The cpsapp
program program pretty prints its input using the CPSA
specific algorithm.
$ cpsapp -h Usage: cpsapp [OPTIONS] [FILE] -o FILE --output=FILE output FILE -m INT --margin=INT set output margin (default 72) -i --infix output uses infix notation -j --json output uses JSON notation -h --help show help message -v --version show version number
The S-expressions used are restricted so that most dialects of Lisp
can read them, and characters within symbols and strings never need
quoting. Every list is proper. An atom is either a symbol, an integer,
or a string. The characters that make up a symbol are the letters, the
digits, and the special characters in
"-*/<=>!?:$%_&~^
". A symbol may not begin with a
digit or a sign followed by a digit. The characters that make up a
string are the printing characters omitting double quote and
backslash. Double quotes delimit a string. A comment begins with a
semicolon, or is an S-expression list at top-level that starts with
the comment
symbol.
Copyright (c) 2009 The MITRE Corporation. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, this copyright notice and the title of the publication and its date appear, and notice in given that copying is by permission of The MITRE Corporation.