language-c-0.4.7: Analysis and generation of C code

Copyright(c) [1999..2007] Manuel M T Chakravarty (c) 2008 Benedikt Huber
Safe Haskell: None




Abstract syntax of C source and header files.

The tree structure is based on the grammar in Appendix A of K&R. The abstract syntax simplifies the concrete syntax by merging similar concrete constructs into a single type of abstract tree structure: declarations are merged with structure declarations, parameter declarations and type names, and declarators are merged with abstract declarators.

By K&R we refer to "The C Programming Language", second edition, Brian W. Kernighan and Dennis M. Ritchie, Prentice Hall, 1988. The AST supports all of C99 and several GNU extensions.


C translation units

type CTranslUnit = CTranslationUnit NodeInfo

Complete C translation unit (C99 6.9, K&R A10)

A complete C translation unit, for example representing a C header or source file. It consists of a list of external (i.e. toplevel) declarations.

type CExtDecl = CExternalDeclaration NodeInfo

External C declaration (C99 6.9, K&R A10)

Either a toplevel declaration, function definition or external assembler.


type CFunDef = CFunctionDef NodeInfo

C function definition (C99 6.9.1, K&R A10.1)

A function definition is of the form CFunDef specifiers declarator decllist? stmt.

  • specifiers are the type and storage-class specifiers of the function. The only storage-class specifiers allowed are extern and static.
  • The declarator must be such that the declared identifier has function type. The return type shall be void or an object type other than array type.
  • The optional declaration list decllist is for old-style function declarations.
  • The statement stmt is a compound statement.

type CDecl = CDeclaration NodeInfo

C declarations (K&R A8, C99 6.7), including structure declarations, parameter declarations and type names.

A declaration is of the form CDecl specifiers init-declarator-list, where the form of the declarator list's elements depends on the kind of declaration:

1) Toplevel declarations (K&R A8, C99 6.7 declaration)

  • C99 requires that there is at least one specifier, though this is merely a syntactic restriction
  • at most one storage class specifier is allowed per declaration
  • the elements of the non-empty init-declarator-list are of the form (Just declr, init?, Nothing). The declarator declr has to be present and non-abstract and the initialization expression is optional.

2) Structure declarations (K&R A8.3, C99 struct-declaration)

Those are the declarations of a structure's members.

  • do not allow storage specifiers
  • in strict C99, the list of declarators has to be non-empty
  • the elements of init-declarator-list are either of the form (Just declr, Nothing, size?), representing a member with optional bit-field size, or of the form (Nothing, Nothing, Just size), for unnamed bitfields. declr has to be non-abstract.
  • no member of a structure shall have incomplete type

3) Parameter declarations (K&R A8.6.3, C99 6.7.5 parameter-declaration)

  • init-declarator-list must contain at most one triple of the form (Just declr, Nothing, Nothing), i.e. consist of a single declarator, which is allowed to be abstract (i.e. unnamed).

4) Type names (K&R A8.8, C99 6.7.6)

  • do not allow storage specifiers
  • init-declarator-list must contain at most one triple of the form (Just declr, Nothing, Nothing), where declr is an abstract declarator (i.e. doesn't contain a declared identifier)
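
The four item shapes listed above can be summarised as predicates over the triples. This is only a sketch: DeclItem and the function names are hypothetical, the type parameters stand in for the library's declarator, initializer and expression types, and the checks look at the shape of the triple only (they do not test whether a declarator is abstract).

```haskell
-- Simplified stand-in for the (declarator, initializer, bit-field size)
-- triples of a CDecl. The type parameters d, i and e are placeholders for
-- the library's declarator, initializer and expression types.
type DeclItem d i e = (Maybe d, Maybe i, Maybe e)

-- | Item shape allowed in a toplevel declaration: declarator required,
-- initializer optional, no bit-field size.
isToplevelItem :: DeclItem d i e -> Bool
isToplevelItem (Just _, _, Nothing) = True
isToplevelItem _                    = False

-- | Item shape allowed in a structure declaration: a member with an
-- optional bit-field size, or an unnamed bit-field.
isStructItem :: DeclItem d i e -> Bool
isStructItem (Just _, Nothing, _)       = True
isStructItem (Nothing, Nothing, Just _) = True
isStructItem _                          = False

-- | Item shape allowed in parameter declarations and type names.
isParamOrTypeNameItem :: DeclItem d i e -> Bool
isParamOrTypeNameItem (Just _, Nothing, Nothing) = True
isParamOrTypeNameItem _                          = False

-- | Parameter declarations and type names carry at most one item.
validParamOrTypeName :: [DeclItem d i e] -> Bool
validParamOrTypeName items =
  length items <= 1 && all isParamOrTypeNameItem items
```

Under these definitions an unnamed bit-field such as (Nothing, Nothing, Just size) is accepted only by isStructItem, and an empty item list is a valid parameter declaration or type name.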

type CStructUnion = CStructureUnion NodeInfo

C structure or union specifiers (K&R A8.3, C99)

CStruct tag identifier struct-decls c-attrs represents a struct or union specifier (depending on tag).

  • either identifier or the declaration list struct-decls (or both) have to be present.

    Example: in struct foo x; the identifier is present, in struct { int y; } x; the declaration list, and in struct foo { int y; } x; both of them.

  • c-attrs is a list of attribute annotations associated with the struct or union specifier

type CEnum = CEnumeration NodeInfo

C enumeration specifier (K&R A8.4, C99)

CEnum identifier enumerator-list attrs represents an enum specifier

  • Either the identifier or the enumerator-list (or both) have to be present.
  • If enumerator-list is present, it has to be non-empty.
  • The enumerator list is of the form (enumeration-constant, enumeration-value?), where the latter is an optional constant integral expression.
  • attrs is a list of attribute annotations associated with the enumeration specifier

data CStructTag

A tag to determine whether we refer to a struct or union, see CStructUnion.



Declaration attributes

type CDeclSpec = CDeclarationSpecifier NodeInfo

C declaration specifiers and qualifiers

Declaration specifiers include at most one storage-class specifier (C99 6.7.1), type specifiers (6.7.2) and type qualifiers (6.7.3).

partitionDeclSpecs :: [CDeclarationSpecifier a] -> ([CStorageSpecifier a], [CAttribute a], [CTypeQualifier a], [CTypeSpecifier a], Bool)

Separate the declaration specifiers

Note that inline isn't actually a type qualifier, but a function specifier. Attributes (__attribute__) of a declaration qualify declarations or declarators (but not types), and are therefore separated as well.
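
To illustrate the shape of the result, here is a minimal sketch of the same kind of partition over a toy specifier type. Spec and partitionSpecs are hypothetical names invented for this example; the real function works on full AST nodes, and the assumption that the final Bool reports the presence of inline is taken from the note above, not from the library source.

```haskell
-- Toy declaration specifier; the real CDeclarationSpecifier wraps full AST
-- nodes, here each specifier is just its spelling.
data Spec
  = Storage String   -- e.g. "static", "extern", "typedef"
  | TypeSpec String  -- e.g. "int", "unsigned"
  | TypeQual String  -- e.g. "const", "volatile", "restrict"
  | Attr String      -- an attribute annotation
  | Inline           -- the inline function specifier
  deriving (Eq, Show)

-- | Split specifiers into (storage, attributes, qualifiers, type specifiers,
-- isInline), preserving the relative order inside each group.
partitionSpecs :: [Spec] -> ([String], [String], [String], [String], Bool)
partitionSpecs = foldr step ([], [], [], [], False)
  where
    step (Storage s)  (ss, ats, qs, ts, inl) = (s : ss, ats, qs, ts, inl)
    step (Attr a)     (ss, ats, qs, ts, inl) = (ss, a : ats, qs, ts, inl)
    step (TypeQual q) (ss, ats, qs, ts, inl) = (ss, ats, q : qs, ts, inl)
    step (TypeSpec t) (ss, ats, qs, ts, inl) = (ss, ats, qs, t : ts, inl)
    step Inline       (ss, ats, qs, ts, _)   = (ss, ats, qs, ts, True)
```

For the specifiers of static inline const unsigned int this yields (["static"], [], ["const"], ["unsigned", "int"], True).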

type CStorageSpec = CStorageSpecifier NodeInfo

C storage class specifier (and typedefs) (K&R A8.1, C99 6.7.1)

type CTypeSpec = CTypeSpecifier NodeInfo

C type specifier (K&R A8.2, C99 6.7.2)

Type specifiers are either basic types such as char or int, struct, union or enum specifiers or typedef names.

As a GNU extension, a typeof expression also is a type specifier.

isSUEDef :: CTypeSpecifier a -> Bool

returns True if the given typespec is a struct, union or enum definition

type CTypeQual = CTypeQualifier NodeInfo

C type qualifiers (K&R A8.2, C99 6.7.3), function specifiers (C99 6.7.4), and attributes.

const, volatile and restrict type qualifiers and inline function specifier. Additionally, attribute annotations for declarations and declarators.

type CAttr = CAttribute NodeInfo

__attribute__ annotations

Those are of the form CAttr attribute-name attribute-parameters, and serve as generic properties of some syntax tree elements.


type CDeclr = CDeclarator NodeInfo

C declarator (K&R A8.5, C99 6.7.5) and abstract declarator (K&R A8.8, C99 6.7.6)

A declarator declares a single object, function, or type. It is always associated with a declaration (CDecl), which specifies the declaration's type and the additional storage qualifiers and attributes, which apply to the declared object.

A declarator is of the form CDeclr name? indirections asm-name? attrs _, where name is the name of the declared object (missing for abstract declarators), indirections is described below, asm-name is the optional assembler name and attrs is a set of attribute annotations for the declared object.

indirections is a list of pointer, array and function declarators, which modify the type of the declared object as described below. If the declaration specifies the non-derived type T, and we have indirections = [D1, D2, ..., Dn], then the declared object has type (D1 indirect (D2 indirect ... (Dn indirect T))), where

  • (CPtrDeclr attrs) indirect T is attributed pointer to T
  • (CFunDeclr attrs) indirect T is attributed function returning T
  • (CArrDeclr attrs) indirect T is attributed array of elements of type T

Examples (simplified attributes):

  • x is an int
int x;
CDeclr "x" []
  • x is a restrict pointer to a const pointer to int
const int * const * restrict x;
CDeclr "x" [CPtrDeclr [restrict], CPtrDeclr [const]]
  • f is a function returning a constant pointer to int
int* const f();
CDeclr "f" [CFunDeclr [],CPtrDeclr [const]]
  • f is a constant pointer to a function returning int
int (* const f)();
CDeclr "f" [CPtrDeclr [const], CFunDeclr []]
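
The nesting rule (D1 indirect (D2 indirect ... (Dn indirect T))) is a right fold over the list of indirections. The sketch below shows this on simplified stand-in types: Derived, Ty, declaredType and describe are hypothetical names for this example, qualifiers are plain strings and function parameters are omitted.

```haskell
-- Simplified derived declarators: qualifiers are plain strings and function
-- parameters are omitted.
data Derived
  = Ptr [String]   -- pointer with its qualifiers
  | Fun            -- function declarator
  | Arr            -- array declarator
  deriving (Eq, Show)

-- Simplified types built from a base type name.
data Ty
  = Base String
  | PtrTo [String] Ty
  | FunReturning Ty
  | ArrOf Ty
  deriving (Eq, Show)

-- | One step of the "indirect" relation.
indirect :: Derived -> Ty -> Ty
indirect (Ptr quals) t = PtrTo quals t
indirect Fun         t = FunReturning t
indirect Arr         t = ArrOf t

-- | (D1 indirect (D2 indirect ... (Dn indirect T))) is a right fold.
declaredType :: [Derived] -> Ty -> Ty
declaredType ds base = foldr indirect base ds

-- | Render a type as an English phrase.
describe :: Ty -> String
describe (Base n)         = n
describe (PtrTo [] t)     = "pointer to " ++ describe t
describe (PtrTo qs t)     = unwords qs ++ " pointer to " ++ describe t
describe (FunReturning t) = "function returning " ++ describe t
describe (ArrOf t)        = "array of " ++ describe t
```

With these definitions, describe (declaredType [Fun, Ptr ["const"]] (Base "int")) gives "function returning const pointer to int", while swapping the two list elements gives "const pointer to function returning int", matching the last two examples above.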

type CDerivedDeclr = CDerivedDeclarator NodeInfo

Derived declarators, see CDeclr

Indirections are qualified using type-qualifiers and generic attributes, and additionally:

  • The size of an array is either a constant expression, variable length (*) or missing; in the last case, the type of the array is incomplete. The qualifier static is allowed for function arguments only, indicating that the supplied argument is an array of at least the given size.

  • New style parameter lists have the form Right (declarations, isVariadic), old style parameter lists have the form Left (parameter-names)

type CArrSize = CArraySize NodeInfo

Size of an array

data CDerivedDeclarator a


CPtrDeclr [CTypeQualifier a] a

Pointer declarator CPtrDeclr tyquals declr

CArrDeclr [CTypeQualifier a] (CArraySize a) a

Array declarator CArrDeclr declr tyquals size-expr?

CFunDeclr (Either [Ident] ([CDeclaration a], Bool)) [CAttribute a] a

Function declarator CFunDeclr declr (old-style-params | new-style-params) c-attrs

data CArraySize a


CNoArrSize Bool

CNoArrSize isCompleteType

CArrSize Bool (CExpression a)

CArrSize isStatic expr


type CInit = CInitializer NodeInfo

C initialization (K&R A8.7, C99 6.7.8)

Initializers are either assignment expressions or initializer lists (surrounded by curly braces), whose elements are themselves initializers, paired with an optional list of designators.

type CInitList = CInitializerList NodeInfo

Initializer List

The members of an initializer list are of the form (designator-list, initializer). The designator-list specifies one member of the compound type which is initialized. It is allowed to be empty; in this case the initializer refers to the "next" member of the compound type (see C99 6.7.8).

Examples (simplified expressions and identifiers):

-- int x[3][4] = { [0][3] = 4, [2] = 5, 8 };
--   corresponds to the assignments
-- x[0][3] = 4; x[2][0] = 5; x[2][1] = 8;
let init1 = ([CArrDesig 0, CArrDesig 3], CInitExpr 4)
    init2 = ([CArrDesig 2]             , CInitExpr 5)
    init3 = ([]                        , CInitExpr 8)
in  CInitList [init1, init2, init3]
-- struct { struct { int a[2]; int b[2]; int c[2]; } s; } x = { .s = { {2,3} , .c[0] = 1 } };
--   corresponds to the assignments
-- x.s.a[0] = 2; x.s.a[1] = 3; x.s.c[0] = 1;
let init_s_0 = CInitList [ ([], CInitExpr 2), ([], CInitExpr 3)]
    init_s   = CInitList [
                           ([], init_s_0),
                           ([CMemberDesig "c", CArrDesig 0], CInitExpr 1)
                         ]
in  CInitList [([CMemberDesig "s"], init_s)]
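
How an empty designator list picks the "next" member can be sketched for the simplest case, a flat one-dimensional array. The function resolvePositions is a hypothetical helper written for this example; it does not handle nested aggregates such as x[2][1] in the first example above, and it models designators as plain Int indices.

```haskell
-- | Resolve the positions targeted by the items of a flat (one-dimensional)
-- array initializer. Each item carries an optional explicit index
-- (a designator) and a value; an item without a designator goes to the
-- position after the previously initialized one, starting at 0.
resolvePositions :: [(Maybe Int, v)] -> [(Int, v)]
resolvePositions = go 0
  where
    go _ [] = []
    go next ((desig, v) : rest) =
      let pos = maybe next id desig   -- explicit index, or the "next" slot
      in  (pos, v) : go (pos + 1) rest
```

For the C initializer { [2] = 5, 8, [0] = 1 } the input [(Just 2, 5), (Nothing, 8), (Just 0, 1)] resolves to [(2, 5), (3, 8), (0, 1)].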

type CDesignator = CPartDesignator NodeInfo


A designator specifies a member of an object, either an element or range of an array, or the named member of a struct / union.

data CInitializer a


CInitExpr (CExpression a) a

assignment expression

CInitList (CInitializerList a) a

initialization list (see CInitList)

data CPartDesignator a


CArrDesig (CExpression a) a

array position designator

CMemberDesig Ident a

member designator

CRangeDesig (CExpression a) (CExpression a) a

array range designator CRangeDesig from to _ (GNU C)


type CStat = CStatement NodeInfo

C statement (K&R A9, C99 6.8)

type CBlockItem = CCompoundBlockItem NodeInfo

C99 Block items

Things that may appear in compound statements: either statements, declarations or nested function definitions.

type CAsmStmt = CAssemblyStatement NodeInfo

GNU Assembler statement

CAssemblyStatement type-qual? asm-expr out-ops in-ops clobbers _

is an inline assembler statement. The only type-qualifier (if any) allowed is volatile. asm-expr is the actual assembler expression (a string), out-ops and in-ops are the output and input operands of the statement. clobbers is a list of registers which are clobbered when executing the assembler statement.

type CAsmOperand = CAssemblyOperand NodeInfo

Assembler operand

CAsmOperand argName? constraintExpr arg specifies an operand for an assembler statement.

data CStatement a


CLabel Ident (CStatement a) [CAttribute a] a

An (attributed) label followed by a statement

CCase (CExpression a) (CStatement a) a

A statement of the form case expr : stmt

CCases (CExpression a) (CExpression a) (CStatement a) a

A case range of the form case lower ... upper : stmt

CDefault (CStatement a) a

The default case default : stmt

CExpr (Maybe (CExpression a)) a

A simple statement, that is in C: evaluating an expression with side-effects and discarding the result.

CCompound [Ident] [CCompoundBlockItem a] a

compound statement CCompound localLabels blockItems at

CIf (CExpression a) (CStatement a) (Maybe (CStatement a)) a

conditional statement CIf ifExpr thenStmt maybeElseStmt at

CSwitch (CExpression a) (CStatement a) a

switch statement CSwitch selectorExpr switchStmt, where switchStmt usually includes case, break and default statements

CWhile (CExpression a) (CStatement a) Bool a

while or do-while statement CWhile guard stmt isDoWhile at

CFor (Either (Maybe (CExpression a)) (CDeclaration a)) (Maybe (CExpression a)) (Maybe (CExpression a)) (CStatement a) a

for statement CFor init expr-2 expr-3 stmt, where init is either a declaration or initializing expression

CGoto Ident a

goto statement CGoto label

CGotoPtr (CExpression a) a

computed goto CGotoPtr labelExpr

CCont a

continue statement

CBreak a

break statement

CReturn (Maybe (CExpression a)) a

return statement CReturn returnExpr

CAsm (CAssemblyStatement a) a

assembly statement
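
As an illustration of how the field layout of these constructors is consumed, the following sketch renders a toy statement type back to C text. Stmt and render are hypothetical and cover only a few constructors; expressions are plain strings, annotations are dropped, and a nested if without an else inside an if with an else is not disambiguated.

```haskell
-- Toy statement type mirroring the shape of a few CStatement constructors;
-- expressions are plain strings and annotations are dropped.
data Stmt
  = ExprStmt (Maybe String)      -- like CExpr: Nothing is the empty statement
  | Compound [Stmt]              -- like CCompound, without local labels
  | If String Stmt (Maybe Stmt)  -- like CIf
  | While String Stmt Bool       -- like CWhile: True means do-while
  | Return (Maybe String)        -- like CReturn
  deriving (Eq, Show)

-- | Render a statement as single-line C source.
render :: Stmt -> String
render (ExprStmt Nothing)   = ";"
render (ExprStmt (Just e))  = e ++ ";"
render (Compound items)     = "{ " ++ concatMap (\s -> render s ++ " ") items ++ "}"
render (If c t Nothing)     = "if (" ++ c ++ ") " ++ render t
render (If c t (Just e))    = "if (" ++ c ++ ") " ++ render t ++ " else " ++ render e
render (While c body False) = "while (" ++ c ++ ") " ++ render body
render (While c body True)  = "do " ++ render body ++ " while (" ++ c ++ ");"
render (Return Nothing)     = "return;"
render (Return (Just e))    = "return " ++ e ++ ";"
```

The Bool field of While selects between the two loop forms: the same guard and body render as while (i < n) { i++; } with False and as do { i++; } while (i < n); with True.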


type CExpr = CExpression NodeInfo

C expression (K&R A7)

  • these can be arbitrary expressions, as the argument of sizeof can be arbitrary, even if appearing in a constant expression
  • GNU C extensions: alignof, __real, __imag, ({ stmt-expr }), && label and built-ins

data CExpression a


CComma [CExpression a] a 
CAssign CAssignOp (CExpression a) (CExpression a) a 
CCond (CExpression a) (Maybe (CExpression a)) (CExpression a) a 
CBinary CBinaryOp (CExpression a) (CExpression a) a 
CCast (CDeclaration a) (CExpression a) a 
CUnary CUnaryOp (CExpression a) a 
CSizeofExpr (CExpression a) a 
CSizeofType (CDeclaration a) a 
CAlignofExpr (CExpression a) a 
CAlignofType (CDeclaration a) a 
CComplexReal (CExpression a) a 
CComplexImag (CExpression a) a 
CIndex (CExpression a) (CExpression a) a 
CCall (CExpression a) [CExpression a] a 
CMember (CExpression a) Ident Bool a 
CVar Ident a 
CConst (CConstant a)

integer, character, floating point and string constants

CCompoundLit (CDeclaration a) (CInitializerList a) a

C99 compound literal

CStatExpr (CStatement a) a

GNU C compound statement as expr

CLabAddrExpr Ident a

GNU C address of label

CBuiltinExpr (CBuiltinThing a)

builtin expressions, see CBuiltin

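data CBinaryOp

C binary operators (K&R A7.6-15)


CMulOp

CDivOp

CRmdOp

remainder of division

CAddOp

CSubOp

CShlOp

shift left

CShrOp

shift right

CLeOp

CGrOp

CLeqOp

less or equal

CGeqOp

greater or equal

CEqOp

CNeqOp

not equal

CAndOp

bitwise and

CXorOp

exclusive bitwise or

COrOp

inclusive bitwise or

CLndOp

logical and

CLorOp

logical or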

data CUnaryOp

C unary operator (K&R A7.3-4)


CPreIncOp

prefix increment operator

CPreDecOp

prefix decrement operator

CPostIncOp

postfix increment operator

CPostDecOp

postfix decrement operator

CAdrOp

address operator

CIndOp

indirection operator

CPlusOp

prefix plus

CMinOp

prefix minus

CCompOp

one's complement

CNegOp

logical negation

type CBuiltin = CBuiltinThing NodeInfo

GNU Builtins, which cannot be typed in C99


type CConst = CConstant NodeInfo

C constant (K&R A2.5 & A7.2)

type CStrLit = CStringLiteral NodeInfo

Attributed string literals

liftStrLit :: CStringLiteral a -> CConstant a

Lift a string literal to a C constant

Annotated type class

class Functor ast => Annotated ast where

All AST nodes are annotated. Inspired by the Annotated class of Niklas Broberg's haskell-src-exts package. In principle, we could have a Copointed superclass instead of annotation, at the price of another dependency.


annotation :: ast a -> a

get the annotation of an AST node

amap :: (a -> a) -> ast a -> ast a

change the annotation (non-recursively) of an AST node. Use fmap for recursively modifying the annotation.
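
The difference between amap and fmap can be seen on a small example. In the sketch below the class is re-declared locally with the signatures shown above so that the code compiles without the library, and Expr is a hypothetical toy AST, not one of the library's node types.

```haskell
-- Local re-declaration of the class, following the signatures shown above,
-- so that the example compiles without the library.
class Functor ast => Annotated ast where
  annotation :: ast a -> a
  amap       :: (a -> a) -> ast a -> ast a

-- Toy AST (hypothetical): every node carries an annotation of type a.
data Expr a
  = Var String a
  | Neg (Expr a) a
  deriving (Eq, Show)

-- fmap rewrites the annotation of every node in the tree.
instance Functor Expr where
  fmap f (Var n a) = Var n (f a)
  fmap f (Neg e a) = Neg (fmap f e) (f a)

-- amap rewrites only the annotation of the root node.
instance Annotated Expr where
  annotation (Var _ a) = a
  annotation (Neg _ a) = a
  amap f (Var n a) = Var n (f a)
  amap f (Neg e a) = Neg e (f a)
```

For e = Neg (Var "x" 1) 10, amap (+1) e is Neg (Var "x" 1) 11, whereas fmap (+1) e is Neg (Var "x" 2) 11.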