Feldspar.Core.Graph

feldspar-language-0.3.1: A functional embedded language for DSP and parallelism

Feldspar.Core.Graph

Description

A graph representation of core programs. A graph is a flat structure that can be viewed as a program with a global scope. For example, the Haskell program

 main x = f 1
   where
     f y = g 2
       where
         g z = x + z

might be represented by the following flat graph:

 graph = Graph
   { graphNodes =
       [ Node
           { nodeId     = 0
           , function   = Input
           , input      = Tup []
           , inputType  = Tup []
           , outputType = intType
           }
       , Node
           { nodeId     = 1
           , function   = Input
           , input      = Tup []
           , inputType  = Tup []
           , outputType = intType
           }
       , Node
           { nodeId     = 2
           , function   = Input
           , input      = Tup []
           , inputType  = Tup []
           , outputType = intType
           }
       , Node
           { nodeId     = 3
           , function   = Function "(+)"
           , input      = Tup [One (Variable (0,[])), One (Variable (2,[]))]
           , inputType  = intPairType
           , outputType = intType
           }
       , Node
           { nodeId     = 4
           , function   = NoInline "f" (Interface 1 (One (Variable (5,[]))) intType intType)
           , input      = One (Constant (IntData 1))
           , inputType  = intType
           , outputType = intType
           }
       , Node
           { nodeId     = 5
           , function   = NoInline "g" (Interface 2 (One (Variable (3,[]))) intType intType)
           , input      = One (Constant (IntData 2))
           , inputType  = intType
           , outputType = intType
           }
       ]

   , graphInterface = Interface
       { interfaceInput      = 0
       , interfaceOutput     = One (Variable (4,[]))
       , interfaceInputType  = intType
       , interfaceOutputType = intType
       }
   }
   where
     intType     = result (typeOf :: Res [[[Int]]] (Tuple StorableType))
     intPairType = result (typeOf :: Res (Int,Int) (Tuple StorableType))

XXX Check above code again

which corresponds to the following flat program

 main v0 = v4
 f v1    = v5
 g v2    = v3
 v3      = v0 + v2
 v4      = f 1
 v5      = g 2

There are a few assumptions on graphs:

All nodes have unique identifiers.
There are no cycles.
The input and inputType tuples of each node should have the same shape.
Each interfaceInput (including the top-level one) refers to an Input node not referred to by any other interface.
All Variable references are valid (i.e. refer only to those variables implicitly defined by each node).
There should not be any cycles in the constraints introduced by findLocalities. (XXX Is this even possible?)
Sub-function interfaces should be "consistent" with the input/output type of the node. For example, the body of a while loop should have the same type as the whole loop.

In the original program, g was defined locally to f, and the addition was done locally in g. But in the flat program, this hierarchy (called definition hierarchy) is not represented. The flat program is of course not valid Haskell (v0 and v2 are used outside of their scopes). The function makeHierarchical turns a flat graph into a hierarchical one that corresponds to syntactically valid Haskell.

makeHierarchical requires some explanation. First a few definitions:

Nodes that have associated interfaces (NoInline, IfThenElse, While and Parallel) are said to contain sub-functions. These nodes are called super nodes. In the above program, the super node v4 contains the sub-function f, and v5 contains the sub-function g.
A definition d is local to a definition e iff. d is placed somewhere within the definition of e (i.e. inside an arbitrarily deeply nested where clause).
A definition d is owned by a definition e iff. d is placed immediately under the top-most where clause of e. A definition may have at most one owner.

The definition hierarchy thus specifies ownership between the definitions in the program. There are two types of ownership:

A super node is always the owner of its sub-functions.
A sub-function may be the owner of some node definitions.

Assigning nodes to sub-functions in a useful way takes some work. It is done by first finding out for each node which sub-functions it must be local to. Each locality constraint gives an upper bound on where in the definition hierarchy the node may be placed. There is one principle for introducing a locality constraint:

If node v depends on the input of sub-function f, then v must be local to f.

The locality constraints for a graph can thus be found be tracing each sub-function input in order to find the nodes that depend on it (see function findLocalities). In the above program, we have the sub-functions f and g with the inputs v1 and v2 respectively. We can see immediately that no node depends on v1, so we get no locality constraints for f. The only node that depends on v2 is v3, so the program has a single locality constraint: v3 is local to g. Nodes without constraints are simply taken to be local to main. With this information, we can now rewrite the flat program as

 main v0 = v4
   where
     v4 = f 1
       where
         f v1 = v5
     v5 = g 2
       where
         g v2 = v3
           where
             v3 = v0 + v2

which is syntactically valid Haskell. Note that this program is slightly different from the original which defined g locally to f. However, in general, we want definitions to be as "global" as possible in order to maximize sharing. For example, we don't want to put definitions in the body of a while loop unless they really depend on the loop state, because then they will (probably, depending on implementation) be recomputed in every iteration. Also note that in this program, it is not strictly necessary to have the sub-functions owned by their super nodes -- f and g could have been owned by main instead. However, this would cause clashes if two sub-functions have the same name. Having sub-functions owned by their super nodes is also a way of keeping related definitions together in the program.

There is one caveat with the above method. Consider the following flat program:

 main v0 = v4
 f v1    = v5
 g v2    = v3
 v3      = v1 + 2
 v4      = f 0
 v5      = g 1

Here, we get the locality constraint: v3 is local to f. However, to get a valid definition hierarchy, we also need v5 to be local to f. This is because v5 is the owner of g, and the output of g is local to f. So when looking for dependencies, we should let each super node depend on its sub-function output, except for the owner of the very sub-function that is being traced (a function cannot be owned by itself).

Synopsis