A graph representation of core programs. A graph is a flat structure that
can be viewed as a program with a global scope. For example, the Haskell
program
main x = f 1
where
f y = g 2
where
g z = x + z
might be represented by the following flat graph:
graph = Graph
{ graphNodes =
[ Node
{ nodeId = 0
, function = Input
, input = Tup []
, inputType = Tup []
, outputType = intType
}
, Node
{ nodeId = 1
, function = Input
, input = Tup []
, inputType = Tup []
, outputType = intType
}
, Node
{ nodeId = 2
, function = Input
, input = Tup []
, inputType = Tup []
, outputType = intType
}
, Node
{ nodeId = 3
, function = Function "(+)"
, input = Tup [One (Variable (0,[])), One (Variable (2,[]))]
, inputType = intPairType
, outputType = intType
}
, Node
{ nodeId = 4
, function = NoInline "f" (Interface 1 (One (Variable (5,[]))) intType intType)
, input = One (Constant (IntData 1))
, inputType = intType
, outputType = intType
}
, Node
{ nodeId = 5
, function = NoInline "g" (Interface 2 (One (Variable (3,[]))) intType intType)
, input = One (Constant (IntData 2))
, inputType = intType
, outputType = intType
}
]
, graphInterface = Interface
{ interfaceInput = 0
, interfaceOutput = One (Variable (4,[]))
, interfaceInputType = intType
, interfaceOutputType = intType
}
}
where
intType = result (typeOf :: Res [[[Int]]] (Tuple StorableType))
intPairType = result (typeOf :: Res (Int,Int) (Tuple StorableType))
XXX Check above code again
which corresponds to the following flat program
main v0 = v4
f v1 = v5
g v2 = v3
v3 = v0 + v2
v4 = f 1
v5 = g 2
There are a few assumptions on graphs:
- All nodes have unique identifiers.
- There are no cycles.
- The input and inputType tuples of each node should have the same shape.
- Each interfaceInput (including the top-level one) refers to an Input
node not referred to by any other interface.
- All Variable references are valid (i.e. refer only to those variables
implicitly defined by each node).
- There should not be any cycles in the constraints introduced by
findLocalities. (XXX Is this even possible?)
- Sub-function interfaces should be "consistent" with the input/output type
of the node. For example, the body of a while loop should have the same type
as the whole loop.
In the original program, g was defined locally to f, and the addition was
done locally in g. But in the flat program, this hierarchy (called
definition hierarchy) is not represented. The flat program is of course not
valid Haskell (v0 and v2 are used outside of their scopes). The function
makeHierarchical turns a flat graph into a hierarchical one that
corresponds to syntactically valid Haskell.
makeHierarchical requires some explanation. First a few definitions:
- Nodes that have associated interfaces (NoInline, IfThenElse, While
and Parallel) are said to contain sub-functions. These nodes are called
super nodes. In the above program, the super node v4 contains the
sub-function f, and v5 contains the sub-function g.
- A definition d is local to a definition e iff. d is placed
somewhere within the definition of e (i.e. inside an arbitrarily deeply
nested where clause).
- A definition d is owned by a definition e iff. d is placed
immediately under the top-most where clause of e. A definition may have
at most one owner.
The definition hierarchy thus specifies ownership between the definitions in
the program. There are two types of ownership:
- A super node is always the owner of its sub-functions.
- A sub-function may be the owner of some node definitions.
Assigning nodes to sub-functions in a useful way takes some work. It is done
by first finding out for each node which sub-functions it must be local to.
Each locality constraint gives an upper bound on where in the definition
hierarchy the node may be placed. There is one principle for introducing a
locality constraint:
- If node v depends on the input of sub-function f, then v must be
local to f.
The locality constraints for a graph can thus be found be tracing each
sub-function input in order to find the nodes that depend on it (see function
findLocalities). In the above program, we have the sub-functions f and
g with the inputs v1 and v2 respectively. We can see immediately that
no node depends on v1, so we get no locality constraints for f. The only
node that depends on v2 is v3, so the program has a single locality
constraint: v3 is local to g. Nodes without constraints are simply taken
to be local to main. With this information, we can now rewrite the flat
program as
main v0 = v4
where
v4 = f 1
where
f v1 = v5
v5 = g 2
where
g v2 = v3
where
v3 = v0 + v2
which is syntactically valid Haskell. Note that this program is slightly
different from the original which defined g locally to f. However, in
general, we want definitions to be as "global" as possible in order to
maximize sharing. For example, we don't want to put definitions in the body
of a while loop unless they really depend on the loop state, because then
they will (probably, depending on implementation) be recomputed in every
iteration. Also note that in this program, it is not strictly necessary to
have the sub-functions owned by their super nodes -- f and g could have
been owned by main instead. However, this would cause clashes if two
sub-functions have the same name. Having sub-functions owned by their super
nodes is also a way of keeping related definitions together in the program.
There is one caveat with the above method. Consider the following flat
program:
main v0 = v4
f v1 = v5
g v2 = v3
v3 = v1 + 2
v4 = f 0
v5 = g 1
Here, we get the locality constraint: v3 is local to f. However, to get a
valid definition hierarchy, we also need v5 to be local to f. This is
because v5 is the owner of g, and the output of g is local to f. So
when looking for dependencies, we should let each super node depend on its
sub-function output, except for the owner of the very sub-function that is
being traced (a function cannot be owned by itself).
|