Portability | GHC only? |
---|---|
Stability | experimental |
Maintainer | bjorn.buckwalter@gmail.com |
Safe Haskell | Safe-Inferred |
Forward Automatic Differentiation via overloading to perform nonstandard interpretation that replaces original numeric type with corresponding generalized dual number type.
Credits:
Authors: Copyright 2008, Barak A. Pearlmutter (barak@cs.nuim.ie) & Jeffrey Mark Siskind (qobi@purdue.edu)
Work started as stripped-down version of higher-order tower code published by Jerzy Karczmarczuk (jerzy.karczmarczuk@info.unicaen.fr) which used a non-standard standard prelude.
Initial perturbation-confusing code is a modified version of http://cdsmith.wordpress.com/2007/11/29/some-playing-with-derivatives/
Tag trick, called "branding" in the Haskell community, from Björn Buckwalter (bjorn.buckwalter@gmail.com) http://thread.gmane.org/gmane.comp.lang.haskell.cafe/22308/
Notes:
Each invocation of the differentiation function introduces a distinct perturbation, which requires a distinct derivative-carrying number type. In order to prevent these from being confused, tagging, called branding in the Haskell community, is used. This seems to prevent perturbation confusion, although it would be nice to have an actual proof of this. The technique does require adding invocations of lift at appropriate places when nesting is present, and degrades modularity by exposing forall types in type signatures.
- data Tower tag a
- lift :: (Eq a, Num a) => a -> Tower tag a
- primal :: (Eq a, Num a) => Tower tag a -> a
- diffUU :: (Eq a, Num a, Eq b, Num b) => (forall tag. Tower tag a -> Tower tag b) -> a -> b
- diffUF :: (Eq a, Num a, Eq b, Num b, Functor f) => (forall tag. Tower tag a -> f (Tower tag b)) -> a -> f b
- diffMU :: (Eq a, Num a, Eq b, Num b) => (forall tag. [Tower tag a] -> Tower tag b) -> [a] -> [a] -> b
- diffMF :: (Eq a, Num a, Eq b, Num b, Functor f) => (forall tag. [Tower tag a] -> f (Tower tag b)) -> [a] -> [a] -> f b
- diff2UU :: (Eq a, Num a, Eq b, Num b) => (forall tag. Tower tag a -> Tower tag b) -> a -> (b, b)
- diff2UF :: (Eq a, Num a, Eq b, Num b, Functor f) => (forall tag. Tower tag a -> f (Tower tag b)) -> a -> (f b, f b)
- diff2MU :: (Eq a, Num a, Eq b, Num b) => (forall tag. [Tower tag a] -> Tower tag b) -> [a] -> [a] -> (b, b)
- diff2MF :: (Eq a, Num a, Eq b, Num b, Functor f) => (forall tag. [Tower tag a] -> f (Tower tag b)) -> [a] -> [a] -> (f b, f b)
- diffsUU :: (Eq a, Num a, Eq b, Num b) => (forall tag. Tower tag a -> Tower tag b) -> a -> [b]
- diffsUF :: (Eq a, Num a, Eq b, Num b, Functor f, Foldable f) => (forall tag. Tower tag a -> f (Tower tag b)) -> a -> [f b]
- diffsMU :: (Eq a, Num a, Eq b, Num b) => (forall tag. [Tower tag a] -> Tower tag b) -> [[a]] -> [b]
- diffsMF :: (Eq a, Num a, Eq b, Num b, Functor f, Foldable f) => (forall tag. [Tower tag a] -> f (Tower tag b)) -> [[a]] -> [f b]
- diffs0UU :: (Eq a, Num a, Eq b, Num b) => (forall tag. Tower tag a -> Tower tag b) -> a -> [b]
- diffs0UF :: (Eq a, Num a, Eq b, Num b, Functor f, Foldable f) => (forall tag. Tower tag a -> f (Tower tag b)) -> a -> [f b]
- diffs0MU :: (Eq a, Num a, Eq b, Num b) => (forall tag. [Tower tag a] -> Tower tag b) -> [[a]] -> [b]
- diffs0MF :: (Eq a, Num a, Eq b, Num b, Functor f, Foldable f) => (forall tag. [Tower tag a] -> f (Tower tag b)) -> [[a]] -> [f b]
- diff :: (Eq a, Num a, Eq b, Num b) => (forall tag. Tower tag a -> Tower tag b) -> a -> b
- diff2 :: (Eq a, Num a, Eq b, Num b) => (forall tag. Tower tag a -> Tower tag b) -> a -> (b, b)
- diffs :: (Eq a, Num a, Eq b, Num b) => (forall tag. Tower tag a -> Tower tag b) -> a -> [b]
- diffs0 :: (Eq a, Num a, Eq b, Num b) => (forall tag. Tower tag a -> Tower tag b) -> a -> [b]
- grad :: (Eq a, Num a, Eq b, Num b) => (forall tag. [Tower tag a] -> Tower tag b) -> [a] -> [b]
- jacobian :: (Eq a, Num a, Eq b, Num b) => (forall tag. [Tower tag a] -> [Tower tag b]) -> [a] -> [[b]]
- zeroNewton :: (Eq a, Fractional a) => (forall tag. Tower tag a -> Tower tag a) -> a -> [a]
- inverseNewton :: (Eq a, Fractional a) => (forall tag. Tower tag a -> Tower tag a) -> a -> a -> [a]
- fixedPointNewton :: (Eq a, Fractional a) => (forall tag. Tower tag a -> Tower tag a) -> a -> [a]
- extremumNewton :: (Eq a, Fractional a) => (forall tag tag1. Tower tag1 (Tower tag a) -> Tower tag1 (Tower tag a)) -> a -> [a]
- argminNaiveGradient :: (Eq a, Fractional a, Ord a) => (forall tag. [Tower tag a] -> Tower tag a) -> [a] -> [[a]]
- primalUU :: (Eq a, Num a, Eq b, Num b) => (forall tag. Tower tag a -> Tower tag b) -> a -> b
- primalUF :: (Eq a, Num a, Eq b, Num b, Functor fb) => (forall tag. Tower tag a -> fb (Tower tag b)) -> a -> fb b
- primalFU :: (Eq a, Num a, Eq b, Num b, Functor fa) => (forall tag. fa (Tower tag a) -> Tower tag b) -> fa a -> b
- primalFF :: (Eq a, Num a, Eq b, Num b, Functor fa, Functor fb) => (forall tag. fa (Tower tag a) -> fb (Tower tag b)) -> fa a -> fb b
- taylor :: (Eq a, Fractional a) => (forall tag. Tower tag a -> Tower tag a) -> a -> a -> [a]
- taylor2 :: (Eq a, Fractional a) => (forall tag0 tag. Tower tag0 (Tower tag a) -> Tower tag0 (Tower tag a) -> Tower tag0 (Tower tag a)) -> a -> a -> a -> a -> [[a]]
Derivative Towers: Higher-Order Generalized Dual Numbers
The Tower
type is a concrete representation of a higher-order
Dual number, meaning a number augmented with a tower of
derivatives. These generalize the Dual numbers of Clifford (1873),
which hold only a first derivative. They can be converted to
formal power series via division by the sequence of factorials.
(Enum a, Eq a, Num a, Ord a) => Enum (Tower tag a) | |
(Eq a, Num a) => Eq (Tower tag a) | |
(Fractional (Tower tag a), Eq a, Floating a) => Floating (Tower tag a) | |
(Num (Tower tag a), Eq a, Fractional a) => Fractional (Tower tag a) | |
(Real (Tower tag a), Enum (Tower tag a), Integral a) => Integral (Tower tag a) | |
(Eq a, Num a) => Num (Tower tag a) | |
(Eq (Tower tag a), Ord a, Eq a, Num a) => Ord (Tower tag a) | |
(Num (Tower tag a), Ord (Tower tag a), Real a) => Real (Tower tag a) | |
(RealFrac (Tower tag a), Floating (Tower tag a), RealFloat a, RealFrac a) => RealFloat (Tower tag a) | |
(Real (Tower tag a), Fractional (Tower tag a), RealFrac a) => RealFrac (Tower tag a) | |
Show a => Show (Tower tag a) |
First-Order Differentiation Operators
These have two-letter suffices for the arity of the input and output of the passed functions: U for univariate, meaning a number, M for multivariate, meaning a list of numbers.
When the input is multivariate a directional derivative is calculated; this requires an additional "direction" parameter. The multivariate case is treated as a list (on input) and as a functor of arbitrary shape, which includes lists as a special case, on output.
Naming convention:
diff{U/M}{U/F}
- Derivative-taking operators that return a primal/first-derivative pair, for all combinations of scalar/nonscalar input & output
diff2{U/M}{U/F}
- Derivative-taking operators that calculate a (primal, first-derivative) pair, for all combinations of scalar/nonscalar input & output
diffUU :: (Eq a, Num a, Eq b, Num b) => (forall tag. Tower tag a -> Tower tag b) -> a -> bSource
The diffUU
function calculates the first derivative of a
scalar-to-scalar function.
diffUF :: (Eq a, Num a, Eq b, Num b, Functor f) => (forall tag. Tower tag a -> f (Tower tag b)) -> a -> f bSource
The diffUF
function calculates the first derivative of
scalar-to-nonscalar function.
diffMU :: (Eq a, Num a, Eq b, Num b) => (forall tag. [Tower tag a] -> Tower tag b) -> [a] -> [a] -> bSource
The diffMU
function calculate the product of the Jacobian of a
nonscalar-to-scalar function with a given vector. Aka: directional
derivative.
diffMF :: (Eq a, Num a, Eq b, Num b, Functor f) => (forall tag. [Tower tag a] -> f (Tower tag b)) -> [a] -> [a] -> f bSource
The diffMF
function calculates the product of the Jacobian of a
nonscalar-to-nonscalar function with a given vector. Aka:
directional derivative.
diff2UU :: (Eq a, Num a, Eq b, Num b) => (forall tag. Tower tag a -> Tower tag b) -> a -> (b, b)Source
The diff2UU
function calculates the value and derivative, as a
pair, of a scalar-to-scalar function.
diff2UF :: (Eq a, Num a, Eq b, Num b, Functor f) => (forall tag. Tower tag a -> f (Tower tag b)) -> a -> (f b, f b)Source
The diffUF2
function calculates the value and derivative, as a
pair, of a scalar-to-nonscalar function.
diff2MU :: (Eq a, Num a, Eq b, Num b) => (forall tag. [Tower tag a] -> Tower tag b) -> [a] -> [a] -> (b, b)Source
The diffMU2
function calculates the value and directional
derivative, as a pair, of a nonscalar-to-scalar function.
diff2MF :: (Eq a, Num a, Eq b, Num b, Functor f) => (forall tag. [Tower tag a] -> f (Tower tag b)) -> [a] -> [a] -> (f b, f b)Source
The diffMF2
function calculates the value and directional
derivative, as a pair, of a nonscalar-to-nonscalar function.
Higher-Order Differentiation Operators
Naming convention:
diffs{U/M}{U/F}
- : Derivative-taking operators that return a list [primal, first-derivative, 2nd-derivative, ...], for all combinations of scalar/nonscalar input & output.
diffsUU :: (Eq a, Num a, Eq b, Num b) => (forall tag. Tower tag a -> Tower tag b) -> a -> [b]Source
The diffsUU
function calculates a list of derivatives of a
scalar-to-scalar function. The 0-th element of the list is the
primal value, the 1-st element is the first derivative, etc.
diffsUF :: (Eq a, Num a, Eq b, Num b, Functor f, Foldable f) => (forall tag. Tower tag a -> f (Tower tag b)) -> a -> [f b]Source
The diffsUF
function calculates an infinite list of derivatives
of a scalar-to-nonscalar function. The 0-th element of the list is
the primal value, the 1-st element is the first derivative, etc.
diffsMU :: (Eq a, Num a, Eq b, Num b) => (forall tag. [Tower tag a] -> Tower tag b) -> [[a]] -> [b]Source
The diffsMU
function calculates an infinite list of derivatives
of a nonscalar-to-scalar function. The 0-th element of the list is
the primal value, the 1-st element is the first derivative, etc.
The input is a (possibly truncated) list of the primal, first
derivative, etc, of the input.
diffsMF :: (Eq a, Num a, Eq b, Num b, Functor f, Foldable f) => (forall tag. [Tower tag a] -> f (Tower tag b)) -> [[a]] -> [f b]Source
The diffsMF
function calculates an infinite list of derivatives
of a nonscalar-to-nonscalar function. The 0-th element of the list
is the primal value, the 1-st element is the first derivative, etc.
The input is a (possibly truncated) list of the primal, first
derivative, etc, of the input.
diffs0UU :: (Eq a, Num a, Eq b, Num b) => (forall tag. Tower tag a -> Tower tag b) -> a -> [b]Source
diffs0UF :: (Eq a, Num a, Eq b, Num b, Functor f, Foldable f) => (forall tag. Tower tag a -> f (Tower tag b)) -> a -> [f b]Source
diffs0MU :: (Eq a, Num a, Eq b, Num b) => (forall tag. [Tower tag a] -> Tower tag b) -> [[a]] -> [b]Source
diffs0MF :: (Eq a, Num a, Eq b, Num b, Functor f, Foldable f) => (forall tag. [Tower tag a] -> f (Tower tag b)) -> [[a]] -> [f b]Source
Common access patterns
diff2 :: (Eq a, Num a, Eq b, Num b) => (forall tag. Tower tag a -> Tower tag b) -> a -> (b, b)Source
grad :: (Eq a, Num a, Eq b, Num b) => (forall tag. [Tower tag a] -> Tower tag b) -> [a] -> [b]Source
The grad
function calculates the gradient of a
nonscalar-to-scalar function, using n invocations of forward AD,
where n is the input dimmensionality. NOTE: this is O(n)
inefficient as compared to reverse AD.
jacobian :: (Eq a, Num a, Eq b, Num b) => (forall tag. [Tower tag a] -> [Tower tag b]) -> [a] -> [[b]]Source
The jacobian
function calcualtes the Jacobian of a
nonscalar-to-nonscalar function, using n invocations of forward AD,
where n is the input dimmensionality.
Optimization Routines
zeroNewton :: (Eq a, Fractional a) => (forall tag. Tower tag a -> Tower tag a) -> a -> [a]Source
The zeroNewton
function finds a zero of a scalar function using
Newton's method; its output is a stream of increasingly accurate
results. (Modulo the usual caveats.)
TEST CASE:
take 10 $ zeroNewton (\x->x^2-4) 1 -- converge to 2.0
TEST CASE
:module Data.Complex Numeric.FAD
take 10 $ zeroNewton ((+1).(^2)) (1 :+ 1) -- converge to (0 :+ 1)
inverseNewton :: (Eq a, Fractional a) => (forall tag. Tower tag a -> Tower tag a) -> a -> a -> [a]Source
The inverseNewton
function inverts a scalar function using
Newton's method; its output is a stream of increasingly accurate
results. (Modulo the usual caveats.)
TEST CASE:
take 10 $ inverseNewton sqrt 1 (sqrt 10) -- converge to 10
fixedPointNewton :: (Eq a, Fractional a) => (forall tag. Tower tag a -> Tower tag a) -> a -> [a]Source
The fixedPointNewton
function find a fixedpoint of a scalar
function using Newton's method; its output is a stream of
increasingly accurate results. (Modulo the usual caveats.)
extremumNewton :: (Eq a, Fractional a) => (forall tag tag1. Tower tag1 (Tower tag a) -> Tower tag1 (Tower tag a)) -> a -> [a]Source
The extremumNewton
function finds an extremum of a scalar
function using Newton's method; produces a stream of increasingly
accurate results. (Modulo the usual caveats.)
argminNaiveGradient :: (Eq a, Fractional a, Ord a) => (forall tag. [Tower tag a] -> Tower tag a) -> [a] -> [[a]]Source
The argminNaiveGradient
function performs a multivariate
optimization, based on the naive-gradient-descent in the file
stalingrad/examples/flow-tests/pre-saddle-1a.vlad
from the
VLAD compiler Stalingrad sources. Its output is a stream of
increasingly accurate results. (Modulo the usual caveats.) The
gradient is calculated using Forward AD, which is O(n) inefficient
as compared to Reverse AD, where n is the input dimensionality.
unlift lifted functions
primalUU :: (Eq a, Num a, Eq b, Num b) => (forall tag. Tower tag a -> Tower tag b) -> a -> bSource
The primalUU
function lowers a function over dual numbers to a
function in the primal domain, where the function is
scalar-to-scalar.
primalUF :: (Eq a, Num a, Eq b, Num b, Functor fb) => (forall tag. Tower tag a -> fb (Tower tag b)) -> a -> fb bSource
The primalUF
function lowers a function over dual numbers to a
function over primals, where the function is scalar-to-nonscalar.
primalFU :: (Eq a, Num a, Eq b, Num b, Functor fa) => (forall tag. fa (Tower tag a) -> Tower tag b) -> fa a -> bSource
The primalFU
function lowers a function over dual numbers to a
function over primals where the function is nonscalar-to-scalar.
primalFF :: (Eq a, Num a, Eq b, Num b, Functor fa, Functor fb) => (forall tag. fa (Tower tag a) -> fb (Tower tag b)) -> fa a -> fb bSource
The primalFF
function lowers a function over dual numbers to a
function over primals where the function is nonscalar-to-nonscalar.
Miscellaneous
taylor :: (Eq a, Fractional a) => (forall tag. Tower tag a -> Tower tag a) -> a -> a -> [a]Source
The taylor
function evaluate a Taylor series of the given
function around the given point with the given delta. It returns a
list of increasingly higher-order approximations.
EXAMPLE: taylor exp 0 1