accelerate: An embedded language for accelerated array processing

[ accelerate, bsd3, compilers-interpreters, concurrency, data, library, parallelism ]

Data.Array.Accelerate defines an embedded language of array computations for high-performance computing in Haskell. Computations on multi-dimensional, regular arrays are expressed in the form of parameterised collective operations, such as maps, reductions, and permutations. These computations may then be online-compiled and executed on a range of architectures.
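
To make the three kinds of collective operation concrete, here is a minimal sketch with one map, one reduction, and one permutation. It assumes the reference interpreter backend (Data.Array.Accelerate.Interpreter, shipped in the accelerate package) rather than a compiled backend. The names xs, doubled, total, and reversed and the input values are made up for illustration.

```haskell
import Data.Array.Accelerate             as A
import Data.Array.Accelerate.Interpreter (run)

-- Illustrative input: a three-element vector (values are arbitrary).
xs :: Vector Float
xs = fromList (Z :. 3) [1, 2, 3]

-- A map: double every element.
doubled :: Acc (Vector Float)
doubled = A.map (* 2) (use xs)

-- A reduction: sum all elements into a single scalar.
total :: Acc (Scalar Float)
total = A.sum (use xs)

-- A permutation of element order: reverse the vector.
reversed :: Acc (Vector Float)
reversed = A.reverse (use xs)

main :: IO ()
main = do
  print (toList (run doubled))   -- [2.0,4.0,6.0]
  print (toList (run total))     -- [6.0]
  print (toList (run reversed))  -- [3.0,2.0,1.0]
```

Names that clash with the Prelude (map, sum, reverse) are qualified with A; everything else is used unqualified.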

A simple example

As a simple example, consider the computation of a dot product of two vectors of floating point numbers:

dotp :: Acc (Vector Float) -> Acc (Vector Float) -> Acc (Scalar Float)
dotp xs ys = fold (+) 0 (zipWith (*) xs ys)

Except for the type, this code is almost the same as the corresponding Haskell code on lists of floats. The types indicate that the computation may be online-compiled for performance; for example, using Data.Array.Accelerate.LLVM.PTX it may be off-loaded to the GPU on the fly.
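
For comparison, the list version referred to above might look like the following sketch in plain Haskell. The name dotpList and the sample inputs are illustrative and not part of the package.

```haskell
-- Plain-Haskell dot product on lists, for comparison with the Accelerate version.
dotpList :: [Float] -> [Float] -> Float
dotpList xs ys = foldr (+) 0 (zipWith (*) xs ys)

main :: IO ()
main = print (dotpList [1, 2, 3] [4, 5, 6])  -- prints 32.0
```

Only the type signature and the choice of fold differ from the Accelerate definition; the list version runs directly in Haskell, whereas the Acc version describes a computation that a backend compiles and executes.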

See the Data.Array.Accelerate module for further information.

Additional components

The following supported add-ons are available as separate packages. Install them from Hackage with cabal install <package>.

  • accelerate-llvm-native: Backend supporting parallel execution on multicore CPUs.

  • accelerate-llvm-ptx: Backend supporting parallel execution on CUDA-capable NVIDIA GPUs. Requires a GPU with compute capability 2.0 or greater. See the following table for supported GPUs: http://en.wikipedia.org/wiki/CUDA#Supported_GPUs

  • accelerate-examples: Computational kernels and applications showcasing the use of Accelerate as well as a regression test suite, supporting function and performance testing.

  • accelerate-io: Fast conversions between Accelerate arrays and other array formats (including vector and repa).

  • accelerate-fft: Discrete Fourier transforms, with FFI bindings to optimised implementations.

  • accelerate-bignum: Fixed-width large integer arithmetic.

  • colour-accelerate: Colour representations in Accelerate (RGB, sRGB, HSV, and HSL).

  • gloss-accelerate: Generate gloss pictures from Accelerate.

  • gloss-raster-accelerate: Parallel rendering of raster images and animations.

  • lens-accelerate: Lens operators for Accelerate types.

  • linear-accelerate: Linear vector spaces in Accelerate.

  • mwc-random-accelerate: Generate Accelerate arrays filled with high quality pseudorandom numbers.

Examples and documentation

Haddock documentation is included in the package.

The accelerate-examples package demonstrates a range of computational kernels and several complete applications, including:

  • An implementation of the Canny edge detection algorithm

  • An interactive Mandelbrot set generator

  • A particle-based simulation of stable fluid flows

  • An n-body simulation of gravitational attraction between solid particles

  • An implementation of the PageRank algorithm

  • A simple interactive ray tracer

  • A cellular automata simulation

  • A "password recovery" tool, for dictionary lookup of MD5 hashes

lulesh-accelerate is an implementation of the Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics (LULESH) mini-app. LULESH represents a typical hydrodynamics code such as ALE3D, but is highly simplified and hard-coded to solve the Sedov blast problem on an unstructured hexahedron mesh.

Flags

Automatic Flags
Name | Description | Default
debug

Enable debug tracing messages. The following options are read from the environment variable ACCELERATE_FLAGS, and via the command-line as:

./program +ACC ... -ACC

Note that a backend may not implement (or be applicable to) all options.

The following flags control phases of the compiler. They are enabled with -f<flag> and can be reversed with -fno-<flag>:

  • acc-sharing: Enable sharing recovery of array expressions (True).

  • exp-sharing: Enable sharing recovery of scalar expressions (True).

  • fusion: Enable array fusion (True).

  • simplify: Enable program simplification phase (True).

  • flush-cache: Clear any persistent caches on program startup (False).

  • force-recomp: Force recompilation of array programs (False).

  • fast-math: Allow algebraically equivalent transformations which may change floating point results (e.g., reassociate) (True).

The following options control debug message output, and are enabled with -d<flag>.

  • verbose: Be extra chatty.

  • dump-phases: Print timing information about each phase of the compiler. Enable GC stats (+RTS -t or otherwise) for memory usage information.

  • dump-sharing: Print information related to sharing recovery.

  • dump-simpl-stats: Print statistics related to fusion & simplification.

  • dump-simpl-iterations: Print a summary after each simplifier iteration.

  • dump-vectorisation: Print information related to the vectoriser.

  • dump-dot: Generate a representation of the program graph in Graphviz DOT format.

  • dump-simpl-dot: Generate a more compact representation of the program graph in Graphviz DOT format. In particular, scalar expressions are elided.

  • dump-gc: Print information related to the Accelerate garbage collector.

  • dump-gc-stats: Print aggregate garbage collection information at the end of program execution.

  • debug-cc: Include debug symbols in the generated and compiled kernels.

  • dump-cc: Print information related to kernel code generation/compilation. Print the generated code if verbose.

  • dump-ld: Print information related to runtime linking.

  • dump-asm: Print information related to kernel assembly. Print the assembled code if verbose.

  • dump-exec: Print information related to program execution.

  • dump-sched: Print information related to execution scheduling.

Default: Disabled
ekg

Enable hooks for monitoring the running application using EKG. Implies debug mode. In order to view the metrics, your application will need to call Data.Array.Accelerate.Debug.beginMonitoring before running any Accelerate computations. This will launch the server on the local machine at port 8000.

Alternatively, if you wish to configure the EKG monitoring server you can initialise it like so:

import Data.Array.Accelerate.Debug

import System.Metrics
import System.Remote.Monitoring

main :: IO ()
main = do
  store  <- initAccMetrics
  registerGcMetrics store      -- optional

  server <- forkServerWith store "localhost" 8000

  ...

Note that, as with any program utilising EKG, in order to collect Haskell GC statistics, you must either run the program with:

+RTS -T -RTS

or compile it with:

-with-rtsopts=-T
Default: Disabled
bounds-checks

Enable bounds checking

Default: Enabled
unsafe-checks

Enable bounds checking in unsafe operations

Default: Disabled
internal-checks

Enable internal consistency checks

Default: Disabled
nofib

You can disable building the nofib test suite with this flag. Disabling this is an unsupported configuration, but is useful for accelerating builds.

Default: Enabled

Use -f <flag> to enable a flag, or -f -<flag> to disable that flag.

Versions: 0.4.0, 0.5.0.0, 0.6.0.0, 0.7.1.0, 0.8.0.0, 0.8.1.0, 0.9.0.0, 0.9.0.1, 0.10.0.0, 0.12.0.0, 0.12.1.0, 0.12.2.0, 0.13.0.0, 0.13.0.1, 0.13.0.2, 0.13.0.3, 0.13.0.4, 0.13.0.5, 0.14.0.0, 0.15.0.0, 0.15.1.0, 1.0.0.0, 1.1.0.0, 1.1.1.0, 1.2.0.0, 1.2.0.1, 1.3.0.0
Change log: CHANGELOG.md
Dependencies: ansi-terminal (>=0.6.2), ansi-wl-pprint (>=0.6), async (>=2.0), base (>=4.7 && <4.13), base-orphans (>=0.3), bytestring (>=0.10.2), constraints (>=0.9), containers (>=0.3), cryptonite (>=0.21), deepseq (>=1.3), directory (>=1.0), ekg (>=0.1), ekg-core (>=0.1), exceptions (>=0.6), filepath (>=1.0), ghc-prim, half (>=0.2), hashable (>=1.1), hashtables (>=1.2.3), hedgehog (>=0.5), lens (>=4.0), mtl (>=2.0), tasty (>=0.11), tasty-expected-failure (>=0.11), tasty-hedgehog (>=0.1), tasty-hunit (>=0.9), template-haskell, terminal-size (>=0.3), text (>=1.0), th-lift-instances (>=0.1), transformers (>=0.3), unique, unix, unordered-containers (>=0.2), vector (>=0.10), Win32
License: BSD-3-Clause
Author: Manuel M T Chakravarty, Robert Clifton-Everest, Gabriele Keller, Ben Lever, Trevor L. McDonell, Ryan Newton, Sean Seefried
Maintainer: Trevor L. McDonell <tmcdonell@cse.unsw.edu.au>
Category: Compilers/Interpreters, Concurrency, Data, Parallelism
Home page: https://github.com/AccelerateHS/accelerate/
Bug tracker: https://github.com/AccelerateHS/accelerate/issues
Source repo (head): git clone git://github.com/AccelerateHS/accelerate.git
Source repo (this): git clone git://github.com/AccelerateHS/accelerate.git (tag v1.2.0.1)
Uploaded: by TrevorMcDonell at 2018-10-07T11:30:26Z
Reverse Dependencies: 44 direct, 10 indirect
Downloads: 31840 total (69 in the last 30 days)
Rating: 2.5 (votes: 6) [estimated by Bayesian average]
Status: Docs available
Last success reported on 2018-10-07

Readme for accelerate-1.2.0.1

An Embedded Language for Accelerated Array Computations

Data.Array.Accelerate defines an embedded language of array computations for high-performance computing in Haskell. Computations on multi-dimensional, regular arrays are expressed in the form of parameterised collective operations (such as maps, reductions, and permutations). These computations are online-compiled and executed on a range of architectures.

For more details, see our papers, which are listed under "Citing Accelerate" below.

There are also slides from some fairly recent presentations:

Chapter 6 of Simon Marlow's book Parallel and Concurrent Programming in Haskell contains a tutorial introduction to Accelerate.

Trevor's PhD thesis details the design and implementation of the frontend optimisations and the CUDA backend.

A simple example

As a simple example, consider the computation of a dot product of two vectors of single-precision floating-point numbers:

dotp :: Acc (Vector Float) -> Acc (Vector Float) -> Acc (Scalar Float)
dotp xs ys = fold (+) 0 (zipWith (*) xs ys)

Except for the type, this code is almost the same as the corresponding Haskell code on lists of floats. The types indicate that the computation may be online-compiled for performance; for example, using Data.Array.Accelerate.LLVM.PTX.run it may be off-loaded to a GPU on the fly.
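
A minimal sketch of running dotp end to end follows. It assumes the reference interpreter in Data.Array.Accelerate.Interpreter (part of the accelerate package) instead of the PTX backend, so that no GPU or extra package is needed. Importing run from Data.Array.Accelerate.LLVM.PTX (package accelerate-llvm-ptx) instead should target a GPU, though that variant is not shown here. The input vectors are made up for illustration.

```haskell
import Data.Array.Accelerate             as A
import Data.Array.Accelerate.Interpreter (run)

dotp :: Acc (Vector Float) -> Acc (Vector Float) -> Acc (Scalar Float)
dotp xs ys = A.fold (+) 0 (A.zipWith (*) xs ys)

main :: IO ()
main = do
  -- Illustrative host-side input arrays.
  let xs = fromList (Z :. 3) [1, 2, 3] :: Vector Float
      ys = fromList (Z :. 3) [4, 5, 6] :: Vector Float
  -- 'use' embeds the host arrays into the Acc computation; 'run' executes it.
  print (toList (run (dotp (use xs) (use ys))))  -- [32.0]
```

The Acc type marks the boundary: dotp only builds a description of the computation, and nothing is evaluated until a backend's run function is applied to it.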

Availability

Package accelerate is available from

  • Hackage: accelerate - install with cabal install accelerate
  • GitHub: AccelerateHS/accelerate - get the source with git clone https://github.com/AccelerateHS/accelerate.git. The easiest way to compile the source distributions is via the Haskell stack tool.

Additional components

The supported add-ons are available as separate packages; they are listed under "Additional components" in the package description above.

Install them from Hackage with cabal install PACKAGENAME.

Documentation

  • Haddock documentation is included and linked with the individual package releases on Hackage.
  • Haddock documentation for in-development components can be found here.
  • The idea behind the HOAS (higher-order abstract syntax) to de-Bruijn conversion used in the library is described separately.

Examples

accelerate-examples

The accelerate-examples package provides a range of computational kernels and a few complete applications. To install these from Hackage, issue cabal install accelerate-examples. The examples include:

  • An implementation of Canny edge detection
  • An interactive Mandelbrot set generator
  • An N-body simulation of gravitational attraction between solid particles
  • An implementation of the PageRank algorithm
  • A simple ray-tracer
  • A particle-based simulation of stable fluid flows
  • A cellular automata simulation
  • A "password recovery" tool, for dictionary lookup of MD5 hashes

LULESH

LULESH-accelerate is an implementation of the Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics (LULESH) mini-app. LULESH represents a typical hydrodynamics code such as ALE3D, but is a highly simplified application, hard-coded to solve the Sedov blast problem on an unstructured hexahedron mesh.

Λ ○ λ (Lol)

Λ ○ λ (Lol) is a general-purpose library for ring-based lattice cryptography. Lol has applications in, for example, symmetric-key somewhat-homomorphic encryption schemes. The lol-accelerate package provides an Accelerate backend for Lol.

Additional examples

Accelerate users have also built some substantial applications of their own. Please feel free to add your own examples!

  • Henning Thielemann, patch-image: Combine a collage of overlapping images
  • apunktbau, bildpunkt: A ray-marching distance field renderer
  • klarh, hasdy: Molecular dynamics in Haskell using Accelerate
  • Alexandros Gremm used Accelerate as part of the 2014 CSCS summer school (code)

Mailing list and contacts

The maintainers of Accelerate are Manuel M T Chakravarty <chak@cse.unsw.edu.au> and Trevor L McDonell <tmcdonell@cse.unsw.edu.au>.

Citing Accelerate

If you use Accelerate for academic research, you are encouraged (though not required) to cite the following papers (BibTeX):

  • Manuel M. T. Chakravarty, Gabriele Keller, Sean Lee, Trevor L. McDonell, and Vinod Grover. Accelerating Haskell Array Codes with Multicore GPUs. In DAMP '11: Declarative Aspects of Multicore Programming, ACM, 2011.

  • Trevor L. McDonell, Manuel M. T. Chakravarty, Gabriele Keller, and Ben Lippmeier. Optimising Purely Functional GPU Programs. In ICFP '13: The 18th ACM SIGPLAN International Conference on Functional Programming, ACM, 2013.

  • Robert Clifton-Everest, Trevor L. McDonell, Manuel M. T. Chakravarty, and Gabriele Keller. Embedding Foreign Code. In PADL '14: The 16th International Symposium on Practical Aspects of Declarative Languages, Springer-Verlag, LNCS, 2014.

  • Trevor L. McDonell, Manuel M. T. Chakravarty, Vinod Grover, and Ryan R. Newton. Type-safe Runtime Code Generation: Accelerate to LLVM. In Haskell '15: The 8th ACM SIGPLAN Symposium on Haskell, ACM, 2015.

Accelerate is primarily developed by academics, so citations matter a lot to us. As an added benefit, you increase Accelerate's exposure and potential user (and developer!) base, which is a benefit to all users of Accelerate. Thanks in advance!

What's missing?

Here is a list of features that are currently missing:

  • Preliminary API (parts of the API may still change in subsequent releases)