futhark-0.7.4: An optimising compiler for a functional, array-oriented language.

Safe Haskell: None
Language: Haskell2010

Futhark.CodeGen.OpenCL.Kernels

Documentation

data SizeHeuristic

A heuristic for setting the default value for something.

data DeviceType

The type of OpenCL device that this heuristic applies to.

Constructors

DeviceCPU 
DeviceGPU 

data WhichSize

A size that can be assigned a default.

data HeuristicValue

The value supplied by a heuristic can be a constant, or inferred from some device information.

sizeHeuristicsTable :: [SizeHeuristic]

All of our heuristics.
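
To make the role of such a table concrete, here is a minimal sketch of how a list of heuristics could be modelled and queried. Only DeviceCPU and DeviceGPU come from the documentation above. The other constructors, the record fields, the table entries, and lookupHeuristic are assumptions made up for illustration, and the real definitions in this module may differ.

```haskell
-- Illustrative model only. Apart from DeviceCPU and DeviceGPU, every
-- constructor, field, and value below is an assumption for this sketch.
data DeviceType = DeviceCPU | DeviceGPU
  deriving (Eq, Show)

-- Hypothetical sizes that could be given a default.
data WhichSize = GroupSize | NumGroups
  deriving (Eq, Show)

-- A constant, or the name of a device property to query (hypothetical).
data HeuristicValue
  = HeuristicConst Int
  | HeuristicDeviceInfo String
  deriving (Eq, Show)

data SizeHeuristic = SizeHeuristic
  { heuristicDevice :: DeviceType
  , heuristicSize   :: WhichSize
  , heuristicValue  :: HeuristicValue
  } deriving (Eq, Show)

-- Made-up entries; these are not the contents of sizeHeuristicsTable.
exampleTable :: [SizeHeuristic]
exampleTable =
  [ SizeHeuristic DeviceGPU GroupSize (HeuristicConst 256)
  , SizeHeuristic DeviceCPU GroupSize (HeuristicConst 32)
  , SizeHeuristic DeviceCPU NumGroups (HeuristicDeviceInfo "MAX_COMPUTE_UNITS")
  ]

-- Return the value of the first heuristic matching the device type and size.
lookupHeuristic :: DeviceType -> WhichSize -> [SizeHeuristic] -> Maybe HeuristicValue
lookupHeuristic dev size table =
  case [ heuristicValue h
       | h <- table
       , heuristicDevice h == dev
       , heuristicSize h == size ] of
    (v : _) -> Just v
    []      -> Nothing
```

In this sketch the first matching entry wins, so more specific heuristics would need to be listed before more general ones. Whether the real table is consulted in that order is not stated in this documentation.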

mapTranspose :: ToIdent a => a -> Type -> TransposeType -> Func

mapTranspose name elem_type transpose_type: generate a transpose kernel with the requested name for elements of type elem_type. There is special support for input arrays with low width or low height, which can be indicated by transpose_type.

Normally, when transposing a [2][n] array, we would use a FUT_BLOCK_DIM x FUT_BLOCK_DIM group to process a [2][FUT_BLOCK_DIM] slice of the input array. This would leave many of the threads in the group inactive. We try to remedy this with a special kernel that processes a larger part of the input per group, at the cost of more complex indexing. In our example, all threads in a group could be used if each group processed a slice of each row that is FUT_BLOCK_DIM/2 times as large. The variable mulx holds this factor in the kernel that handles input arrays with low height.

See issue #308 on GitHub for more details.
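
The arithmetic behind the low-height case can be sketched as follows. The functions are hypothetical helpers, not part of this module, and the sketch assumes one element per thread and a widening factor equal to the block dimension divided by the array height, which is one reading of the description above.

```haskell
-- Hypothetical helpers for illustration; none of these exist in
-- Futhark.CodeGen.OpenCL.Kernels.

-- | Number of threads in a square group of side blockDim.
groupThreads :: Int -> Int
groupThreads blockDim = blockDim * blockDim

-- | Threads doing useful work when a group handles a
-- [height][blockDim] slice, one element per thread.
activeThreads :: Int -> Int -> Int
activeThreads blockDim height = min blockDim height * blockDim

-- | Assumed widening factor: how many times wider each group's slice of
-- a row must be to keep every thread busy. Integer division, so the
-- result is exact only when height divides blockDim.
lowHeightFactor :: Int -> Int -> Int
lowHeightFactor blockDim height = blockDim `quot` height

-- | Elements covered by one group once its slice has been widened.
widenedElements :: Int -> Int -> Int
widenedElements blockDim height =
  height * (blockDim * lowHeightFactor blockDim height)
```

Taking 16 as an assumed value for FUT_BLOCK_DIM and a [2][n] input, activeThreads 16 2 is 32 out of groupThreads 16 = 256 threads. With lowHeightFactor 16 2 = 8, each group covers widenedElements 16 2 = 256 elements, one per thread.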