Copyright	[2009..2017] Trevor L. McDonell
License	BSD
Safe Haskell	None
Language	Haskell98

Foreign.CUDA.Runtime.Exec

Contents

Kernel Execution

Description

Kernel execution control for the C-for-CUDA runtime interface.

Kernel Execution

type Fun = FunPtr ()

A global device function.

Note that the use of a string naming a function was deprecated in CUDA 4.1 and removed in CUDA 5.0.

data FunAttributes

Constructors

FunAttributes
Fields

constSizeBytes :: !Int64
localSizeBytes :: !Int64
sharedSizeBytes :: !Int64
maxKernelThreadsPerBlock :: !Int	maximum block size that can be successfully launched (based on register usage)
numRegs :: !Int	number of registers required for each thread

Instances

Show FunAttributes
Methods
  showsPrec :: Int -> FunAttributes -> ShowS
  show :: FunAttributes -> String
  showList :: [FunAttributes] -> ShowS

Storable FunAttributes
Methods
  sizeOf :: FunAttributes -> Int
  alignment :: FunAttributes -> Int
  peekElemOff :: Ptr FunAttributes -> Int -> IO FunAttributes
  pokeElemOff :: Ptr FunAttributes -> Int -> FunAttributes -> IO ()
  peekByteOff :: Ptr b -> Int -> IO FunAttributes
  pokeByteOff :: Ptr b -> Int -> FunAttributes -> IO ()
  peek :: Ptr FunAttributes -> IO FunAttributes
  poke :: Ptr FunAttributes -> FunAttributes -> IO ()

data FunParam where

Kernel function parameters. Doubles will be converted to an internal float representation on devices that do not support doubles natively.

Constructors

IArg :: !Int -> FunParam
FArg :: !Float -> FunParam
DArg :: !Double -> FunParam
VArg :: Storable a => !a -> FunParam
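The conversion itself happens inside the library and driver, and its details are not shown here. As a rough host-side illustration of what narrowing a double to a float can cost in precision, the sketch below round-trips a value through Float. The helper names narrowAndWiden and survivesNarrowing are hypothetical and not part of this module.

```haskell
-- Host-side illustration only: this does not call into CUDA.
-- Narrow a Double to a Float, then widen it back to a Double.
narrowAndWiden :: Double -> Double
narrowAndWiden x = realToFrac (realToFrac x :: Float)

-- True when the value is unchanged by the Double -> Float -> Double round trip.
survivesNarrowing :: Double -> Bool
survivesNarrowing x = narrowAndWiden x == x
```

A value such as 0.5 is exactly representable as a Float and survives the round trip, whereas 0.1 does not. Values passed with DArg may lose precision in the same way on devices without native double support.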

data CacheConfig

Cache configuration preference.

Constructors

None
Shared
L1
Equal

Instances

Enum CacheConfig
Methods
  succ :: CacheConfig -> CacheConfig
  pred :: CacheConfig -> CacheConfig
  toEnum :: Int -> CacheConfig
  fromEnum :: CacheConfig -> Int
  enumFrom :: CacheConfig -> [CacheConfig]
  enumFromThen :: CacheConfig -> CacheConfig -> [CacheConfig]
  enumFromTo :: CacheConfig -> CacheConfig -> [CacheConfig]
  enumFromThenTo :: CacheConfig -> CacheConfig -> CacheConfig -> [CacheConfig]

Eq CacheConfig
Methods
  (==) :: CacheConfig -> CacheConfig -> Bool
  (/=) :: CacheConfig -> CacheConfig -> Bool

Show CacheConfig
Methods
  showsPrec :: Int -> CacheConfig -> ShowS
  show :: CacheConfig -> String
  showList :: [CacheConfig] -> ShowS

attributes :: Fun -> IO FunAttributes

Obtain the attributes of the given global device function. This itemises the requirements to successfully launch the kernel.
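One use of these attributes is to validate a block shape before launching. The following is a minimal sketch in plain Haskell. It takes the limit as an Int, which in practice would be read from the maxKernelThreadsPerBlock field of the result of attributes. The helper names blockThreads and fitsLimit are hypothetical and not part of this module.

```haskell
-- Number of threads in one block of shape (tx, ty, tz).
blockThreads :: (Int, Int, Int) -> Int
blockThreads (tx, ty, tz) = tx * ty * tz

-- True when every dimension is positive and the block's thread count does
-- not exceed the given limit (assumed to come from maxKernelThreadsPerBlock).
fitsLimit :: Int -> (Int, Int, Int) -> Bool
fitsLimit limit shape@(tx, ty, tz) =
  all (> 0) [tx, ty, tz] && blockThreads shape <= limit
```

With a limit of 1024, a (16, 16, 4) block is accepted. With a limit of 512, a (32, 32, 1) block is rejected.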

setConfig

Arguments

:: (Int, Int)	grid dimensions
-> (Int, Int, Int)	block dimensions
-> Int64	shared memory per block (bytes)
-> Maybe Stream	associated processing stream
-> IO ()

Specify the grid and block dimensions for a device call. Used in conjunction with setParams, this pushes data onto the execution stack that will be popped when a function is launched.
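The shared memory argument is a byte count, so it is usually derived from the block shape and an element size. The sketch below assumes a kernel that needs one element of shared memory per thread, which is only one possible layout. The helper name sharedBytesPerBlock is hypothetical and not part of this module.

```haskell
import Data.Int (Int64)
import Foreign.Storable (Storable, sizeOf)

-- Bytes needed to give every thread of a block one element of type 'a'.
-- The element argument is used only to select the type.
-- Assumption: the kernel uses exactly one shared element per thread.
sharedBytesPerBlock :: Storable a => a -> (Int, Int, Int) -> Int64
sharedBytesPerBlock element (tx, ty, tz) =
  fromIntegral (tx * ty * tz) * fromIntegral (sizeOf element)
```

For a (16, 16, 1) block of Float elements this gives 1024 bytes, a value that could then be passed as the third argument of setConfig.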

setParams :: [FunParam] -> IO ()

Set the argument parameters that will be passed to the next kernel invocation. This is used in conjunction with setConfig to control kernel execution.

setCacheConfig :: Fun -> CacheConfig -> IO ()

On devices where the L1 cache and shared memory use the same hardware resources, this sets the preferred cache configuration for the given device function. This is only a preference; the driver is free to choose a different configuration as required to execute the function.

Switching between configuration modes may insert a device-side synchronisation point for streamed kernel launches.

launch :: Fun -> IO ()

Invoke the global kernel function on the device. This must be preceded by a call to setConfig and (if appropriate) setParams.

launchKernel

Arguments

:: Fun	Device function symbol
-> (Int, Int)	grid dimensions
-> (Int, Int, Int)	thread block shape
-> Int64	shared memory per block (bytes)
-> Maybe Stream	(optional) execution stream
-> [FunParam]
-> IO ()

Invoke a kernel on a (gx * gy) grid of blocks, where each block contains (tx * ty * tz) threads and has access to a given number of bytes of shared memory. The launch may also be associated with a specific Stream.
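The grid and block arguments are typically computed from the problem size. The sketch below shows one common way to do this for a one-dimensional problem, using ceiling division so that every element is covered by at least one thread. The helper names blocksFor, geometry1D and totalThreads are hypothetical and not part of this module, and the sketch assumes a positive element count and block size.

```haskell
-- Number of blocks needed to cover n elements (ceiling division).
-- Assumes n > 0 and threadsPerBlock > 0.
blocksFor :: Int -> Int -> Int
blocksFor n threadsPerBlock = (n + threadsPerBlock - 1) `div` threadsPerBlock

-- One-dimensional launch geometry: (grid dimensions, thread block shape),
-- in the same tuple shapes that launchKernel expects.
geometry1D :: Int -> Int -> ((Int, Int), (Int, Int, Int))
geometry1D n threadsPerBlock =
  ((blocksFor n threadsPerBlock, 1), (threadsPerBlock, 1, 1))

-- Total number of threads started by a launch: (gx * gy) * (tx * ty * tz).
totalThreads :: (Int, Int) -> (Int, Int, Int) -> Int
totalThreads (gx, gy) (tx, ty, tz) = gx * gy * tx * ty * tz
```

For 1000 elements and 256 threads per block this yields a (4, 1) grid of (256, 1, 1) blocks, or 1024 threads in total. Because that exceeds the element count, the kernel itself needs a bounds check.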