Copyright	[2009..2017] Trevor L. McDonell
License	BSD
Safe Haskell	None
Language	Haskell98

Foreign.CUDA.Runtime.Marshal

Contents

Host Allocation
Device Allocation
Unified Memory Allocation
Marshalling
Combined Allocation and Marshalling
Utility

Description

Memory management for CUDA devices

Host Allocation

data AllocFlag

Options for host allocation

Constructors

Portable
DeviceMapped
WriteCombined

Instances

Bounded AllocFlag
Methods minBound :: AllocFlag; maxBound :: AllocFlag
Enum AllocFlag
Methods succ :: AllocFlag -> AllocFlag; pred :: AllocFlag -> AllocFlag; toEnum :: Int -> AllocFlag; fromEnum :: AllocFlag -> Int; enumFrom :: AllocFlag -> [AllocFlag]; enumFromThen :: AllocFlag -> AllocFlag -> [AllocFlag]; enumFromTo :: AllocFlag -> AllocFlag -> [AllocFlag]; enumFromThenTo :: AllocFlag -> AllocFlag -> AllocFlag -> [AllocFlag]
Eq AllocFlag
Methods (==) :: AllocFlag -> AllocFlag -> Bool; (/=) :: AllocFlag -> AllocFlag -> Bool
Show AllocFlag
Methods showsPrec :: Int -> AllocFlag -> ShowS; show :: AllocFlag -> String; showList :: [AllocFlag] -> ShowS

mallocHostArray :: Storable a => [AllocFlag] -> Int -> IO (HostPtr a)

Allocate a section of linear memory on the host which is page-locked and directly accessible from the device. The storage is sufficient to hold the given number of elements of a storable type. The runtime system automatically accelerates calls to functions such as peekArrayAsync and pokeArrayAsync that refer to page-locked memory.

Note that because this reduces the amount of pageable memory available, overall system performance may suffer. It is best used sparingly, to allocate staging areas for data exchange.
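One way to keep page-locked allocations short-lived is to pair mallocHostArray with freeHost in a bracket. The following is a minimal sketch, not part of this module: withPinned is a hypothetical helper name, and passing an empty flag list is assumed to request the default allocation behaviour.

```haskell
import Control.Exception (bracket)
import Foreign.Storable (Storable)
import Foreign.CUDA.Ptr (HostPtr)
import qualified Foreign.CUDA.Runtime.Marshal as CUDA

-- Hypothetical helper: allocate n elements of page-locked host memory,
-- run the action, and release the memory even if the action throws.
-- The empty flag list is assumed to select the default behaviour.
withPinned :: Storable a => Int -> (HostPtr a -> IO b) -> IO b
withPinned n = bracket (CUDA.mallocHostArray [] n) CUDA.freeHost
```

As with any bracket-style helper, the HostPtr must not escape the action.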

freeHost :: HostPtr a -> IO ()

Free page-locked host memory previously allocated with mallocHostArray.

Device Allocation

mallocArray :: Storable a => Int -> IO (DevicePtr a)

Allocate a section of linear memory on the device, and return a reference to it. The memory is sufficient to hold the given number of elements of storable type. It is suitably aligned, and not cleared.

allocaArray :: Storable a => Int -> (DevicePtr a -> IO b) -> IO b

Execute a computation, passing a pointer to a temporarily allocated block of memory sufficient to hold the given number of elements of storable type. The memory is freed when the computation terminates (normally or via an exception), so the pointer must not be used after this.

Note that kernel launches can be asynchronous, so you may need to add a synchronisation point at the end of the computation.
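The synchronisation point mentioned above could be added with a small wrapper. This is a sketch under assumptions: withSynced is a hypothetical name, and sync is assumed to be the device-wide synchronisation function exported by Foreign.CUDA.Runtime.Device.

```haskell
import Foreign.Storable (Storable)
import Foreign.CUDA.Ptr (DevicePtr)
import Foreign.CUDA.Runtime.Device (sync)   -- assumed: blocks until the device is idle
import qualified Foreign.CUDA.Runtime.Marshal as CUDA

-- Hypothetical helper: like allocaArray, but waits for outstanding
-- asynchronous work to finish before the temporary memory is freed.
withSynced :: Storable a => Int -> (DevicePtr a -> IO b) -> IO b
withSynced n action =
  CUDA.allocaArray n $ \dev -> do
    r <- action dev
    sync            -- the memory is released only after this returns
    return r
```

If the action throws, the sync is skipped and the memory is still freed by allocaArray.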

free :: DevicePtr a -> IO ()

Free previously allocated memory on the device.

Unified Memory Allocation

data AttachFlag

Options for unified memory allocations

Constructors

Global
Host
Single

Instances

Bounded AttachFlag
Methods minBound :: AttachFlag; maxBound :: AttachFlag
Enum AttachFlag
Methods succ :: AttachFlag -> AttachFlag; pred :: AttachFlag -> AttachFlag; toEnum :: Int -> AttachFlag; fromEnum :: AttachFlag -> Int; enumFrom :: AttachFlag -> [AttachFlag]; enumFromThen :: AttachFlag -> AttachFlag -> [AttachFlag]; enumFromTo :: AttachFlag -> AttachFlag -> [AttachFlag]; enumFromThenTo :: AttachFlag -> AttachFlag -> AttachFlag -> [AttachFlag]
Eq AttachFlag
Methods (==) :: AttachFlag -> AttachFlag -> Bool; (/=) :: AttachFlag -> AttachFlag -> Bool
Show AttachFlag
Methods showsPrec :: Int -> AttachFlag -> ShowS; show :: AttachFlag -> String; showList :: [AttachFlag] -> ShowS

mallocManagedArray :: Storable a => [AttachFlag] -> Int -> IO (DevicePtr a)

Allocates memory that will be automatically managed by the Unified Memory system.
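As a rough illustration of managed memory, the sketch below clears a managed array on the device and then reads it directly from the host. It assumes a device and CUDA toolkit with Unified Memory support, that useDevicePtr (from Foreign.CUDA.Ptr) exposes the raw pointer, and that sync (from Foreign.CUDA.Runtime.Device) waits for the device to finish; managedZeros is a hypothetical name.

```haskell
import Control.Exception (bracket)
import Data.Int (Int32)
import Foreign.Marshal.Array (peekArray)      -- host-side peek
import Foreign.Storable (sizeOf)
import Foreign.CUDA.Ptr (useDevicePtr)
import Foreign.CUDA.Runtime.Device (sync)
import qualified Foreign.CUDA.Runtime.Marshal as CUDA

-- Hypothetical example: allocate n managed Int32 values, zero them on
-- the device, then read them from the host through the same pointer.
managedZeros :: Int -> IO [Int32]
managedZeros n =
  bracket (CUDA.mallocManagedArray [CUDA.Global] n) CUDA.free $ \ptr -> do
    CUDA.memset ptr (fromIntegral (n * sizeOf (undefined :: Int32))) 0
    sync                               -- finish device work before host access
    peekArray n (useDevicePtr ptr)     -- direct host read of managed memory
```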

Marshalling

peekArray :: Storable a => Int -> DevicePtr a -> Ptr a -> IO ()

Copy a number of elements from the device to host memory. This is a synchronous operation.

peekArrayAsync :: Storable a => Int -> DevicePtr a -> HostPtr a -> Maybe Stream -> IO ()

Copy memory from the device asynchronously, possibly associated with a particular stream. The destination memory must be page-locked.

peekArray2D

Arguments

:: Storable a
=> Int	width to copy (elements)
-> Int	height to copy (elements)
-> DevicePtr a	source array
-> Int	source array width
-> Ptr a	destination array
-> Int	destination array width
-> IO ()

Copy a 2D memory area from the device to the host. This is a synchronous operation.

peekArray2DAsync

Arguments

:: Storable a
=> Int	width to copy (elements)
-> Int	height to copy (elements)
-> DevicePtr a	source array
-> Int	source array width
-> HostPtr a	destination array
-> Int	destination array width
-> Maybe Stream
-> IO ()

Copy a 2D memory area from the device to the host asynchronously, possibly associated with a particular stream. The destination array must be page-locked.

peekListArray :: Storable a => Int -> DevicePtr a -> IO [a]

Copy a number of elements from the device into a new Haskell list. Note that this requires two memory copies: first from the device into a heap-allocated array, and then from that array into a list.

pokeArray :: Storable a => Int -> Ptr a -> DevicePtr a -> IO ()

Copy a number of elements onto the device. This is a synchronous operation.

pokeArrayAsync :: Storable a => Int -> HostPtr a -> DevicePtr a -> Maybe Stream -> IO ()

Copy memory onto the device asynchronously, possibly associated with a particular stream. The source memory must be page-locked.
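The asynchronous copies are typically used with page-locked staging buffers and an explicit stream. The sketch below sends a list to the device and back this way. It assumes that Foreign.CUDA.Runtime.Stream exports create, destroy and block, and that Foreign.CUDA.Ptr exports withHostPtr; asyncRoundTrip is a hypothetical name.

```haskell
import Control.Exception (bracket)
import Foreign.Marshal.Array (peekArray, pokeArray)   -- host-side only
import Foreign.CUDA.Ptr (withHostPtr)
import qualified Foreign.CUDA.Runtime.Marshal as CUDA
import qualified Foreign.CUDA.Runtime.Stream as Stream

-- Hypothetical example: host -> device -> host through pinned buffers,
-- with both copies queued on the same stream.
asyncRoundTrip :: [Float] -> IO [Float]
asyncRoundTrip xs =
  let n = length xs in
  bracket (CUDA.mallocHostArray [] n) CUDA.freeHost $ \src ->
  bracket (CUDA.mallocHostArray [] n) CUDA.freeHost $ \dst ->
  CUDA.allocaArray n $ \dev ->
  bracket Stream.create Stream.destroy $ \st -> do
    withHostPtr src $ \p -> pokeArray p xs       -- fill the staging buffer
    CUDA.pokeArrayAsync n src dev (Just st)      -- queued, returns immediately
    CUDA.peekArrayAsync n dev dst (Just st)      -- ordered after the upload
    Stream.block st                              -- wait for both copies
    withHostPtr dst $ \p -> peekArray n p
```

Operations queued on one stream run in order, so the download sees the uploaded data; the host must still wait on the stream before reading the destination buffer.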

pokeArray2D

Arguments

:: Storable a
=> Int	width to copy (elements)
-> Int	height to copy (elements)
-> Ptr a	source array
-> Int	source array width
-> DevicePtr a	destination array
-> Int	destination array width
-> IO ()

Copy a 2D memory area onto the device. This is a synchronous operation.
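To illustrate how the copy extent and the array widths interact, the sketch below copies the two leftmost columns of a 3-row, 4-column host matrix into a tightly packed device array. It assumes row-major storage and that each "array width" argument acts as the row stride in elements; leftColumns is a hypothetical name.

```haskell
import Foreign.Marshal.Array (withArray)
import qualified Foreign.CUDA.Runtime.Marshal as CUDA

-- Hypothetical example: the host matrix holds 0..11 in row-major order
-- (3 rows, 4 columns). Only a 2-wide, 3-high region is transferred.
leftColumns :: IO [Float]
leftColumns =
  withArray [0 .. 11] $ \host ->
  CUDA.allocaArray (2 * 3) $ \dev -> do
    -- copy width 2, height 3; source rows are 4 elements apart,
    -- destination rows are 2 elements apart
    CUDA.pokeArray2D 2 3 host 4 dev 2
    CUDA.peekListArray (2 * 3) dev
```

Under these assumptions the result contains the first two elements of each source row, in row order.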

pokeArray2DAsync

Arguments

:: Storable a
=> Int	width to copy (elements)
-> Int	height to copy (elements)
-> HostPtr a	source array
-> Int	source array width
-> DevicePtr a	destination array
-> Int	destination array width
-> Maybe Stream
-> IO ()

Copy a 2D memory area onto the device asynchronously, possibly associated with a particular stream. The source array must be page-locked.

pokeListArray :: Storable a => [a] -> DevicePtr a -> IO ()

Write a list of storable elements into a device array. The array must be sufficiently large to hold the entire list. This requires two marshalling operations.
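A minimal round trip through an explicitly allocated device array might look as follows. The function name listRoundTrip is hypothetical; everything else comes from this module.

```haskell
import Control.Exception (bracket)
import qualified Foreign.CUDA.Runtime.Marshal as CUDA

-- Hypothetical example: allocate a device array large enough for the
-- list, write the list into it, read it back, and free the array.
listRoundTrip :: [Double] -> IO [Double]
listRoundTrip xs = do
  let n = length xs
  bracket (CUDA.mallocArray n) CUDA.free $ \dev -> do
    CUDA.pokeListArray xs dev      -- list -> host array -> device
    CUDA.peekListArray n dev       -- device -> host array -> list
```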

copyArray :: Storable a => Int -> DevicePtr a -> DevicePtr a -> IO ()

Copy the given number of elements from the first device array (source) to the second (destination). The copied areas may not overlap. This operation is asynchronous with respect to the host, but will not overlap other device operations.
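The sketch below shows a device-to-device copy between two separate allocations, using only functions from this module; duplicate is a hypothetical name. The later synchronous read is assumed to be ordered after the copy because both are issued without an explicit stream.

```haskell
import qualified Foreign.CUDA.Runtime.Marshal as CUDA

-- Hypothetical example: upload a list, copy it into a second device
-- array, and read the copy back. The two arrays do not overlap.
duplicate :: [Float] -> IO [Float]
duplicate xs =
  CUDA.withListArrayLen xs $ \n src ->
  CUDA.allocaArray n $ \dst -> do
    CUDA.copyArray n src dst       -- source first, destination second
    CUDA.peekListArray n dst       -- synchronous copy back to the host
```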

copyArrayAsync :: Storable a => Int -> DevicePtr a -> DevicePtr a -> Maybe Stream -> IO ()

Copy the given number of elements from the first device array (source) to the second (destination). The copied areas may not overlap. This operation is asynchronous with respect to the host, and may be associated with a particular stream.

copyArray2D

Arguments

:: Storable a
=> Int	width to copy (elements)
-> Int	height to copy (elements)
-> DevicePtr a	source array
-> Int	source array width
-> DevicePtr a	destination array
-> Int	destination array width
-> IO ()

Copy a 2D memory area from the first device array (source) to the second (destination). The copied areas may not overlap. This operation is asynchronous with respect to the host, but will not overlap other device operations.

copyArray2DAsync

Arguments

:: Storable a
=> Int	width to copy (elements)
-> Int	height to copy (elements)
-> DevicePtr a	source array
-> Int	source array width
-> DevicePtr a	destination array
-> Int	destination array width
-> Maybe Stream
-> IO ()

Copy a 2D memory area from the first device array (source) to the second device array (destination). The copied areas may not overlap. This operation is asynchronous with respect to the host, and may be associated with a particular stream.

Combined Allocation and Marshalling

newListArray :: Storable a => [a] -> IO (DevicePtr a)

Write a list of storable elements into a newly allocated device array. This is newListArrayLen composed with fst.

newListArrayLen :: Storable a => [a] -> IO (DevicePtr a, Int)

Write a list of storable elements into a newly allocated device array, returning the device pointer together with the number of elements that were written. Note that this requires two copy operations: firstly from a Haskell list into a heap-allocated array, and from there into device memory. The array should be freed when no longer required.
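Because the array returned by newListArrayLen is not freed automatically, one option is to wrap the allocation in a bracket. A minimal sketch, with uploadAndReadBack as a hypothetical name:

```haskell
import Control.Exception (bracket)
import qualified Foreign.CUDA.Runtime.Marshal as CUDA

-- Hypothetical example: upload a list, use the returned length to read
-- the data back, and free the device array afterwards.
uploadAndReadBack :: [Float] -> IO [Float]
uploadAndReadBack xs =
  bracket (CUDA.newListArrayLen xs) (CUDA.free . fst) $ \(dev, n) ->
    CUDA.peekListArray n dev
```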

withListArray :: Storable a => [a] -> (DevicePtr a -> IO b) -> IO b

Temporarily store a list of elements into a newly allocated device array. An IO action is applied to the array, the result of which is returned. Similar to newListArray, this requires two marshalling operations of the data.

As with allocaArray, the memory is freed once the action completes, so you should not return the pointer from the action, and you should make sure that any asynchronous operations (such as kernel execution) have completed before the action returns.

withListArrayLen :: Storable a => [a] -> (Int -> DevicePtr a -> IO b) -> IO b

A variant of withListArray which also supplies the number of elements in the array to the applied function.
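The supplied length is convenient when the action needs to size a further transfer. The sketch below reads back only the first half of the uploaded elements; firstHalf is a hypothetical name.

```haskell
import qualified Foreign.CUDA.Runtime.Marshal as CUDA

-- Hypothetical example: the list lives on the device only for the
-- duration of the action; the element count n is supplied by
-- withListArrayLen rather than recomputed.
firstHalf :: [Float] -> IO [Float]
firstHalf xs =
  CUDA.withListArrayLen xs $ \n dev ->
    CUDA.peekListArray (n `div` 2) dev
```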

Utility

memset

Arguments

:: DevicePtr a	The device memory
-> Int64	Number of bytes
-> Int8	Value to set for each byte
-> IO ()

Initialise device memory to a given 8-bit value.
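Since memset counts bytes rather than elements, the size has to be scaled by the element size. A minimal sketch, with zeros as a hypothetical name:

```haskell
import Data.Int (Int32)
import Foreign.Storable (sizeOf)
import qualified Foreign.CUDA.Runtime.Marshal as CUDA

-- Hypothetical example: clear n Int32 elements on the device and read
-- them back. The byte count is n times the size of one element.
zeros :: Int -> IO [Int32]
zeros n =
  CUDA.allocaArray n $ \dev -> do
    let bytes = fromIntegral (n * sizeOf (undefined :: Int32))
    CUDA.memset dev bytes 0
    CUDA.peekListArray n dev
```

Only byte patterns can be set this way: a value other than 0 fills every byte of each element, not each element as a whole.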