Copyright | [2009..2017] Trevor L. McDonell |
---|---|
License | BSD |
Safe Haskell | None |
Language | Haskell98 |
Memory management for CUDA devices
- data AllocFlag
- mallocHostArray :: Storable a => [AllocFlag] -> Int -> IO (HostPtr a)
- freeHost :: HostPtr a -> IO ()
- mallocArray :: Storable a => Int -> IO (DevicePtr a)
- allocaArray :: Storable a => Int -> (DevicePtr a -> IO b) -> IO b
- free :: DevicePtr a -> IO ()
- data AttachFlag
- mallocManagedArray :: Storable a => [AttachFlag] -> Int -> IO (DevicePtr a)
- peekArray :: Storable a => Int -> DevicePtr a -> Ptr a -> IO ()
- peekArrayAsync :: Storable a => Int -> DevicePtr a -> HostPtr a -> Maybe Stream -> IO ()
- peekArray2D :: Storable a => Int -> Int -> DevicePtr a -> Int -> Ptr a -> Int -> IO ()
- peekArray2DAsync :: Storable a => Int -> Int -> DevicePtr a -> Int -> HostPtr a -> Int -> Maybe Stream -> IO ()
- peekListArray :: Storable a => Int -> DevicePtr a -> IO [a]
- pokeArray :: Storable a => Int -> Ptr a -> DevicePtr a -> IO ()
- pokeArrayAsync :: Storable a => Int -> HostPtr a -> DevicePtr a -> Maybe Stream -> IO ()
- pokeArray2D :: Storable a => Int -> Int -> Ptr a -> Int -> DevicePtr a -> Int -> IO ()
- pokeArray2DAsync :: Storable a => Int -> Int -> HostPtr a -> Int -> DevicePtr a -> Int -> Maybe Stream -> IO ()
- pokeListArray :: Storable a => [a] -> DevicePtr a -> IO ()
- copyArray :: Storable a => Int -> DevicePtr a -> DevicePtr a -> IO ()
- copyArrayAsync :: Storable a => Int -> DevicePtr a -> DevicePtr a -> Maybe Stream -> IO ()
- copyArray2D :: Storable a => Int -> Int -> DevicePtr a -> Int -> DevicePtr a -> Int -> IO ()
- copyArray2DAsync :: Storable a => Int -> Int -> DevicePtr a -> Int -> DevicePtr a -> Int -> Maybe Stream -> IO ()
- newListArray :: Storable a => [a] -> IO (DevicePtr a)
- newListArrayLen :: Storable a => [a] -> IO (DevicePtr a, Int)
- withListArray :: Storable a => [a] -> (DevicePtr a -> IO b) -> IO b
- withListArrayLen :: Storable a => [a] -> (Int -> DevicePtr a -> IO b) -> IO b
- memset :: DevicePtr a -> Int64 -> Int8 -> IO ()
Host Allocation
data AllocFlag Source #
Options for host allocation
mallocHostArray :: Storable a => [AllocFlag] -> Int -> IO (HostPtr a) Source #
Allocate a section of linear memory on the host which is page-locked and directly accessible from the device. The storage is sufficient to hold the given number of elements of a storable type. The runtime system automatically accelerates calls to functions such as peekArrayAsync and pokeArrayAsync that refer to page-locked memory.
Note that page-locking reduces the amount of pageable memory available, so overall system performance may suffer. This is best used sparingly to allocate staging areas for data exchange.
freeHost :: HostPtr a -> IO () Source #
Free page-locked host memory previously allocated with mallocHostArray.
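As a minimal sketch of the staging-area pattern, the following pairs mallocHostArray with freeHost using bracket. It assumes this module is importable as Foreign.CUDA.Runtime.Marshal, that HostPtr and withHostPtr come from Foreign.CUDA.Ptr, and that an empty flag list requests default behaviour; withStaging is a hypothetical helper, not part of the library.

```haskell
import Control.Exception (bracket)
import Foreign.Storable (peekElemOff, pokeElemOff)
import Foreign.CUDA.Ptr (HostPtr, withHostPtr)          -- assumed module
import qualified Foreign.CUDA.Runtime.Marshal as CUDA   -- assumed module name

-- Hypothetical helper: run an action with a page-locked buffer of n Floats,
-- releasing the buffer afterwards even if the action throws.
withStaging :: Int -> (HostPtr Float -> IO b) -> IO b
withStaging n = bracket (CUDA.mallocHostArray [] n) CUDA.freeHost

main :: IO ()
main =
  withStaging 1024 $ \hptr ->
    withHostPtr hptr $ \p -> do
      -- page-locked memory is still ordinary host memory for reads and writes
      pokeElemOff p 0 (3.5 :: Float)
      x <- peekElemOff p 0
      print x
```

Keeping the allocation inside a bracket limits how long the pages stay locked, which fits the advice to use this memory sparingly.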
Device Allocation
mallocArray :: Storable a => Int -> IO (DevicePtr a) Source #
Allocate a section of linear memory on the device, and return a reference to it. The memory is sufficient to hold the given number of elements of a storable type. It is suitably aligned, and not cleared.
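A short sketch of manual allocation follows, assuming the module is importable as Foreign.CUDA.Runtime.Marshal. Because the memory is not cleared, the example writes to the buffer before reading it back, and uses bracket so that free always runs.

```haskell
import Control.Exception (bracket)
import Data.Int (Int32)
import qualified Foreign.CUDA.Runtime.Marshal as CUDA   -- assumed module name

main :: IO ()
main = do
  let xs = [7, 8, 9] :: [Int32]
      n  = length xs
  -- bracket guarantees that free runs even if the body throws
  ys <- bracket (CUDA.mallocArray n) CUDA.free $ \dptr -> do
    -- the fresh allocation is not cleared, so write before reading
    CUDA.pokeListArray xs dptr
    CUDA.peekListArray n dptr
  print ys
```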
allocaArray :: Storable a => Int -> (DevicePtr a -> IO b) -> IO b Source #
Execute a computation, passing a pointer to a temporarily allocated block of memory sufficient to hold the given number of elements of storable type. The memory is freed when the computation terminates (normally or via an exception), so the pointer must not be used after this.
Note that kernel launches can be asynchronous, so you may need to add a synchronisation point at the end of the computation.
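The scoping rule can be sketched as below. The example assumes the module names Foreign.CUDA.Runtime.Marshal and Foreign.CUDA.Ptr, and that Foreign.CUDA.Runtime.Device provides a sync action that blocks until outstanding device work has finished; process is a hypothetical stand-in for code that might launch a kernel.

```haskell
import Foreign.CUDA.Ptr (DevicePtr)                      -- assumed module
import qualified Foreign.CUDA.Runtime.Device as Device   -- assumed to export sync
import qualified Foreign.CUDA.Runtime.Marshal as CUDA    -- assumed module name

-- Hypothetical stand-in for work that may enqueue asynchronous operations
-- (for example a kernel launch) on the scratch buffer.
process :: Int -> DevicePtr Float -> IO ()
process n dptr = CUDA.pokeListArray (replicate n 0) dptr

main :: IO ()
main =
  CUDA.allocaArray 256 $ \scratch -> do
    process 256 scratch
    -- wait for outstanding device work before the buffer is freed on exit
    Device.sync
```

The pointer is deliberately not returned from the continuation, since it is invalid once allocaArray returns.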
Unified Memory Allocation
data AttachFlag Source #
Options for unified memory allocations
mallocManagedArray :: Storable a => [AttachFlag] -> Int -> IO (DevicePtr a) Source #
Allocates memory that will be automatically managed by the Unified Memory system.
Marshalling
peekArray :: Storable a => Int -> DevicePtr a -> Ptr a -> IO () Source #
Copy a number of elements from the device to host memory. This is a synchronous operation.
peekArrayAsync :: Storable a => Int -> DevicePtr a -> HostPtr a -> Maybe Stream -> IO () Source #
Copy memory from the device asynchronously, possibly associated with a particular stream. The destination memory must be page-locked.
peekArray2D Source #
:: Storable a | |
=> Int | width to copy (elements) |
-> Int | height to copy (elements) |
-> DevicePtr a | source array |
-> Int | source array width |
-> Ptr a | destination array |
-> Int | destination array width |
-> IO () |
Copy a 2D memory area from the device to the host. This is a synchronous operation.
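The sketch below illustrates how the arguments of peekArray2D fit together, assuming the module is importable as Foreign.CUDA.Runtime.Marshal and that both array widths are row lengths measured in elements, as the parameter descriptions suggest. It copies the first two columns of a 3-row, 4-column row-major device array into a tightly packed host array.

```haskell
import Data.Int (Int32)
import qualified Foreign.Marshal.Array as Host
import qualified Foreign.CUDA.Runtime.Marshal as CUDA   -- assumed module name

main :: IO ()
main = do
  let srcW = 4                      -- elements per row of the device array
      srcH = 3                      -- number of rows
      w    = 2                      -- columns to copy from each row
      xs   = [0 .. 11] :: [Int32]   -- row-major contents of the device array
  ys <- CUDA.withListArray xs $ \dptr ->
        Host.allocaArray (w * srcH) $ \hptr -> do
          -- the destination rows are packed, so its width equals w
          CUDA.peekArray2D w srcH dptr srcW hptr w
          Host.peekArray (w * srcH) hptr
  print ys
```

Under that reading of the width arguments, ys should hold elements 0, 1, 4, 5, 8 and 9 of the source, that is, the first two entries of each row.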
peekArray2DAsync Source #
:: Storable a | |
=> Int | width to copy (elements) |
-> Int | height to copy (elements) |
-> DevicePtr a | source array |
-> Int | source array width |
-> HostPtr a | destination array |
-> Int | destination array width |
-> Maybe Stream | |
-> IO () |
Copy a 2D memory area from the device to the host asynchronously, possibly associated with a particular stream. The destination array must be page-locked.
peekListArray :: Storable a => Int -> DevicePtr a -> IO [a] Source #
Copy a number of elements from the device into a new Haskell list. Note that this requires two memory copies: first from the device into a heap-allocated array, and then from there into a list.
pokeArray :: Storable a => Int -> Ptr a -> DevicePtr a -> IO () Source #
Copy a number of elements onto the device. This is a synchronous operation.
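A round trip through pokeArray and peekArray might look like the following sketch. It assumes the module is importable as Foreign.CUDA.Runtime.Marshal, and imports the host-side Foreign.Marshal.Array qualified because it exports functions with the same names.

```haskell
import Data.Int (Int32)
import qualified Foreign.Marshal.Array as Host
import qualified Foreign.CUDA.Runtime.Marshal as CUDA   -- assumed module name

main :: IO ()
main = do
  let xs = [10, 20, 30, 40] :: [Int32]
  ys <- Host.withArrayLen xs $ \n src ->     -- host source buffer
        CUDA.allocaArray n $ \dptr ->        -- temporary device buffer
        Host.allocaArray n $ \dst -> do      -- host destination buffer
          CUDA.pokeArray n src dptr          -- host -> device
          CUDA.peekArray n dptr dst          -- device -> host
          Host.peekArray n dst               -- host buffer -> list
  print (ys == xs)
```

Both copies are synchronous, so no explicit synchronisation point is needed before reading dst.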
pokeArrayAsync :: Storable a => Int -> HostPtr a -> DevicePtr a -> Maybe Stream -> IO () Source #
Copy memory onto the device asynchronously, possibly associated with a particular stream. The source memory must be page-locked.
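The asynchronous variants can be sketched as follows. The example assumes the module names Foreign.CUDA.Runtime.Marshal and Foreign.CUDA.Ptr, that Nothing selects the default stream, and that Foreign.CUDA.Runtime.Device exports a sync action that waits for all outstanding work; withPinned is a hypothetical helper.

```haskell
import Control.Exception (bracket)
import Data.Int (Int32)
import qualified Foreign.Marshal.Array as Host
import Foreign.CUDA.Ptr (HostPtr, withHostPtr)           -- assumed module
import qualified Foreign.CUDA.Runtime.Device as Device   -- assumed to export sync
import qualified Foreign.CUDA.Runtime.Marshal as CUDA    -- assumed module name

-- Hypothetical helper: a scoped page-locked host buffer of n elements.
withPinned :: Int -> (HostPtr Int32 -> IO b) -> IO b
withPinned n = bracket (CUDA.mallocHostArray [] n) CUDA.freeHost

main :: IO ()
main = do
  let xs = [1 .. 8] :: [Int32]
      n  = length xs
  ys <- withPinned n $ \src ->
        withPinned n $ \dst ->
        CUDA.allocaArray n $ \dptr -> do
          withHostPtr src $ \p -> Host.pokeArray p xs
          CUDA.pokeArrayAsync n src dptr Nothing   -- host -> device, returns early
          CUDA.peekArrayAsync n dptr dst Nothing   -- device -> host, same stream
          Device.sync                              -- wait before touching dst
          withHostPtr dst $ \p -> Host.peekArray n p
  print ys
```

Reading dst before the synchronisation point would race with the copy, which is why the final peek comes after sync.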
pokeArray2D Source #
:: Storable a | |
=> Int | width to copy (elements) |
-> Int | height to copy (elements) |
-> Ptr a | source array |
-> Int | source array width |
-> DevicePtr a | destination array |
-> Int | destination array width |
-> IO () |
Copy a 2D memory area onto the device. This is a synchronous operation.
pokeArray2DAsync Source #
:: Storable a | |
=> Int | width to copy (elements) |
-> Int | height to copy (elements) |
-> HostPtr a | source array |
-> Int | source array width |
-> DevicePtr a | destination array |
-> Int | destination array width |
-> Maybe Stream | |
-> IO () |
Copy a 2D memory area onto the device asynchronously, possibly associated with a particular stream. The source array must be page-locked.
pokeListArray :: Storable a => [a] -> DevicePtr a -> IO () Source #
Write a list of storable elements into a device array. The array must be sufficiently large to hold the entire list. This requires two marshalling operations: first from the list into a heap-allocated array, and then from there onto the device.
copyArray :: Storable a => Int -> DevicePtr a -> DevicePtr a -> IO () Source #
Copy the given number of elements from the first device array (source) to the second (destination). The copied areas may not overlap. This operation is asynchronous with respect to the host, but will not overlap other device operations.
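A device-to-device copy could be sketched like this, assuming the module is importable as Foreign.CUDA.Runtime.Marshal. The source and destination are separate allocations, so the non-overlap requirement holds trivially.

```haskell
import Data.Int (Int32)
import qualified Foreign.CUDA.Runtime.Marshal as CUDA   -- assumed module name

main :: IO ()
main = do
  let xs = [1, 2, 3, 4, 5] :: [Int32]
  ys <- CUDA.withListArrayLen xs $ \n src ->   -- source device array
        CUDA.allocaArray n $ \dst -> do        -- distinct destination array
          CUDA.copyArray n src dst             -- device -> device
          CUDA.peekListArray n dst             -- read the copy back to the host
  print (ys == xs)
```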
copyArrayAsync :: Storable a => Int -> DevicePtr a -> DevicePtr a -> Maybe Stream -> IO () Source #
Copy the given number of elements from the first device array (source) to the second (destination). The copied areas may not overlap. This operation is asynchronous with respect to the host, and may be associated with a particular stream.
copyArray2D Source #
:: Storable a | |
=> Int | width to copy (elements) |
-> Int | height to copy (elements) |
-> DevicePtr a | source array |
-> Int | source array width |
-> DevicePtr a | destination array |
-> Int | destination array width |
-> IO () |
Copy a 2D memory area from the first device array (source) to the second (destination). The copied areas may not overlap. This operation is asynchronous with respect to the host, but will not overlap other device operations.
copyArray2DAsync Source #
:: Storable a | |
=> Int | width to copy (elements) |
-> Int | height to copy (elements) |
-> DevicePtr a | source array |
-> Int | source array width |
-> DevicePtr a | destination array |
-> Int | destination array width |
-> Maybe Stream | |
-> IO () |
Copy a 2D memory area from the first device array (source) to the second device array (destination). The copied areas may not overlap. This operation is asynchronous with respect to the host, and may be associated with a particular stream.
Combined Allocation and Marshalling
newListArray :: Storable a => [a] -> IO (DevicePtr a) Source #
Write a list of storable elements into a newly allocated device array. This is newListArrayLen composed with fst.
newListArrayLen :: Storable a => [a] -> IO (DevicePtr a, Int) Source #
Write a list of storable elements into a newly allocated device array, returning the device pointer together with the number of elements that were written. Note that this requires two copy operations: first from a Haskell list into a heap-allocated array, and then from there into device memory. The array should be released with free when no longer required.
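Since the caller owns the array returned by newListArrayLen, one way to make sure it is released is to wrap the allocation in bracket, as in this sketch. It assumes the module is importable as Foreign.CUDA.Runtime.Marshal.

```haskell
import Control.Exception (bracket)
import qualified Foreign.CUDA.Runtime.Marshal as CUDA   -- assumed module name

main :: IO ()
main = do
  total <- bracket
    (CUDA.newListArrayLen [1.5, 2.5, 3.0 :: Double])    -- allocate and upload
    (\(dptr, _) -> CUDA.free dptr)                      -- always release
    (\(dptr, n) -> sum <$> CUDA.peekListArray n dptr)   -- use the returned length
  print total
```

When the array does not need to outlive a single action, withListArrayLen provides the same scoping without the explicit free.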
withListArray :: Storable a => [a] -> (DevicePtr a -> IO b) -> IO b Source #
Temporarily store a list of elements into a newly allocated device array. An IO action is applied to the array, the result of which is returned. Similar to newListArray, this requires two marshalling operations of the data.
As with allocaArray, the memory is freed once the action completes, so you should not return the pointer from the action, and be sure that any asynchronous operations (such as kernel execution) have completed.
withListArrayLen :: Storable a => [a] -> (Int -> DevicePtr a -> IO b) -> IO b Source #
A variant of withListArray which also supplies the number of elements in the array to the applied function.