Portability | non-portable -- posix only |
---|---|
Stability | provisional |
Maintainer | Don Stewart <dons@galois.com> |
Safe Haskell | Safe-Inferred |
Lazy, chunk-wise memory mapping.
Memory map a file as a lazy ByteString. Finalisers are associated with cache-sized portions of the file, which are deallocated as those chunks go out of scope.
Unlike the strict ByteString version, unsafeMMapFile for lazy ByteStrings deallocates the file chunk by chunk.
The storage manager is used to free chunks of the mapped memory. When the garbage collector notices there are no further references to a chunk, a call to munmap is made.
In effect, the file is mmapped once, lazily, then covered with finalizers for each chunk. When any chunk goes out of scope, that part is deallocated. We must allocate the spine of the structure strictly though, to ensure finalizers are registered for the entire file.
The Haskell garbage collector decides when to run based on heap pressure. However, mmap stores memory outside the Haskell heap, so those resources are not counted when deciding whether to run the garbage collector. The result is that finalizers run less often than you might expect, and it is possible to write a lazy ByteString mmap program that never deallocates (and thus doesn't run in constant space). performGC or finalizeForeignPtr can be used to trigger collection at sensible points.
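As a rough illustration of triggering collection at sensible points, the sketch below walks the chunks of a mapped file and requests a major collection after every few chunks. The helper name `countBytesWithGC` and the interval of 16 chunks are assumptions made for this example, not part of the library, and a suitable interval depends on the workload.

```haskell
import qualified Data.ByteString as S
import qualified Data.ByteString.Lazy as L
import Control.Monad (when)
import System.Environment (getArgs)
import System.Mem (performGC)
import System.IO.Posix.MMap.Lazy (unsafeMMapFile)

-- | Hypothetical helper: count bytes chunk by chunk, requesting a major
-- GC after every 'every' chunks so that the finalizers of chunks that are
-- no longer referenced get a chance to run.
countBytesWithGC :: Int -> L.ByteString -> IO Int
countBytesWithGC every = go 0 0 . L.toChunks
  where
    step = max 1 every  -- guard against a zero or negative interval

    go :: Int -> Int -> [S.ByteString] -> IO Int
    go _ total []     = return total
    go i total (c:cs) = do
      let total' = total + S.length c
          i'     = i + 1
      when (i' `mod` step == 0) performGC
      total' `seq` go i' total' cs

main :: IO ()
main = do
  [path] <- getArgs
  -- The mapped ByteString is passed straight to the consumer, so no
  -- reference to its head is kept alive while it is traversed.
  bytes <- unsafeMMapFile path >>= countBytesWithGC 16
  print bytes
```

performGC is comparatively expensive, so calling it after every chunk is rarely worthwhile; a coarser interval trades some extra mapped memory for fewer collections.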
Note: this operation may break referential transparency! If any other process on the system changes the file while it is mapped into Haskell, the contents of your ByteString will change.
- unsafeMMapFile :: FilePath -> IO ByteString
Documentation
unsafeMMapFile :: FilePath -> IO ByteString
The unsafeMMapFile function maps a file or device into memory as a lazy ByteString, made up of chunks of 64*pagesize bytes, each of which can be unmapped independently.
Memory mapped files will behave as if they were read lazily -- pages from the file will be loaded into memory on demand.
The storage manager is used to free chunks that go out of scope, and unlike strict bytestrings, memory mapped lazy ByteStrings will be deallocated in chunks (so you can write traversals that run in constant space).
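A traversal of this kind might look like the following sketch, which counts newline bytes with a strict fold over the chunks. The name `countNewlines` is hypothetical. Whether the program actually runs in constant space still depends on no other reference to the start of the ByteString being retained and on the garbage collector running often enough to fire the finalizers.

```haskell
import qualified Data.ByteString as S
import qualified Data.ByteString.Lazy as L
import System.Environment (getArgs)
import System.IO.Posix.MMap.Lazy (unsafeMMapFile)

-- | Hypothetical helper: count newline bytes (ASCII 10) one chunk at a
-- time. foldlChunks is strict in its accumulator, so the fold itself
-- does not build up thunks, and each chunk can be dropped once counted.
countNewlines :: L.ByteString -> Int
countNewlines = L.foldlChunks (\n c -> n + S.count 10 c) 0

main :: IO ()
main = do
  [path] <- getArgs
  contents <- unsafeMMapFile path
  -- 'contents' is consumed exactly once and not used afterwards.
  print (countNewlines contents)
```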
However, the size of the mmapped resource is not known to the Haskell GC; it appears only as a small ForeignPtr. This means that the Haskell GC may not run as often as you'd like, leading to delays in unmapping chunks.
Appropriate use of performGC or finalizeForeignPtr may be required to ensure deallocation, as resources allocated by mmap are not tracked by the Haskell garbage collector.
For example, when writing out a lazy ByteString allocated with mmap, you may wish to call finalizeForeignPtr on each chunk as soon as it has been written and is no longer needed, rather than relying on the garbage collector to notice that the chunk has gone.
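One way this could look is sketched below. The helper `hPutAndRelease` is hypothetical, and it assumes that each chunk is backed by its own ForeignPtr carrying the unmapping finalizer, reached here through `toForeignPtr` from Data.ByteString.Internal. After a chunk has been finalized its memory is unmapped, so the ByteString must not be touched again; the helper is only meant for ByteStrings obtained from unsafeMMapFile.

```haskell
import qualified Data.ByteString as S
import qualified Data.ByteString.Internal as SI
import qualified Data.ByteString.Lazy as L
import Foreign.ForeignPtr (finalizeForeignPtr)
import System.Environment (getArgs)
import System.IO (Handle, stdout)
import System.IO.Posix.MMap.Lazy (unsafeMMapFile)

-- | Hypothetical helper: write each chunk to the handle, then run its
-- finalizer immediately instead of waiting for the garbage collector.
-- Assumption: every chunk owns a ForeignPtr whose finalizer unmaps it.
-- Any later access to a released chunk is undefined behaviour.
hPutAndRelease :: Handle -> L.ByteString -> IO ()
hPutAndRelease h = mapM_ writeChunk . L.toChunks
  where
    writeChunk :: S.ByteString -> IO ()
    writeChunk c = do
      S.hPut h c
      let (fp, _, _) = SI.toForeignPtr c
      finalizeForeignPtr fp

main :: IO ()
main = do
  [path] <- getArgs
  -- The mapped ByteString is consumed once and never referenced again.
  unsafeMMapFile path >>= hPutAndRelease stdout
```

The trade-off is safety for promptness: explicit finalization frees each chunk right away, but correctness then rests on the caller guaranteeing that nothing else still refers to the mapped data.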
This operation is unsafe: if the file is written to by any other process on the system, the ByteString contents will change in Haskell.