# streamly-lmdb
[![Hackage](https://img.shields.io/hackage/v/streamly-lmdb.svg?style=flat)](https://hackage.haskell.org/package/streamly-lmdb)
![CI](https://github.com/shlok/streamly-lmdb/workflows/CI/badge.svg?branch=master)
Stream data to or from LMDB databases using the Haskell [streamly](https://hackage.haskell.org/package/streamly) library.
## Requirements
Install LMDB on your system:
* Debian Linux: `sudo apt-get install liblmdb-dev`.
* macOS: `brew install lmdb`.
## Quick start
```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where
import Data.Function ((&))
import qualified Streamly.Data.Fold as F
import qualified Streamly.Data.Stream.Prelude as S
import Streamly.External.LMDB
( Limits (mapSize),
WriteOptions (writeTransactionSize),
defaultLimits,
defaultReadOptions,
defaultWriteOptions,
getDatabase,
openEnvironment,
readLMDB,
tebibyte,
writeLMDB,
)
main :: IO ()
main = do
-- Open an environment. There should already exist a file or
-- directory at the given path. (Empty for a new environment.)
env <-
openEnvironment "/path/to/lmdb-database" $
defaultLimits {mapSize = tebibyte}
-- Get the main database.
-- Note: It is common practice with LMDB to create the database
-- once and reuse it for the remainder of the program’s execution.
db <- getDatabase env Nothing
-- Stream key-value pairs into the database.
let fold' = writeLMDB db defaultWriteOptions {writeTransactionSize = 1}
let writeStream = S.fromList [("baz", "a"), ("foo", "b"), ("bar", "c")]
_ <- S.fold fold' writeStream
-- Stream key-value pairs out of the
-- database, printing them along the way.
-- Output:
-- ("bar","c")
-- ("baz","a")
-- ("foo","b")
let unfold' = readLMDB db Nothing defaultReadOptions
let readStream = S.unfold unfold' undefined
S.mapM print readStream
& S.fold F.drain
```
## Benchmarks
See `bench/README.md`. Summary (with rough figures from our machine†):
* Reading:
- For iterating through a fully cached LMDB database, this library has roughly a 110 ns/pair overhead compared to C. (Plain Haskell `IO` code has roughly a 70 ns/pair overhead compared to C. The two preceding figures being similar fulfills the promise of [streamly](https://hackage.haskell.org/package/streamly) and stream fusion.)
- By using `unsafeReadLMDB` instead of `readLMDB`, we can get the overhead down to roughly 100 ns/pair.
- By additionally using the `readUnsafeFFI` option (to use `unsafe` FFI calls under the hood), we can get the overhead down to roughly 40 ns/pair.
* Writing:
- For writing to an LMDB database, this library has roughly a 210 ns/pair overhead compared to C. (Plain Haskell `IO` code has roughly a 100 ns/pair overhead compared to C. The two preceding figures being similar fulfills the promise of [streamly](https://hackage.haskell.org/package/streamly) and stream fusion.)
- By using the `writeUnsafeFFI` option (to use `unsafe` FFI calls under the hood), we can get the overhead down to roughly 140 ns/pair.
* For most Haskell programs, these differences will not cause problems. (For instance, note that merely opening and reading 1 byte from a file with C already takes us tens of *microseconds*.)
† May 2023; [Linode](https://linode.com); Debian 11, Dedicated 32GB: 16 CPU, 640GB SSD storage, 32GB RAM.