Safe Haskell	Safe-Inferred
Language	Haskell2010

Streamly.External.LMDB

Contents

Environment
- Mode
- Limits
Database
Reading
- Read-only transactions and cursors
- Read options
Writing
Error types

Description

Acknowledgments

The functionality for the limits and getting the environment and database, in particular the idea of specifying the read-only or read-write mode at the type level, was mostly obtained from the lmdb-simple library.

Synopsis

data Environment mode
openEnvironment :: Mode mode => FilePath -> Limits -> IO (Environment mode)
isReadOnlyEnvironment :: Mode mode => Environment mode -> Bool
closeEnvironment :: Mode mode => Environment mode -> IO ()
class Mode a
data ReadWrite
data ReadOnly
data Limits = Limits {
- mapSize :: !Int
- maxDatabases :: !Int
- maxReaders :: !Int
}
defaultLimits :: Limits
gibibyte :: Int
tebibyte :: Int
data Database mode
getDatabase :: Mode mode => Environment mode -> Maybe String -> IO (Database mode)
clearDatabase :: Mode mode => Database mode -> IO ()
closeDatabase :: Mode mode => Database mode -> IO ()
readLMDB :: (MonadIO m, Mode mode) => Database mode -> Maybe (ReadOnlyTxn, Cursor) -> ReadOptions -> Unfold m Void (ByteString, ByteString)
unsafeReadLMDB :: (MonadIO m, Mode mode) => Database mode -> Maybe (ReadOnlyTxn, Cursor) -> ReadOptions -> (CStringLen -> IO k) -> (CStringLen -> IO v) -> Unfold m Void (k, v)
data ReadOnlyTxn
beginReadOnlyTxn :: Environment mode -> IO ReadOnlyTxn
abortReadOnlyTxn :: ReadOnlyTxn -> IO ()
data Cursor
openCursor :: ReadOnlyTxn -> Database mode -> IO Cursor
closeCursor :: Cursor -> IO ()
data ReadOptions = ReadOptions {
- readDirection :: !ReadDirection
- readStart :: !(Maybe ByteString)
- readUnsafeFFI :: !Bool
}
defaultReadOptions :: ReadOptions
data ReadDirection
- = Forward
- | Backward
writeLMDB :: MonadIO m => Database ReadWrite -> WriteOptions -> Fold m (ByteString, ByteString) ()
data WriteOptions = WriteOptions {
- writeTransactionSize :: !Int
- writeOverwriteOptions :: !OverwriteOptions
- writeAppend :: !Bool
- writeUnsafeFFI :: !Bool
}
defaultWriteOptions :: WriteOptions
data OverwriteOptions
- = OverwriteAllow
- | OverwriteAllowSame
- | OverwriteDisallow
data LMDB_Error = LMDB_Error {
- e_context :: String
- e_description :: String
- e_code :: Either Int MDB_ErrCode
}
data MDB_ErrCode
- = MDB_KEYEXIST
- | MDB_NOTFOUND
- | MDB_PAGE_NOTFOUND
- | MDB_CORRUPTED
- | MDB_PANIC
- | MDB_VERSION_MISMATCH
- | MDB_INVALID
- | MDB_MAP_FULL
- | MDB_DBS_FULL
- | MDB_READERS_FULL
- | MDB_TLS_FULL
- | MDB_TXN_FULL
- | MDB_CURSOR_FULL
- | MDB_PAGE_FULL
- | MDB_MAP_RESIZED
- | MDB_INCOMPATIBLE
- | MDB_BAD_RSLOT
- | MDB_BAD_TXN
- | MDB_BAD_VALSIZE
- | MDB_BAD_DBI

Environment

With LMDB, one first creates a so-called “environment,” which one can think of as a file or folder on disk.

data Environment mode

openEnvironment :: Mode mode => FilePath -> Limits -> IO (Environment mode)

Opens an LMDB environment in either ReadWrite or ReadOnly mode. The FilePath argument may be either a directory or a regular file, but it must already exist. If it is a regular file, an additional file with "-lock" appended to the name is used for the reader lock table.

Note that an environment must have been opened in ReadWrite mode at least once before it can be opened in ReadOnly mode.

An environment opened in ReadOnly mode may still modify the reader lock table (except when the filesystem is read-only, in which case no locks are used).
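As a minimal sketch of the above, the following opens an environment in ReadWrite mode by fixing the mode with a type annotation. The path /tmp/example-lmdb is hypothetical, and creating the directory beforehand with createDirectoryIfMissing (from the directory package) is just one way to satisfy the requirement that the path already exists.

```haskell
import Streamly.External.LMDB
  ( Environment
  , Limits (..)
  , ReadWrite
  , defaultLimits
  , gibibyte
  , isReadOnlyEnvironment
  , openEnvironment
  )
import System.Directory (createDirectoryIfMissing)

main :: IO ()
main = do
  -- Hypothetical location; the path must exist before openEnvironment is called.
  let path = "/tmp/example-lmdb"
  createDirectoryIfMissing True path

  -- The type annotation selects the mode. ReadWrite is used here because an
  -- environment has to be opened in ReadWrite mode at least once.
  env <-
    openEnvironment path defaultLimits {mapSize = gibibyte} ::
      IO (Environment ReadWrite)

  -- Reports whether the environment was opened in ReadOnly mode.
  print (isReadOnlyEnvironment env)
```

Changing the annotation to IO (Environment ReadOnly) would request read-only mode instead, which only works once the environment has been opened in ReadWrite mode before.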

isReadOnlyEnvironment :: Mode mode => Environment mode -> Bool

closeEnvironment :: Mode mode => Environment mode -> IO ()

Closes the given environment.

If you have at most a few dozen environments, there should be no need for this. (It is common practice with LMDB to create one’s environments once and reuse them for the remainder of the program’s execution.) If you do find yourself needing this, it is your responsibility to heed LMDB’s documented caveats for closing an environment.

In particular, you will probably, before calling this function, want to (a) use closeDatabase, and (b) pass in precreated transactions and cursors to readLMDB and unsafeReadLMDB to make sure there are no transactions or cursors still left to be cleaned up by the garbage collector. (As an alternative to (b), one could try manually triggering the garbage collector.)

Mode

class Mode a

Minimal complete definition

isReadOnlyMode

Instances

Mode ReadOnly
Defined in Streamly.External.LMDB.Internal
Methods
- isReadOnlyMode :: ReadOnly -> Bool

Mode ReadWrite
Defined in Streamly.External.LMDB.Internal
Methods
- isReadOnlyMode :: ReadWrite -> Bool

data ReadWrite

Instances

Mode ReadWrite
Defined in Streamly.External.LMDB.Internal
Methods
- isReadOnlyMode :: ReadWrite -> Bool

data ReadOnly

Instances

Mode ReadOnly
Defined in Streamly.External.LMDB.Internal
Methods
- isReadOnlyMode :: ReadOnly -> Bool

Limits

data Limits

LMDB environments have various limits on the size and number of databases and concurrent readers.

Constructors

Limits

Fields

mapSize :: !Int
Memory map size, in bytes (also the maximum size of all databases).
maxDatabases :: !Int
Maximum number of named databases.
maxReaders :: !Int
Maximum number of concurrent ReadOnly transactions (also the number of slots in the lock table).

defaultLimits :: Limits

The default limits are 1 MiB map size, 0 named databases, and 126 concurrent readers. These can be adjusted freely, and in particular the mapSize may be set very large (limited only by available address space). However, LMDB is not optimized for a large number of named databases, so maxDatabases should be kept to a minimum.

The default mapSize is intentionally small, and should be changed to something appropriate for your application. It ought to be a multiple of the OS page size, and should be chosen as large as possible to accommodate future growth of the database(s). Once set for an environment, this limit cannot be reduced to a value smaller than the space already consumed by the environment; it can, however, be increased later.

If you are going to use any named databases then you will need to change maxDatabases to the number of named databases you plan to use. However, you do not need to change this field if you are only going to use the single main (unnamed) database.
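To illustrate the advice that mapSize ought to be a multiple of the OS page size, here is a small sketch that rounds a requested size up to the next page multiple. The helper roundUpToPage is hypothetical (not part of this library), and the 4096-byte page size is an assumption; query the real value on your platform.

```haskell
-- | Round a requested size in bytes up to the next multiple of the page size.
-- Hypothetical helper for illustration; not part of Streamly.External.LMDB.
roundUpToPage :: Int -> Int -> Int
roundUpToPage pageSize bytes
  | pageSize <= 0 = error "roundUpToPage: page size must be positive"
  | otherwise = ((bytes + pageSize - 1) `div` pageSize) * pageSize

main :: IO ()
main = do
  let pageSize = 4096 :: Int -- assumed page size
      gib = 1024 * 1024 * 1024 :: Int
  -- One byte more than 10 GiB is rounded up to the next page boundary.
  print (roundUpToPage pageSize (10 * gib + 1))
```

The result could then be used as the mapSize field, e.g. defaultLimits { mapSize = roundUpToPage 4096 requested }.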

gibibyte :: Int

A convenience constant for obtaining a 1 GiB map size.

tebibyte :: Int

A convenience constant for obtaining a 1 TiB map size.

Database

After creating an environment, one creates within it one or more databases.

data Database mode

getDatabase :: Mode mode => Environment mode -> Maybe String -> IO (Database mode)

Gets a database with the given name. When creating a database (i.e., getting it for the first time), one must do so in ReadWrite mode.

If only one database is desired within the environment, the name can be Nothing (known as the “unnamed database”).

If one or more named databases (a database with a Just name) are desired, the maxDatabases of the environment’s limits should have been adjusted accordingly. The unnamed database will in this case contain the names of the named databases as keys, which one is allowed to read but not write.
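A sketch of obtaining two named databases follows. The database names "users" and "orders" and the path are hypothetical; the point is only that maxDatabases is raised to match the number of named databases before the environment is opened.

```haskell
import Streamly.External.LMDB
  ( Database
  , Environment
  , Limits (..)
  , ReadWrite
  , defaultLimits
  , getDatabase
  , gibibyte
  , openEnvironment
  )
import System.Directory (createDirectoryIfMissing)

main :: IO ()
main = do
  let path = "/tmp/example-lmdb-named" -- hypothetical location
  createDirectoryIfMissing True path

  -- Two named databases are planned, so maxDatabases is set to 2.
  env <-
    openEnvironment path defaultLimits {mapSize = gibibyte, maxDatabases = 2} ::
      IO (Environment ReadWrite)

  -- Getting a database for the first time creates it, which needs ReadWrite mode.
  _users <- getDatabase env (Just "users") :: IO (Database ReadWrite)
  _orders <- getDatabase env (Just "orders") :: IO (Database ReadWrite)

  putStrLn "obtained two named databases"
```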

clearDatabase :: Mode mode => Database mode -> IO ()

Clears, i.e., removes all key-value pairs from, the given database.

closeDatabase :: Mode mode => Database mode -> IO ()

Closes the given database.

If you have at most a few dozen databases, there should be no need for this. (It is common practice with LMDB to create one’s databases once and reuse them for the remainder of the program’s execution.) If you do find yourself needing this, it is your responsibility to heed LMDB’s documented caveats for closing a database.

Reading

readLMDB :: (MonadIO m, Mode mode) => Database mode -> Maybe (ReadOnlyTxn, Cursor) -> ReadOptions -> Unfold m Void (ByteString, ByteString)

Creates an unfold with which we can stream key-value pairs from the given database.

If an existing read-only transaction and cursor are not provided, a read-only transaction and cursor are automatically created and kept open for the duration of the unfold; we suggest doing this as a first option. However, if you find this to be a bottleneck (e.g., if you find upon profiling that significant time is being spent in mdb_txn_begin, or if you find yourself having to increase maxReaders in the environment’s limits because the transactions and cursors are not being garbage collected fast enough), consider precreating a transaction and cursor using beginReadOnlyTxn and openCursor.

In any case, bear in mind at all times LMDB’s caveats regarding long-lived transactions.

If you don’t want the overhead of intermediate ByteStrings (on your way to your eventual data structures), use unsafeReadLMDB instead.
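The following sketch streams every key-value pair of the unnamed database and prints it, letting readLMDB create its own transaction and cursor (the Nothing argument). It assumes streamly 0.9 or later, where Streamly.Data.Stream and Streamly.Data.Fold provide unfold, mapM, fold, and drain; older streamly versions use different module names. Because the unfold's seed type is Void, no real seed value exists; the sketch passes undefined on the assumption that the seed is never inspected. The path is hypothetical.

```haskell
import qualified Streamly.Data.Fold as F
import qualified Streamly.Data.Stream as S
import Streamly.External.LMDB
  ( Environment
  , Limits (..)
  , ReadWrite
  , defaultLimits
  , defaultReadOptions
  , getDatabase
  , gibibyte
  , openEnvironment
  , readLMDB
  )
import System.Directory (createDirectoryIfMissing)

main :: IO ()
main = do
  let path = "/tmp/example-lmdb" -- hypothetical location
  createDirectoryIfMissing True path
  env <-
    openEnvironment path defaultLimits {mapSize = gibibyte} ::
      IO (Environment ReadWrite)
  db <- getDatabase env Nothing

  -- Nothing: let readLMDB create and manage the transaction and cursor.
  -- The Void seed is assumed never to be evaluated, hence undefined.
  let pairs = S.unfold (readLMDB db Nothing defaultReadOptions) undefined

  -- Print each (key, value) pair as it is streamed out.
  S.fold F.drain (S.mapM print pairs)
```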

unsafeReadLMDB :: (MonadIO m, Mode mode) => Database mode -> Maybe (ReadOnlyTxn, Cursor) -> ReadOptions -> (CStringLen -> IO k) -> (CStringLen -> IO v) -> Unfold m Void (k, v)

Similar to readLMDB, except that the keys and values are not automatically converted into Haskell ByteStrings.

To ensure safety, make sure that the memory pointed to by the CStringLen for each key/value mapping function call is (a) only read (and not written to); and (b) not used after the mapping function has returned. One way to transform the CStringLens to your desired data structures is to use unsafePackCStringLen.
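As a sketch of mapping functions that respect both conditions, the example below copies each key into a fresh ByteString with packCStringLen (which copies the bytes, so nothing points into LMDB's memory afterwards) and reduces each value to its length without reading the bytes at all. The names copyKey and valueSize are hypothetical, the path is hypothetical, and the streamly module names assume streamly 0.9 or later.

```haskell
import Data.ByteString (ByteString, packCStringLen)
import Foreign.C.String (CStringLen)
import qualified Streamly.Data.Fold as F
import qualified Streamly.Data.Stream as S
import Streamly.External.LMDB
  ( Environment
  , Limits (..)
  , ReadWrite
  , defaultLimits
  , defaultReadOptions
  , getDatabase
  , gibibyte
  , openEnvironment
  , unsafeReadLMDB
  )
import System.Directory (createDirectoryIfMissing)

-- | Copies the key bytes, so the result stays valid after the function returns.
copyKey :: CStringLen -> IO ByteString
copyKey = packCStringLen

-- | Uses only the length; the pointed-to memory is neither read nor retained.
valueSize :: CStringLen -> IO Int
valueSize (_, len) = pure len

main :: IO ()
main = do
  let path = "/tmp/example-lmdb" -- hypothetical location
  createDirectoryIfMissing True path
  env <-
    openEnvironment path defaultLimits {mapSize = gibibyte} ::
      IO (Environment ReadWrite)
  db <- getDatabase env Nothing

  -- Streams (key, value size) pairs; the Void seed is assumed never to be evaluated.
  let sizes =
        S.unfold
          (unsafeReadLMDB db Nothing defaultReadOptions copyKey valueSize)
          undefined
  S.fold F.drain (S.mapM print sizes)
```

If unsafePackCStringLen is used instead, the resulting ByteString shares LMDB's memory, so it has to be fully consumed (e.g., parsed into your own structure) before the mapping function returns.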

Read-only transactions and cursors

data ReadOnlyTxn

beginReadOnlyTxn :: Environment mode -> IO ReadOnlyTxn

Begins an LMDB read-only transaction for use with readLMDB or unsafeReadLMDB. It is your responsibility to (a) use the transaction only on databases in the same environment, (b) make sure that those databases were already obtained before the transaction was begun, and (c) dispose of the transaction with abortReadOnlyTxn.

abortReadOnlyTxn :: ReadOnlyTxn -> IO ()

Disposes of a read-only transaction created with beginReadOnlyTxn.

data Cursor

openCursor :: ReadOnlyTxn -> Database mode -> IO Cursor

Opens a cursor for use with readLMDB or unsafeReadLMDB. It is your responsibility to (a) make sure the cursor only gets used by a single readLMDB or unsafeReadLMDB Unfold at the same time (to be safe, one can open a new cursor for every readLMDB or unsafeReadLMDB call), (b) make sure the provided database is within the environment on which the provided transaction was begun, and (c) dispose of the cursor with closeCursor (logically before abortReadOnlyTxn, although the order doesn’t really matter for read-only transactions).

closeCursor :: Cursor -> IO ()

Disposes of a cursor created with openCursor.
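One way to meet the disposal responsibilities listed above is to wrap the transaction and cursor in bracket, so that closeCursor and abortReadOnlyTxn run even if the reading code throws. This is only a sketch: withReadCursor is a hypothetical helper, not part of the library, the path is hypothetical, and the streamly module names assume streamly 0.9 or later.

```haskell
import Control.Exception (bracket)
import qualified Streamly.Data.Fold as F
import qualified Streamly.Data.Stream as S
import Streamly.External.LMDB
  ( Cursor
  , Database
  , Environment
  , Limits (..)
  , ReadOnlyTxn
  , ReadWrite
  , abortReadOnlyTxn
  , beginReadOnlyTxn
  , closeCursor
  , defaultLimits
  , defaultReadOptions
  , getDatabase
  , gibibyte
  , openCursor
  , openEnvironment
  , readLMDB
  )
import System.Directory (createDirectoryIfMissing)

-- | Hypothetical helper: run an action with a fresh read-only transaction and
-- cursor, closing the cursor first and then aborting the transaction.
withReadCursor ::
  Environment mode -> Database mode -> ((ReadOnlyTxn, Cursor) -> IO a) -> IO a
withReadCursor env db action =
  bracket (beginReadOnlyTxn env) abortReadOnlyTxn $ \txn ->
    bracket (openCursor txn db) closeCursor $ \cursor ->
      action (txn, cursor)

main :: IO ()
main = do
  let path = "/tmp/example-lmdb" -- hypothetical location
  createDirectoryIfMissing True path
  env <-
    openEnvironment path defaultLimits {mapSize = gibibyte} ::
      IO (Environment ReadWrite)
  -- The database is obtained before the transaction is begun.
  db <- getDatabase env Nothing

  -- The stream is consumed entirely inside the bracket, while the
  -- transaction and cursor are still alive.
  n <- withReadCursor env db $ \txnAndCursor ->
    S.fold
      F.length
      (S.unfold (readLMDB db (Just txnAndCursor) defaultReadOptions) undefined)
  print n
```

The cursor is used by a single unfold only, in line with condition (a) for openCursor.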

Read options

data ReadOptions

Constructors

ReadOptions

Fields

readDirection :: !ReadDirection
readStart :: !(Maybe ByteString)
If Nothing, a forward [backward] iteration starts at the beginning [end] of the database. Otherwise, it starts at the first key that is greater [less] than or equal to the Just key.
readUnsafeFFI :: !Bool
Use unsafe FFI calls under the hood. This can increase iteration speed, but one should bear in mind that unsafe FFI calls can have an adverse impact on the performance of the rest of the program (e.g., its ability to effectively spawn green threads).
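The start-position rule for readStart can be modeled with an ordinary ordered set. The sketch below is a pure model for illustration only, not the library's implementation; Dir, Fwd, Bwd, and visitOrder are hypothetical names standing in for the read direction and the resulting key order.

```haskell
import qualified Data.Set as Set

-- | Hypothetical stand-in for the read direction.
data Dir = Fwd | Bwd

-- | Model of the keys visited, in order, for a direction and optional start key.
visitOrder :: Ord k => Dir -> Maybe k -> Set.Set k -> [k]
visitOrder Fwd Nothing ks = Set.toAscList ks
visitOrder Bwd Nothing ks = Set.toDescList ks
-- Forward: begin at the first key greater than or equal to the start key.
visitOrder Fwd (Just s) ks = filter (>= s) (Set.toAscList ks)
-- Backward: begin at the first key less than or equal to the start key.
visitOrder Bwd (Just s) ks = filter (<= s) (Set.toDescList ks)

main :: IO ()
main = do
  let keys = Set.fromList ["a", "c", "e"] :: Set.Set String
  print (visitOrder Fwd (Just "b") keys)
  print (visitOrder Bwd (Just "d") keys)
  print (visitOrder Bwd Nothing keys)
```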

Instances

Show ReadOptions
Defined in Streamly.External.LMDB
Methods
- showsPrec :: Int -> ReadOptions -> ShowS
- show :: ReadOptions -> String
- showList :: [ReadOptions] -> ShowS

defaultReadOptions :: ReadOptions

By default, we start reading from the beginning of the database (i.e., from the smallest key), and we don’t use unsafe FFI calls.

data ReadDirection

Direction of key iteration.

Constructors

Forward
Backward

Instances

Show ReadDirection
Defined in Streamly.External.LMDB
Methods
- showsPrec :: Int -> ReadDirection -> ShowS
- show :: ReadDirection -> String
- showList :: [ReadDirection] -> ShowS

Writing

writeLMDB :: MonadIO m => Database ReadWrite -> WriteOptions -> Fold m (ByteString, ByteString) ()

Creates a fold with which we can stream key-value pairs into the given database.

It is the responsibility of the user to execute the fold on a bound thread. (LMDB requires a write transaction to stay on the OS thread that began it.)

The fold currently cannot be used with a scan. (The plan is for this shortcoming to be remedied with or after a future release of streamly that addresses the underlying issue.)

Please specify a suitable transaction size in the write options; the default of 1 (one write transaction for each key-value pair) could yield suboptimal performance. One could try, e.g., a number of pairs that puts roughly 100 KB of data into each transaction, and benchmark from there.
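A sketch of running the fold on a bound thread follows, using runInBoundThread from base as one possible way to do so (it needs the program to be linked with -threaded). The path, the sample pairs, and the transaction size of 1000 are hypothetical, and the streamly module name assumes streamly 0.9 or later.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Control.Concurrent (runInBoundThread)
import Data.ByteString (ByteString)
import qualified Streamly.Data.Stream as S
import Streamly.External.LMDB
  ( Environment
  , Limits (..)
  , ReadWrite
  , WriteOptions (..)
  , defaultLimits
  , defaultWriteOptions
  , getDatabase
  , gibibyte
  , openEnvironment
  , writeLMDB
  )
import System.Directory (createDirectoryIfMissing)

main :: IO ()
main = do
  let path = "/tmp/example-lmdb" -- hypothetical location
  createDirectoryIfMissing True path
  env <-
    openEnvironment path defaultLimits {mapSize = gibibyte} ::
      IO (Environment ReadWrite)
  db <- getDatabase env Nothing

  -- Sample data for illustration only.
  let pairs = [("apple", "red"), ("banana", "yellow")] :: [(ByteString, ByteString)]

  -- runInBoundThread makes sure the whole fold executes on a bound thread.
  runInBoundThread $
    S.fold
      (writeLMDB db defaultWriteOptions {writeTransactionSize = 1000})
      (S.fromList pairs)
```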

data WriteOptions

Constructors

WriteOptions

Fields

writeTransactionSize :: !Int
The number of key-value pairs per write transaction.
writeOverwriteOptions :: !OverwriteOptions
writeAppend :: !Bool
Assume the input data is already ordered. This allows the use of MDB_APPEND under the hood and substantially improves write performance. An exception will be thrown if the assumption about the ordering is not true.
writeUnsafeFFI :: !Bool
Use unsafe FFI calls under the hood. This can increase write performance, but one should bear in mind that unsafe FFI calls can have an adverse impact on the performance of the rest of the program (e.g., its ability to effectively spawn green threads).
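Before enabling writeAppend, it can be worth checking that the input really is ordered. The sketch below is a hypothetical pre-flight check, not part of the library. It requires strictly ascending keys, which is the conservative reading, and it assumes LMDB's default key comparison (lexicographic by bytes), which matches the Ord instance of ByteString.

```haskell
-- | Hypothetical check: are the keys of the pairs strictly ascending?
-- For ByteString keys, Ord is lexicographic by bytes, which is assumed to
-- match LMDB's default key order.
keysStrictlyAscending :: Ord k => [(k, v)] -> Bool
keysStrictlyAscending pairs = and (zipWith (<) keys (drop 1 keys))
  where
    keys = map fst pairs

main :: IO ()
main = do
  print (keysStrictlyAscending [("a", 1 :: Int), ("ab", 2), ("b", 3)])
  print (keysStrictlyAscending [("b", 1 :: Int), ("a", 2)])
```

Such a check consumes the input once, so for large streams it is more practical to sort upstream and rely on the exception as a safety net.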

defaultWriteOptions :: WriteOptions

By default, we use a write transaction size of 1 (one write transaction for each key-value pair), allow overwriting, don’t assume that the input data is already ordered, and don’t use unsafe FFI calls.
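Since writeTransactionSize counts key-value pairs while the 100 KB suggestion in the description of writeLMDB is given in bytes, a rough conversion can help pick a starting value. The helper pairsPerTransaction below is hypothetical, and the average pair size is something you would estimate from your own data.

```haskell
-- | Hypothetical helper: number of pairs per transaction for a target
-- transaction size in bytes and an estimated average pair size in bytes.
-- Always returns at least 1.
pairsPerTransaction :: Int -> Int -> Int
pairsPerTransaction targetBytes avgPairBytes
  | avgPairBytes <= 0 = 1
  | otherwise = max 1 (targetBytes `div` avgPairBytes)

main :: IO ()
main =
  -- About 100 KB per transaction with pairs averaging 250 bytes.
  print (pairsPerTransaction 100000 250)
```

The result is only a starting point for benchmarking, e.g. defaultWriteOptions { writeTransactionSize = pairsPerTransaction 100000 250 }.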

data OverwriteOptions

Constructors

OverwriteAllow	When a key reoccurs, overwrite the value.
OverwriteAllowSame	When a key reoccurs, throw an exception except when the value is the same.
OverwriteDisallow	When a key reoccurs, throw an exception.
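The three policies can be modeled on an in-memory map, with Left standing in for the exception. This is a pure model for illustration only, not the library's implementation; it assumes that "reoccurs" means the key is already present when a pair is written. The names Policy, AllowModel, AllowSameModel, DisallowModel, and putModel are hypothetical.

```haskell
import qualified Data.Map.Strict as Map

-- | Hypothetical stand-ins for the three overwrite policies.
data Policy = AllowModel | AllowSameModel | DisallowModel

-- | Model of writing one pair; Left represents the thrown exception.
putModel ::
  (Ord k, Eq v) => Policy -> k -> v -> Map.Map k v -> Either String (Map.Map k v)
putModel policy k v m =
  case (Map.lookup k m, policy) of
    -- A new key is always inserted.
    (Nothing, _) -> Right (Map.insert k v m)
    -- An existing key is overwritten.
    (Just _, AllowModel) -> Right (Map.insert k v m)
    -- An existing key is accepted only with an identical value.
    (Just old, AllowSameModel)
      | old == v -> Right m
      | otherwise -> Left "key exists with a different value"
    -- An existing key is always rejected.
    (Just _, DisallowModel) -> Left "key exists"

main :: IO ()
main = do
  let db = Map.fromList [("k", "v1")] :: Map.Map String String
  print (putModel AllowModel "k" "v2" db)
  print (putModel AllowSameModel "k" "v1" db)
  print (putModel AllowSameModel "k" "v2" db)
  print (putModel DisallowModel "k" "v1" db)
```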

Instances

Eq OverwriteOptions
Defined in Streamly.External.LMDB
Methods
- (==) :: OverwriteOptions -> OverwriteOptions -> Bool
- (/=) :: OverwriteOptions -> OverwriteOptions -> Bool

Error types

data LMDB_Error

Constructors

LMDB_Error

Fields

e_context :: String
e_description :: String
e_code :: Either Int MDB_ErrCode

Instances

Exception LMDB_Error
Defined in Streamly.External.LMDB.Internal.Foreign
Methods
- toException :: LMDB_Error -> SomeException
- fromException :: SomeException -> Maybe LMDB_Error
- displayException :: LMDB_Error -> String

Show LMDB_Error
Defined in Streamly.External.LMDB.Internal.Foreign
Methods
- showsPrec :: Int -> LMDB_Error -> ShowS
- show :: LMDB_Error -> String
- showList :: [LMDB_Error] -> ShowS
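Because LMDB_Error has an Exception instance, it can be caught with the usual Control.Exception functions. The sketch below formats an error from its three fields. The helper explain is hypothetical; reading Left in e_code as "a numeric code without a matching MDB_ErrCode constructor" is an assumption, as is the expectation that opening a nonexistent path throws an LMDB_Error rather than some other exception.

```haskell
import Control.Exception (try)
import Streamly.External.LMDB
  ( Environment
  , LMDB_Error (..)
  , MDB_ErrCode (..)
  , ReadWrite
  , defaultLimits
  , openEnvironment
  )

-- | Hypothetical helper: turn an LMDB_Error into a human-readable message.
explain :: LMDB_Error -> String
explain e =
  case e_code e of
    Right MDB_MAP_FULL ->
      "map full in " ++ e_context e ++ "; consider a larger mapSize"
    Right MDB_READERS_FULL ->
      "reader table full in " ++ e_context e ++ "; consider a larger maxReaders"
    Right code ->
      show code ++ " in " ++ e_context e ++ ": " ++ e_description e
    -- Assumed to be a numeric code with no MDB_ErrCode constructor.
    Left n ->
      "code " ++ show n ++ " in " ++ e_context e ++ ": " ++ e_description e

main :: IO ()
main = do
  -- Hypothetical path that does not exist, used to provoke a failure.
  result <-
    try (openEnvironment "/nonexistent/lmdb-path" defaultLimits) ::
      IO (Either LMDB_Error (Environment ReadWrite))
  case result of
    Left e -> putStrLn ("LMDB error: " ++ explain e)
    Right _ -> putStrLn "environment opened"
```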

data MDB_ErrCode

Constructors

MDB_KEYEXIST
MDB_NOTFOUND
MDB_PAGE_NOTFOUND
MDB_CORRUPTED
MDB_PANIC
MDB_VERSION_MISMATCH
MDB_INVALID
MDB_MAP_FULL
MDB_DBS_FULL
MDB_READERS_FULL
MDB_TLS_FULL
MDB_TXN_FULL
MDB_CURSOR_FULL
MDB_PAGE_FULL
MDB_MAP_RESIZED
MDB_INCOMPATIBLE
MDB_BAD_RSLOT
MDB_BAD_TXN
MDB_BAD_VALSIZE
MDB_BAD_DBI

Instances

Show MDB_ErrCode
Defined in Streamly.External.LMDB.Internal.Foreign
Methods
- showsPrec :: Int -> MDB_ErrCode -> ShowS
- show :: MDB_ErrCode -> String
- showList :: [MDB_ErrCode] -> ShowS

Eq MDB_ErrCode
Defined in Streamly.External.LMDB.Internal.Foreign
Methods
- (==) :: MDB_ErrCode -> MDB_ErrCode -> Bool
- (/=) :: MDB_ErrCode -> MDB_ErrCode -> Bool