blaze-builder-0.4.0.0: Efficient buffered output.

Copyright(c) 2013 Leon P Smith
LicenseBSD3
MaintainerLeon P Smith <leon@melding-monads.com>
Stabilityexperimental
Safe HaskellNone
LanguageHaskell98

Blaze.ByteString.Builder

Contents

Description

Blaze.ByteString.Builder is the main module, which you should import as a user of the blaze-builder library.

import Blaze.ByteString.Builder

It provides you with a type Builder that allows to efficiently construct lazy bytestrings with a large average chunk size.

Intuitively, a Builder denotes the construction of a part of a lazy bytestring. Builders can either be created using one of the primitive combinators in Blaze.ByteString.Builder.Write or by using one of the predefined combinators for standard Haskell values (see the exposed modules of this package). Concatenation of builders is done using mappend from the Monoid typeclass.

Here is a small example that serializes a list of strings using the UTF-8 encoding.

 import Blaze.ByteString.Builder.Char.Utf8
strings :: [String]
strings = replicate 10000 "Hello there!"

The function fromString creates a Builder denoting the UTF-8 encoded argument. Hence, UTF-8 encoding and concatenating all strings can be done follows.

concatenation :: Builder
concatenation = mconcat $ map fromString strings

The function toLazyByteString can be used to execute a Builder and obtain the resulting lazy bytestring.

result :: L.ByteString
result = toLazyByteString concatenation

The result is a lazy bytestring containing 10000 repetitions of the string "Hello there!" encoded using UTF-8. The corresponding 120000 bytes are distributed among three chunks of 32kb and a last chunk of 6kb.

A note on history. This serialization library was inspired by the Data.Binary.Builder module provided by the binary package. It was originally developed with the specific needs of the blaze-html package in mind. Since then it has been restructured to serve as a drop-in replacement for Data.Binary.Builder, which it improves upon both in speed as well as expressivity.

Synopsis

The Builder type

data Builder :: *

Builders denote sequences of bytes. They are Monoids where mempty is the zero-length sequence and mappend is concatenation, which runs in O(1).

Creating builders

flush :: Builder

Flush the current buffer. This introduces a chunk boundary.

Executing builders

toLazyByteString :: Builder -> ByteString

Execute a Builder and return the generated chunks as a lazy ByteString. The work is performed lazy, i.e., only when a chunk of the lazy ByteString is forced.

toLazyByteStringWith Source

Arguments

:: Int

Buffer size (upper-bounds the resulting chunk size).

-> Int

This parameter is ignored as of blaze-builder-0.4

-> Int

Size of the first buffer to be used and copied for larger resulting sequences

-> Builder

Builder to run.

-> ByteString

Lazy bytestring to output after the builder is finished.

-> ByteString

Resulting lazy bytestring

Run a Builder with the given buffer sizes.

Use this function for integrating the Builder type with other libraries that generate lazy bytestrings.

Note that the builders should guarantee that on average the desired chunk size is attained. Builders may decide to start a new buffer and not completely fill the existing buffer, if this is faster. However, they should not spill too much of the buffer, if they cannot compensate for it.

FIXME: Note that the following paragraphs are not entirely correct as of blaze-builder-0.4:

A call toLazyByteStringWith bufSize minBufSize firstBufSize will generate a lazy bytestring according to the following strategy. First, we allocate a buffer of size firstBufSize and start filling it. If it overflows, we allocate a buffer of size minBufSize and copy the first buffer to it in order to avoid generating a too small chunk. Finally, every next buffer will be of size bufSize. This, slow startup strategy is required to achieve good speed for short (<200 bytes) resulting bytestrings, as for them the allocation cost is of a large buffer cannot be compensated. Moreover, this strategy also allows us to avoid spilling too much memory for short resulting bytestrings.

Note that setting firstBufSize >= minBufSize implies that the first buffer is no longer copied but allocated and filled directly. Hence, setting firstBufSize = bufSize means that all chunks will use an underlying buffer of size bufSize. This is recommended, if you know that you always output more than minBufSize bytes.

toByteString :: Builder -> ByteString Source

Run the builder to construct a strict bytestring containing the sequence of bytes denoted by the builder. This is done by first serializing to a lazy bytestring and then packing its chunks to a appropriately sized strict bytestring.

toByteString = packChunks . toLazyByteString

Note that toByteString is a Monoid homomorphism.

toByteString mempty          == mempty
toByteString (x `mappend` y) == toByteString x `mappend` toByteString y

However, in the second equation, the left-hand-side is generally faster to execute.

toByteStringIO :: (ByteString -> IO ()) -> Builder -> IO () Source

toByteStringIOWith bufSize io b runs the builder b with a buffer of at least the size bufSize and executes the IO action io whenever the buffer is full.

Compared to toLazyByteStringWith this function requires less allocation, as the output buffer is only allocated once at the start of the serialization and whenever something bigger than the current buffer size has to be copied into the buffer, which should happen very seldomly for the default buffer size of 32kb. Hence, the pressure on the garbage collector is reduced, which can be an advantage when building long sequences of bytes.

toByteStringIOWith Source

Arguments

:: Int

Buffer size (upper bounds the number of bytes forced per call to the IO action).

-> (ByteString -> IO ())

IO action to execute per full buffer, which is referenced by a strict ByteString.

-> Builder

Builder to run.

-> IO ()

Resulting IO action.

Writes

data Write Source

A write of a bounded number of bytes.

When defining a function write :: a -> Write for some a, then it is important to ensure that the bound on the number of bytes written is data-independent. Formally,

 forall x y. getBound (write x) = getBound (write y)

The idea is that this data-independent bound is specified such that the compiler can optimize the check, if there are enough free bytes in the buffer, to a single subtraction between the pointer to the next free byte and the pointer to the end of the buffer with this constant bound of the maximal number of bytes to be written.

Instances

fromWrite :: Write -> Builder Source

Create a builder that execute a single Write.

fromWriteList :: (a -> Write) -> [a] -> Builder Source

Construct a Builder writing a list of data one element at a time.

writeToByteString :: Write -> ByteString Source

Run a Write to produce a strict ByteString. This is equivalent to (toByteString . fromWrite), but is more efficient because it uses just one appropriately-sized buffer.

Writing Storables

writeStorable :: Storable a => a -> Write Source

Write a storable value.

fromStorable :: Storable a => a -> Builder Source

A builder that serializes a storable value. No alignment is done.

fromStorables :: Storable a => [a] -> Builder Source

A builder that serializes a list of storable values by writing them consecutively. No alignment is done. Parsing information needs to be provided externally.