Portability | portable |
---|---|
Stability | experimental |
Maintainer | Bryan O'Sullivan <bos@serpentine.com> |
Efficiently and correctly parse a JSON string. The string must be encoded as UTF-8.
It can be useful to think of parsing as occurring in two phases:
- Identification of the textual boundaries of a JSON value. This is always strict, so that an invalid JSON document can be rejected as soon as possible.
- Conversion of a JSON value to a Haskell value. This may be either immediate (strict) or deferred (lazy); see below for details.
The question of whether to choose a lazy or strict parser is subtle, but it can have significant performance implications, resulting in changes in CPU use and memory footprint of 30% to 50%, or occasionally more. Measure the performance of your application with each!
Lazy parsers
The `json` and `value` parsers decouple identification from conversion. Identification occurs immediately (so that an invalid JSON document can be rejected as early as possible), but conversion to a Haskell value is deferred until that value is needed.
This decoupling can be time-efficient if only a small subset of the elements in a JSON value needs to be inspected, since the cost of conversion is zero for uninspected elements. The trade-off is an increase in memory usage, due to allocation of thunks for values that have not yet been converted.
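As a rough illustration of inspecting only a small part of a document, the sketch below parses a top-level array with `json` and looks at its first element only. It assumes the attoparsec, bytestring and vector packages are available alongside aeson (in recent releases this module may live in the separate attoparsec-aeson package). `firstElement` is a hypothetical helper written for this example, not part of the library.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson (Value (..))
import Data.Aeson.Parser (json)
import Data.Attoparsec.ByteString (parseOnly)
import qualified Data.ByteString.Char8 as B
import qualified Data.Vector as V

-- | Hypothetical helper: parse a top-level array with the lazy parser
-- and return only its first element, if there is one.
firstElement :: B.ByteString -> Either String (Maybe Value)
firstElement input = do
  parsed <- parseOnly json input          -- the whole document is identified here
  case parsed of
    Array items -> Right (items V.!? 0)   -- only this element is handed back
    _           -> Left "expected a top-level array"

main :: IO ()
main = print (firstElement "[\"only this\", {\"big\": [1, 2, 3]}, 4.5]")
```

Whether the deferred conversion of the remaining elements actually pays off depends on the workload, so it is worth measuring against the strict parser.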
Parse a top-level JSON value. This must be either an object or an array, per RFC 4627.
The conversion of a parsed value to a Haskell value is deferred until the Haskell value is needed. This may improve performance if only a subset of the results of conversions are needed, but at a cost in thunk allocation.
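A minimal sketch of running `json` over a strict `ByteString` with attoparsec's `parseOnly`, assuming the attoparsec and bytestring packages are installed. `classify` is a hypothetical helper for this example. The inputs are plain ASCII, which is also valid UTF-8.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson (Value (..))
import Data.Aeson.Parser (json)
import Data.Attoparsec.ByteString (parseOnly)
import qualified Data.ByteString.Char8 as B

-- | Hypothetical helper: report what kind of top-level document was parsed.
classify :: B.ByteString -> String
classify input =
  case parseOnly json input of
    Left _           -> "rejected"      -- malformed or truncated input
    Right (Object _) -> "object"
    Right (Array _)  -> "array"
    Right _          -> "other value"

main :: IO ()
main = mapM_ (putStrLn . classify)
  [ "{\"name\": \"aeson\"}"   -- well-formed object
  , "[1, 2, 3]"               -- well-formed array
  , "{\"name\": "             -- truncated document
  ]
```

The truncated document is rejected during identification, before any conversion is attempted.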
Parse any JSON value. You should usually use `json` in preference to this function, as this function relaxes the object-or-array requirement of RFC 4627.
In particular, be careful in using this function if you think your code might interoperate with JavaScript. A naïve JavaScript library that parses JSON data using `eval` is vulnerable to attack unless the encoded data represents an object or an array. JSON implementations in other languages conform to that same restriction to preserve interoperability and security.
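To make the relaxed requirement concrete, the sketch below feeds bare scalars to `value`. It assumes the attoparsec and bytestring packages. `kind` is a hypothetical helper for this example.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson (Value (..))
import Data.Aeson.Parser (value)
import Data.Attoparsec.ByteString (parseOnly)
import qualified Data.ByteString.Char8 as B

-- | Hypothetical helper: name the kind of JSON value that was parsed.
kind :: B.ByteString -> String
kind input =
  case parseOnly value input of
    Left _           -> "rejected"
    Right (Object _) -> "object"
    Right (Array _)  -> "array"
    Right (String _) -> "string"
    Right (Number _) -> "number"
    Right (Bool _)   -> "boolean"
    Right Null       -> "null"

main :: IO ()
main = mapM_ (putStrLn . kind)
  [ "42"          -- a bare number, not a valid RFC 4627 document
  , "\"hello\""   -- a bare string
  , "true"        -- a bare boolean
  , "[1, 2]"      -- arrays and objects are accepted as well
  ]
```

If the data may be consumed by other JSON implementations, checking that the result is an `Object` or an `Array` before passing it on keeps the stricter guarantee.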
Strict parsers
The `json'` and `value'` parsers combine identification with conversion. They consume more CPU cycles up front, but have a smaller memory footprint.
Parse a top-level JSON value. This must be either an object or an array, per RFC 4627.
This is a strict version of `json` which avoids building up thunks during parsing; it performs all conversions immediately. Prefer this version if most of the JSON data needs to be accessed.
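The strict and lazy parsers differ in when conversion happens, not in what they return. The sketch below, which assumes the attoparsec and bytestring packages, runs both over the same well-formed input and compares the results. `sameResult` is a hypothetical helper for this example.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson.Parser (json, json')
import Data.Attoparsec.ByteString (parseOnly)
import qualified Data.ByteString.Char8 as B

-- | Hypothetical helper: do the strict and lazy parsers agree on this input?
-- Comparing with (==) forces the lazily parsed value as a side effect.
sameResult :: B.ByteString -> Bool
sameResult input = parseOnly json' input == parseOnly json input

main :: IO ()
main = print (sameResult "{\"items\": [1, 2, 3], \"ok\": true}")
```

Because the two are interchangeable at the type level, switching between them to compare CPU use and memory footprint is a one-identifier change.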