json-tokens: Tokenize JSON
Convert JSON to a token stream. This library focuses on
high performance and minimal allocations. It differs from `aeson`
in the following ways:
- In `aeson`, `decode` parses JSON by building an AST that resembles the ABNF given in RFC 7159. Notably, this converts every JSON `object` to a `HashMap`. (This choice of intermediate data structure may not be appropriate, depending on how the user wants to interpret the `object`.) By contrast, `json-tokens` converts a document to a token sequence.
- For numbers, `aeson` uses `scientific`, but `json-tokens` uses `scientific-notation`. Although `scientific` and `scientific-notation` have similar APIs, `scientific-notation` includes a parser that is about 4x faster and conversion functions that are 10x faster than those found in `scientific` and `aeson`.
- For text, `aeson` uses the UTF-16-backed `text` library, but `json-tokens` uses the UTF-8-backed `text-short` library.
- Parsing is resumable in `aeson`, which uses `attoparsec`, but not in `json-tokens`, which uses `bytesmith`.
- In `aeson`, all batteries are included. In particular, the combination of typeclasses and GHC Generics (or Template Haskell) makes it possible to elide lots of boilerplate. None of these are included in `json-tokens`.
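To make the contrast between an AST and a token sequence concrete, here is a minimal sketch of what tokenizing means. It is not the `json-tokens` API: the `Token` type and the `tokenize` function below are hypothetical, work on `String` rather than bytes, ignore escape sequences, and keep numbers as raw text. The point is only that the output is a flat sequence with no `HashMap` and no nesting.

```haskell
import Data.Char (isDigit, isSpace)
import Data.List (stripPrefix)

-- Hypothetical token type, for illustration only.
-- It is not the definition used by json-tokens.
data Token
  = LeftBrace | RightBrace | LeftBracket | RightBracket
  | Comma | Colon
  | TString String
  | TNumber String -- raw characters; a real tokenizer would parse the number
  | TTrue | TFalse | TNull
  deriving (Eq, Show)

-- Toy tokenizer. Returns Nothing on input it does not understand.
-- Simplifications: no escape sequences in strings, no validation
-- of number syntax.
tokenize :: String -> Maybe [Token]
tokenize [] = Just []
tokenize (c : cs)
  | isSpace c = tokenize cs
  | c == '{' = (LeftBrace :) <$> tokenize cs
  | c == '}' = (RightBrace :) <$> tokenize cs
  | c == '[' = (LeftBracket :) <$> tokenize cs
  | c == ']' = (RightBracket :) <$> tokenize cs
  | c == ',' = (Comma :) <$> tokenize cs
  | c == ':' = (Colon :) <$> tokenize cs
  | c == '"' = case break (== '"') cs of
      (s, '"' : rest) -> (TString s :) <$> tokenize rest
      _ -> Nothing -- unterminated string
  | isDigit c || c == '-' =
      let (ds, rest) = span isNumChar cs
       in (TNumber (c : ds) :) <$> tokenize rest
  | otherwise = keyword (c : cs)
  where
    isNumChar x = isDigit x || x `elem` ".eE+-"

-- Recognize the three JSON literals.
keyword :: String -> Maybe [Token]
keyword s
  | Just rest <- stripPrefix "true" s = (TTrue :) <$> tokenize rest
  | Just rest <- stripPrefix "false" s = (TFalse :) <$> tokenize rest
  | Just rest <- stripPrefix "null" s = (TNull :) <$> tokenize rest
  | otherwise = Nothing
```

Note that the tokenizer does not check that braces balance or that keys are followed by colons. Interpreting the structure is left to whoever consumes the tokens.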
These design decisions let solutions built on `json-tokens` decode
JSON about twice as fast as solutions built on `aeson`: in the
`zeek-json` benchmark suite, a `json-tokens`-based decoder
outperforms `aeson`'s `decode` by a factor of two. This speed
comes at a cost. Users must write more code to use `json-tokens`
than they do for `aeson`.
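As a rough illustration of that extra code, the sketch below decodes a flat two-field object from a token list by hand, accepting the keys in either order. Everything in it is an assumption made for the example: the `Token` type, the `Person` record, and `decodePerson` are hypothetical and are not part of `json-tokens`. With `aeson`, the equivalent decoder could typically be derived.

```haskell
import Text.Read (readMaybe)

-- Hypothetical token type, for illustration only.
data Token
  = LeftBrace | RightBrace | LeftBracket | RightBracket
  | Comma | Colon
  | TString String
  | TNumber String
  | TTrue | TFalse | TNull
  deriving (Eq, Show)

-- Hypothetical target type.
data Person = Person { name :: String, age :: Int }
  deriving (Eq, Show)

-- Decode an object such as {"age": 30, "name": "Ada"}.
-- Both keys are required and may appear in either order.
-- If a key is repeated, the last occurrence wins.
decodePerson :: [Token] -> Maybe Person
decodePerson (LeftBrace : ts) = go Nothing Nothing ts
  where
    -- Consume one key/value pair, or finish at the closing brace.
    go mn ma toks = case toks of
      [RightBrace] -> Person <$> mn <*> ma
      TString "name" : Colon : TString n : rest -> next (Just n) ma rest
      TString "age" : Colon : TNumber d : rest -> do
        a <- readMaybe d
        next mn (Just a) rest
      _ -> Nothing -- unknown key, wrong value type, or malformed input

    -- After a pair, expect a comma followed by another key,
    -- or the closing brace.
    next mn ma (Comma : rest@(TString _ : _)) = go mn ma rest
    next mn ma rest@[RightBrace] = go mn ma rest
    next _ _ _ = Nothing
decodePerson _ = Nothing
```

Every new field adds another case to `go`, which is the kind of boilerplate that typeclass-driven decoding removes.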
If high-throughput parsing of small JSON documents is paramount,
this cost may be worth bearing. It is always possible to go a
step further and forgo tokenization entirely, parsing the
desired Haskell data type directly from a byte sequence. Doing this
in a low-allocation way while retaining both the ability to
handle JSON `object` keys in any order and the ability to handle
escape sequences in `object` keys is fiendishly difficult. Kudos
to the brave soul who goes down that path. For the rest of us,
`json-tokens` is a compromise worth considering.
Downloads
- json-tokens-0.1.0.1.tar.gz [browse] (Cabal source package)
- Package description (as included in the package)
Versions | 0.1.0.0, 0.1.0.1 |
---|---|
Change log | CHANGELOG.md |
Dependencies | array-builder (>=0.1 && <0.2), array-chunks (>=0.1.1 && <0.2), base (>=4.12 && <5), byteslice (>=0.1.3 && <0.2), bytesmith (>=0.3 && <0.4), bytestring (>=0.10.8 && <0.11), primitive (>=0.7 && <0.8), scientific-notation (>=0.1 && <0.2), text-short (>=0.1.3 && <0.2) [details] |
License | BSD-3-Clause |
Copyright | 2019 Andrew Martin |
Author | Andrew Martin |
Maintainer | andrew.thaddeus@gmail.com |
Category | Data |
Home page | https://github.com/andrewthad/json-tokens |
Bug tracker | https://github.com/andrewthad/json-tokens/issues |
Uploaded | by andrewthad at 2019-09-30T12:33:17Z |
Downloads | 840 total (11 in the last 30 days) |
Status | Docs available [build log] Last success reported on 2019-09-30 [all 1 reports] |