tokenizer-streaming: A variant of tokenizer-monad that supports streaming.

This monad transformer is a modification of tokenizer-monad that can work on streams of text/string chunks or even on (Unicode) bytestring streams.


Properties

Versions 0.1.0.0, 0.1.0.1
Change log CHANGELOG.md
Dependencies base (>=4.9 && <5.0), bytestring, mtl, streaming, streaming-bytestring (>=0.1.6), streaming-commons (>=0.2.1.0 && <0.3), text, tokenizer-monad (>=0.2.2.0 && <0.3) [details]
License GPL-3.0-only
Copyright (c) 2019 Enum Cohrs
Author Enum Cohrs
Maintainer darcs@enumeration.eu
Category Text
Source repo head: darcs get https://hub.darcs.net/enum/tokenizer-streaming
Uploaded by implementation at 2019-01-22T21:39:10Z

Readme for tokenizer-streaming-0.1.0.1

tokenizer-streaming

Motivation: You might have stumbled upon the package tokenizer-monad. It is another project of mine, for writing tokenizers that act on pure text/strings. However, there are situations where you cannot keep all the text in memory, for example when you want to tokenize text from network streams or from large corpus files.

Main idea: A monad transformer called TokenizerT implements exactly the same methods as Tokenizer from tokenizer-monad, so that all tokenizers can be ported without code changes (provided you used MonadTokenizer in the type signatures).
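
To illustrate why a class constraint in the type signature makes a tokenizer portable, here is a minimal, self-contained sketch that uses only base. It does not use the actual API of tokenizer-monad or tokenizer-streaming: the class MonadTok, its methods popChar and emitTok, and the types St, Whole and Chunked are hypothetical stand-ins invented for this example. It also keeps all chunks in a list, so it only demonstrates the porting idea, not real streaming.

```haskell
{-# LANGUAGE FlexibleInstances #-}
module Main where

import Data.Char (isSpace)

-- Hypothetical stand-in for a MonadTokenizer-style class (NOT the real API).
class Monad m => MonadTok m where
  popChar :: m (Maybe Char)  -- consume one character; Nothing at end of input
  emitTok :: String -> m ()  -- record a finished token

-- A whitespace tokenizer written only against the class constraint.
wordsTok :: MonadTok m => m ()
wordsTok = go ""
  where
    go acc = popChar >>= \mc -> case mc of
      Nothing -> flush acc
      Just c
        | isSpace c -> flush acc >> go ""
        | otherwise -> go (c : acc)
    flush ""  = pure ()
    flush acc = emitTok (reverse acc)

-- Minimal state monad, so that the sketch needs nothing beyond base.
newtype St s a = St { runSt :: s -> (a, s) }

instance Functor (St s) where
  fmap f (St g) = St $ \s -> let (a, s') = g s in (f a, s')

instance Applicative (St s) where
  pure a = St $ \s -> (a, s)
  St f <*> St g = St $ \s ->
    let (h, s1) = f s
        (a, s2) = g s1
    in (h a, s2)

instance Monad (St s) where
  St g >>= k = St $ \s -> let (a, s') = g s in runSt (k a) s'

-- Backend 1: the whole input is in memory.
-- Fields: remaining input, emitted tokens (in reverse order).
data Whole = Whole String [String]

instance MonadTok (St Whole) where
  popChar = St $ \(Whole inp out) -> case inp of
    []     -> (Nothing, Whole [] out)
    c : cs -> (Just c, Whole cs out)
  emitTok t = St $ \(Whole inp out) -> ((), Whole inp (t : out))

-- Backend 2: the input arrives as a sequence of chunks.
-- Fields: remaining chunks, emitted tokens (in reverse order).
data Chunked = Chunked [String] [String]

instance MonadTok (St Chunked) where
  popChar = St step
    where
      step (Chunked [] out)                = (Nothing, Chunked [] out)
      step (Chunked ([] : rest) out)       = step (Chunked rest out)
      step (Chunked ((c : cs) : rest) out) = (Just c, Chunked (cs : rest) out)
  emitTok t = St $ \(Chunked inp out) -> ((), Chunked inp (t : out))

tokenizeWhole :: String -> [String]
tokenizeWhole s = case runSt wordsTok (Whole s []) of
  (_, Whole _ out) -> reverse out

tokenizeChunks :: [String] -> [String]
tokenizeChunks cs = case runSt wordsTok (Chunked cs []) of
  (_, Chunked _ out) -> reverse out

main :: IO ()
main = do
  print (tokenizeWhole "hello streaming world")
  print (tokenizeChunks ["hel", "lo stre", "aming wor", "ld"])
```

Both calls in main run the same wordsTok definition and yield the same tokens, even though a word may span a chunk boundary in the second case. Only the instance that supplies the input differs, which is the property the paragraph above relies on.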

Supported text types