outsort: External sorting package based on Conduit




An external (disk-backed) sorting package based on Conduit: it saves intermediate files to disk and later merges them all.



Properties

Versions 0.1.0
Change log ChangeLog
Dependencies async, base (>=4.7 && <5), bytestring, conduit, conduit-algorithms, conduit-combinators, conduit-extra, containers, deepseq, directory, exceptions, filemanip, filepath, MissingH, primitive, resourcet, safe, safeio, temporary, text, transformers, transformers-base, vector, vector-algorithms [details]
License MIT
Maintainer Luis Pedro Coelho <luis@luispedro.org>
Category Algorithms
Uploaded by luispedro at 2019-07-11T06:38:36Z



Readme for outsort-0.1.0


Outsort: generic (Haskell-based) external sorting

Example

    import qualified Data.Conduit.Combinators as CC
    import qualified Data.Conduit.Binary as CB

    import Algorithms.OutSort (isolateBySize)
    import Algorithms.SortMain (sortMain)

    main :: IO ()
    main = sortMain
        CB.lines
        CC.unlinesAscii
        (isolateBySize (const 1) 500000)

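All that is needed is a decoder (ConduitT ByteString a m ()), an encoder (ConduitT a ByteString m ()), and a function to split the input into blocks (ConduitT a a m ()). In the example above, CB.lines is the decoder, CC.unlinesAscii is the encoder, and isolateBySize (const 1) 500000 is the block splitter. Given these three elements, the result is a programme which can sort arbitrarily large inputs using external memory.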
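
The general idea behind external sorting (split the input into blocks, sort each block, then merge the sorted blocks) can be sketched in pure Haskell. This is only an in-memory illustration of the technique, not the package's implementation: the names chunksOf, merge2, mergeAll and blockSort are hypothetical, and plain lists stand in for the intermediate files that would be written to disk.

```haskell
import Data.List (sort)

-- Hypothetical helper: split a list into consecutive blocks of at most n elements.
chunksOf :: Int -> [a] -> [[a]]
chunksOf n xs
  | n <= 0    = error "chunksOf: block size must be positive"
  | null xs   = []
  | otherwise = let (block, rest) = splitAt n xs
                in block : chunksOf n rest

-- Merge two already-sorted lists into one sorted list.
merge2 :: Ord a => [a] -> [a] -> [a]
merge2 [] ys = ys
merge2 xs [] = xs
merge2 (x:xs) (y:ys)
  | x <= y    = x : merge2 xs (y:ys)
  | otherwise = y : merge2 (x:xs) ys

-- Merge any number of sorted lists (stand-ins for sorted intermediate files).
mergeAll :: Ord a => [[a]] -> [a]
mergeAll = foldr merge2 []

-- Sort each block independently, then merge the sorted blocks.
blockSort :: Ord a => Int -> [a] -> [a]
blockSort n = mergeAll . map sort . chunksOf n

main :: IO ()
main = print (blockSort 3 [9, 4, 7, 1, 8, 2, 6, 3, 5 :: Int])
-- prints [1,2,3,4,5,6,7,8,9]
```

In a disk-backed setting, only one block needs to fit in memory at a time, which is why the block-splitting function controls the memory footprint.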

Licence: MIT

Author: Luis Pedro Coelho (email: coelho@embl.de) (on twitter: @luispedrocoelho)