hash-addressed: Hash-addressed file storage

[ apache, filesystem, hash, library ]

A simple system for maintaining a directory wherein each file's name is a hash of its content.



Downloads

Note: This package has metadata revisions in the cabal description newer than included in the tarball. To unpack the package including the revisions, use 'cabal get'.

Versions [RSS] 0.0.0.0, 0.0.1.0, 0.1.0.0, 0.2.0.0, 0.2.0.1
Change log changelog.md
Dependencies base (>=4.16 && <4.18), base16-bytestring (>=1.0.2 && <1.1), bytestring (>=0.11.3 && <0.12), cryptohash-sha256 (>=0.11.102 && <0.12), directory (>=1.3.6 && <1.4), filepath (>=1.4.2 && <1.5), gambler (>=0.0.0 && <0.1 || >=0.1.0 && <0.2 || >=0.2.0 && <0.3 || >=0.3.0 && <0.4 || >=0.4.0 && <0.5), mtl (>=2.2.2 && <2.3 || >=2.3.1 && <2.4), pipes (>=4.3.16 && <4.4), quaalude (>=0.0.0 && <0.1), resourcet (>=1.2.5 && <1.3 || >=1.3.0 && <1.4), temporary (>=1.3 && <1.4) [details]
License Apache-2.0
Copyright 2023 Mission Valley Software LLC
Author Chris Martin
Maintainer Chris Martin, Julie Moronuki
Revised Revision 6 made by chris_martin at 2023-05-01T21:37:53Z
Category Hash, Filesystem
Home page https://github.com/typeclasses/hash-addressed
Bug tracker https://github.com/typeclasses/hash-addressed/issues
Source repo head: git clone git://github.com/typeclasses/hash-addressed.git
Uploaded by chris_martin at 2023-02-09T22:07:22Z
Reverse Dependencies 2 direct, 0 indirect [details]
Downloads 226 total (13 in the last 30 days)
Status Docs available [build log]
Last success reported on 2023-02-09 [all 1 reports]

Readme for hash-addressed-0.2.0.1


hash-addressed is a simple system for maintaining a directory wherein each file's name is a hash of its content.

import HashAddressed.Directory
import HashAddressed.HashFunction
import qualified Data.ByteString.Lazy as Lazy
import qualified Data.ByteString as Strict
import qualified Pipes

Directory

First define a Directory value by specifying which hash function to use and the path of the directory in which the files shall be kept.

data Directory = Directory { directoryPath :: FilePath,
                             hashFunction :: HashFunction }

Presently the only supported hash function is sha256.

Ensure that directoryPath is the path of an existing directory. You can then write files into the directory using one of three write functions: writeLazy, writeStream, and writeExcept.
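As a minimal sketch of this setup step, the helper below creates the directory if it is missing and then builds a Directory record for it. The name openStore is hypothetical, and the sketch assumes that HashAddressed.HashFunction exports a value named sha256 and that the record fields are exported as shown above.

```haskell
import HashAddressed.Directory
import HashAddressed.HashFunction (sha256) -- assumed name of the SHA-256 value
import System.Directory (createDirectoryIfMissing)

-- | Hypothetical helper (not part of hash-addressed): make sure the
-- directory exists, then describe it as a SHA-256 hash-addressed store.
openStore :: FilePath -> IO Directory
openStore path = do
    -- True also creates any missing parent directories.
    createDirectoryIfMissing True path
    pure Directory{ directoryPath = path, hashFunction = sha256 }
```

Creating the directory up front is one way to satisfy the requirement that directoryPath already exists before the first write.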

writeLazy

writeLazy is the simplest to use; just give it a lazy ByteString.

writeLazy :: forall m. MonadIO m =>
    Directory -> Lazy.ByteString -> m WriteResult
data WriteResult = WriteResult{ hashAddressedFile :: FilePath,
                                writeType :: WriteType }
data WriteType = AlreadyPresent | NewContent

WriteResult gives you the path of the file in the store, including the path of the store itself. Because a hash-addressed store can never contain duplicate files, writing a file has no effect if the content is already present; the WriteType value indicates whether the file was actually written by this action or was present in the store already.
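The following sketch illustrates the behaviour described above by writing the same content twice and inspecting each WriteResult. The names describeWrite and writeTwice are hypothetical, and sha256 is assumed to be the value exported by HashAddressed.HashFunction.

```haskell
import HashAddressed.Directory
import HashAddressed.HashFunction (sha256) -- assumed name of the SHA-256 value
import qualified Data.ByteString.Lazy.Char8 as Lazy
import System.Directory (createDirectoryIfMissing)

-- | Hypothetical helper: render a WriteResult as a human-readable line.
describeWrite :: WriteResult -> String
describeWrite result =
    case writeType result of
        NewContent     -> "stored new file at " <> hashAddressedFile result
        AlreadyPresent -> "already present at " <> hashAddressedFile result

-- | Hypothetical demo: write identical content twice into the store at
-- the given path and return both results.
writeTwice :: FilePath -> IO (WriteResult, WriteResult)
writeTwice path = do
    createDirectoryIfMissing True path
    let dir = Directory{ directoryPath = path, hashFunction = sha256 }
    first  <- writeLazy dir (Lazy.pack "hello")
    second <- writeLazy dir (Lazy.pack "hello")
    pure (first, second)
```

If the store did not already hold this content, the first result should report NewContent and the second AlreadyPresent, and both should carry the same hashAddressedFile path.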

writeStream

The limitation of writeLazy is that it cannot stream its input. writeStream addresses this by taking a pipes Producer as the content. The producer can perform IO while generating the stream (for example, it might read byte strings from a network socket). The producer can also return a value (the commit type parameter), which is returned alongside the WriteResult.

writeStream :: forall commit m. MonadIO m =>
    Directory
    -> Pipes.Producer Strict.ByteString IO commit
    -> m (commit, WriteResult)
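A small sketch of a producer and its use with writeStream follows. The names chunks and storeChunks are hypothetical; the producer yields three fixed chunks and returns how many it yielded as its commit value. A real producer would more likely read from a file handle or socket.

```haskell
import HashAddressed.Directory
import Data.Foldable (for_)
import qualified Data.ByteString.Char8 as Strict
import qualified Pipes

-- | Hypothetical producer: yields three chunks, then returns the number
-- of chunks as the commit value.
chunks :: Pipes.Producer Strict.ByteString IO Int
chunks = do
    let parts = map Strict.pack ["alpha ", "beta ", "gamma"]
    for_ parts Pipes.yield
    pure (length parts)

-- | Hypothetical wrapper: stream the chunks into the given store and
-- return the chunk count together with the WriteResult.
storeChunks :: Directory -> IO (Int, WriteResult)
storeChunks dir = writeStream dir chunks
```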

All operations that write into a hash-addressed Directory are performed by first writing the content somewhere within the system temporary directory and then moving the file to its target location. This ensures that the store never makes visible the results of a partial write. If the producer throws an exception, everything written so far will be deleted and no content will be written to the Directory.
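The write-then-move technique can be sketched with only the base, bytestring, directory, and filepath packages. This is an illustration of the general idea, not the library's implementation: writeAtomically is a hypothetical helper, the file name is supplied by the caller rather than computed from a hash, and the staging directory is a parameter instead of the system temporary directory. Note that renameFile may fail if the staging directory and the store are on different filesystems.

```haskell
import Control.Exception (onException)
import qualified Data.ByteString as Strict
import System.Directory
    (createDirectoryIfMissing, doesFileExist, removeFile, renameFile)
import System.FilePath ((</>))
import System.IO (hClose, openBinaryTempFile)

-- | Hypothetical helper: write content to a temporary file in the
-- staging directory, then move it into the store under the given name.
-- Returns False if a file with that name was already present.
writeAtomically
    :: FilePath -> FilePath -> FilePath -> Strict.ByteString -> IO Bool
writeAtomically stagingDir storeDir name content = do
    createDirectoryIfMissing True stagingDir
    createDirectoryIfMissing True storeDir
    let target = storeDir </> name
    present <- doesFileExist target
    if present
        then pure False
        else do
            (tmpPath, handle) <- openBinaryTempFile stagingDir "partial.tmp"
            -- On any failure, close the handle and delete the partial file.
            let cleanup = hClose handle >> removeFile tmpPath
            flip onException cleanup $ do
                Strict.hPut handle content
                hClose handle
                -- The file appears in the store only once fully written.
                renameFile tmpPath target
            pure True
```

Because the target name only comes into existence through the final rename, a reader of the store directory never observes a half-written file under that name.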

writeExcept

If there is some interesting way in which your stream can fail, you may wish to use writeExcept instead. In this variant, the producer returns an Either abort commit indicating whether the result should be committed to the store. Return a Left result to signal that an error has occurred. The writeExcept action will then throw the abort value into a MonadError context.

writeExcept :: forall abort commit m. (MonadIO m, MonadError abort m) =>
    Directory -> Pipes.Producer Strict.ByteString IO (Either abort commit)
    -> m (commit, WriteResult)
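As a sketch of how writeExcept might be used, the producer below aborts when the stream exceeds a size limit, and the caller runs the action in ExceptT to recover the abort value as a Left. The names TooLarge, limited, and storeLimited are hypothetical and chosen only for illustration.

```haskell
import Control.Monad.Except (ExceptT, runExceptT)
import HashAddressed.Directory
import qualified Data.ByteString.Char8 as Strict
import qualified Pipes

-- | Hypothetical abort value: the byte count that exceeded the limit.
data TooLarge = TooLarge Int

-- | Hypothetical producer: yields chunks until the running total would
-- exceed the limit, then aborts. On success it returns the total size.
limited
    :: Int
    -> [Strict.ByteString]
    -> Pipes.Producer Strict.ByteString IO (Either TooLarge Int)
limited limit = go 0
  where
    go total [] = pure (Right total)
    go total (chunk : rest)
        | total' > limit = pure (Left (TooLarge total'))
        | otherwise      = Pipes.yield chunk >> go total' rest
      where
        total' = total + Strict.length chunk

-- | Run writeExcept in ExceptT so that an abort surfaces as a Left.
storeLimited
    :: Directory
    -> Int
    -> [Strict.ByteString]
    -> IO (Either TooLarge (Int, WriteResult))
storeLimited dir limit input =
    runExceptT (writeExcept dir (limited limit input))
```

Using ExceptT here is just one choice of MonadError instance; any monad that satisfies both MonadIO and MonadError abort would serve.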