name: data-dispersal -- The package version. See the Haskell package versioning policy (PVP) -- for standards guiding when and how versions should be incremented. -- http://www.haskell.org/haskellwiki/Package_versioning_policy -- PVP summary: +-+------- breaking API changes -- | | +----- non-breaking API additions -- | | | +--- code changes with no API change version: 1.0.0.2 synopsis: Space-efficient and privacy-preserving data dispersal algorithms. description: Given a ByteString of length @D@, we encode the ByteString as a list of @n@ 'Fragment's, each containing a ByteString of length @O(D/m)@. Then, each fragment could be stored on a separate machine to obtain fault-tolerance: Even if all but @m@ of these machines crash, we can still reconstruct the original ByteString out of the remaining @m@ fragments. Note that the total space requirement of the @m@ fragments is @m * O(D/m)=O(D),@ which is clearly space-optimal. The total space required for the n fragments is @O((n/m)*D)@. Note that @m@ and @n@ can be chosen to be of the same order, so the asymptotic storage overhead for getting good fault-tolerance increases only by a constant factor. . /GHCi Example:/ . > > :m + Data.IDA > > let msg = Data.ByteString.Char8.pack "my really important data" > > let fragments = encode 5 15 msg > -- Now we could distributed the fragments on different sites to add some > -- fault-tolerance. > > let frags' = drop 5 $ take 10 fragments -- let's pretend that 10 machines crashed > -- Let's look at the 5 fragments that we have left: > > mapM_ (Prelude.putStrLn . show) frags' > (6,[273,771,899,737,285]) > (7,[289,939,612,285,936]) > (8,[424,781,1001,322,788]) > (9,[143,657,790,157,423]) > (10,[314,674,418,888,423]) > -- Space-efficiency: Note that the length of each of the 5 fragments is 5 > -- and our original message has length 24. > > decode frags' > "my really important data" . /Encrypted Fragments:/ . The module @Data.IDA@ contains an information dispersal algorithm that produces space-optimal fragments. However, the knowledge of 1 or more fragments might allow an adversary to deduce some information about the original data. The module @Crypto.IDA@ combines information dispersal with secret sharing: the knowledge of up to @m-1@ fragments does not leak any information about the original data. . This could be useful in scenarios where we need to store data at untrusted storage sites: To this end, we store one encrypted fragment at each site. If at most @m-1@ of these untrusted sites collude, they will still be unable to obtain any information about the original data. The added security comes at the price of a slightly increased fragment size (by an additional constant 32 bytes) and an additional overhead in the running time of the encoding/decoding process. The algorithm is fully described in module "Crypto.IDA". . /Fault-Tolerance:/ . Suppose that we have @N@ machines and encode our data as @2log(N)@ fragments with reconstruction threshold m = @log(N)@. Let's assume that we store each fragment on a separate machine and each machine fails (independently) with probability at most 0.5. . * What is the probability of our data being safe? @Pr[ at most n-m machines crash ] >= 1-0.5^(log(N)) = 1-N^(-1).@ . * What is the overhead in terms of space that we pay for this level of fault-tolerance? We have n fragments, each of size @O(D\/m)@, so the total space is @O(n D\/ m) = 2D.@ In other words, we can guarantee that the data survives with high probability by increasing the required space by a constant factor. . This library is based on the following works: . * \"Efficient Dispersal of Information for Security, Load Balancing, and Fault Tolerance\", by Michael O. Rabin, JACM 1989. . * \"How to share a secret.\" by Adi Shamir. In Communications of the ACM 22 (11): 612–613, 1979. . * \"Secret Sharing Made Short\" Hugo Krawczyk. CRYPTO 1993: 136-146 license: LGPL-2.1 license-file: LICENSE author: Peter Robinson maintainer: peter.robinson@monoid.at copyright: Peter Robinson 2014 category: Data, Cryptography build-type: Simple cabal-version: >=1.8 homepage: http://monoid.at/code library hs-source-dirs: src exposed-modules: Data.IDA Data.IDA.Internal Data.IDA.FiniteField Crypto.IDA build-depends: base >=4.6 && < 5 ,array >= 0.4.0.1 ,vector >= 0.10.11.0 ,binary >= 0.7.2.1 ,bytestring >= 0.10.0.2 ,syb >= 0.4.0 ,binary >= 0.5.1.1 ,finite-field >= 0.8.0 ,matrix >= 0.3.4.0 ,AES >= 0.2.9 ,entropy >= 0.3.2 ,secret-sharing >= 1.0.0.0 ghc-options: -W test-suite Main type: exitcode-stdio-1.0 x-uses-tf: true build-depends: base >= 4 && < 5 ,QuickCheck >= 2.4 ,test-framework >= 0.4.1 ,test-framework-quickcheck2 ,array >= 0.4.0.1 ,vector >= 0.10.11.0 ,spool >= 0.1 ,binary >= 0.7.2.1 ,bytestring >= 0.10.0.2 ,syb >= 0.4.0 hs-source-dirs: src, tests main-is: Tests.hs