par-dual: ParDual class for Parallel <-> Sequential

This is a package candidate release! Here you can preview how this package release will appear once published to the main package index (which can be accomplished via the 'maintain' link below). Please note that once a package has been published to the main package index it cannot be undone! Please consult the package uploading documentation for more information.

[maintain] [Publish]

Defines a ParDual class for a Parallel - Sequential relationship


[Skip to Readme]

Properties

Versions 0.1.0.0, 0.1.0.0
Change log None available
Dependencies async (>=2.2.2 && <2.3), base (>=4.13.0 && <4.14), validators (>=0.0.1 && <0.1) [details]
License Apache-2.0
Copyright 2020 - Gabriel Volpe
Author Gabriel Volpe
Maintainer volpegabriel@gmail.com
Category Library
Home page https://github.com/gvolpe/par-dual
Bug tracker https://github.com/gvolpe/par-dual/issues
Uploaded by gvolpe at 2020-04-29T09:30:22Z

Modules

[Index] [Quick Jump]

Downloads

Maintainer's Corner

Package maintainers

For package maintainers and hackage trustees


Readme for par-dual-0.1.0.0

[back to package description]

par-dual

CI Status

The PureScript language defines a Parallel typeclass in the parallel package. Quoting its documentation:

The Parallel class abstracts over monads which support parallel composition via some related Applicative.

The same typeclass is defined in the Scala language, as part of the Cats library.

This typeclass has been controversial, in a sense, for not having strong laws. However, it has been proven to be actually useful in real-world applications.

The idea of this package is to bring this power over to the Haskell language while exploring the design space to identify and define stronger laws (if possible).

Originally, this idea has been described in this blogpost.

ParDual

Here's the definition of the same typeclass in Haskell:

class (Monad m, Applicative f) => ParDual f m | m -> f, f -> m where
  parallel :: forall a . m a -> f a
  sequential :: forall a . f a -> m a

I decided to call it ParDual instead of Parallel, because this duality doesn't always define a sequential and parallel relationship. Such is the case between [] and ZipList, as we will soon discover.

It defines two functions, which are natural transformations between a Monad m and an Applicative f. It could also be seen as a typeclass version of an isomorphism such as forall a . Iso (f a) (m a).

The most common and useful relationships are both Either / Validation and IO / Concurrently.

instance Semigroup e => ParDual (Validation e) (Either e) where
  parallel   = fromEither
  sequential = toEither

instance ParDual Concurrently IO where
  parallel   = Concurrently
  sequential = runConcurrently

Validation comes from the validators package, whereas Concurrently comes from the async package.

Additionally, we can define a lot of powerful functions solely in terms of Applicative, Monad, and ParDual. A few other functions might require extra requirements, such as Traversable.

parMapN

The parMapN set of functions are analogue to combining <$> and <*>, for any dual Applicative.

parMap2
  :: (Applicative f, Monad m, ParDual f m)
  => m a0
  -> m a1
  -> (a0 -> a1 -> a)
  -> m a

In this case, parMap2 takes only two computations and a function, but you can find other versions up to parMap6. If there is demand, we can consider abstracting over its arity, in order to compose an arbitrary number of computations.

For example, if we define a Person datatype with two fields:

type Name = Refined NonEmpty String
type Age = Refined (GreaterThan 17) Int

data Person = Person
  { personAge :: Age
  , personName :: Name
  } deriving Show

We can then validate different inputs, while accumulating errors on the left side, even when our type is Either [String] Person.

mkPerson :: Int -> String -> Either [String] Person
mkPerson a n = parMap2 (ref a) (ref n) Person

Where ref is a generic function that converts RefineExceptions to [String]:

ref :: Predicate p x => x -> Either [String] (Refined p x)
ref x = left (\e -> [show e]) (refine x)

In case of two invalid inputs, we will get as a result a list of validation errors:

mkPerson 10 "" == Left ["error 1", "error 2"]

If parMapN didn't exist, we could do the same by manually converting between Either and Validation (which is exactly what parMapN does via the ParDual class).

mkPerson :: Int -> String -> Either [String] Person
mkPerson a n = toEither $ Person <$> fromEither (ref a) <*> fromEither (ref n)

Though, we can see how cumbersome and boilerplatey it gets.

parTraverse

Another great application of the ParDual class is the definition of a traverse function that takes a Monad and a Traversable t, but that operates over its dual Applicative, and at the end it converts back to this Monad.

parTraverse
  :: (Traversable t, Applicative f, Monad m, ParDual f m)
  => (a -> m b)
  -> t a
  -> m (t b)

The type signature is exactly the same as traverse, except the constraints are different.

We can appreciate its usability by looking at some examples. Here's one with Either:

f :: Int -> Either [String] Int
f n = Left [show n]

traverse f [1..5] == Left ["1"]
parTraverse f [1..5] == Left ["1","2","3","4","5"]

Below there is another one with IO:

randomDelay :: IO ()
randomDelay = do
  r <- randomRIO (1, 10)
  threadDelay (r * 500000)

traverseIO :: IO ()
traverseIO = traverse_ (\n -> randomDelay >> print n) [1 .. 10]

parTraverseIO :: IO ()
parTraverseIO = parTraverse_ (\n -> randomDelay >> print n) [1 .. 10]

The traverse version prints out numbers from 1 to 10 in sequence, while waiting for every random delay. So the output is pretty much 1 2 3 4 5 6 7 8 9 10.

The parTraverse version has a non-deterministic output, since it goes through Concurrently (IO's dual). It is exactly what you would expect while using mapConcurrently. One possible output is 5 10 6 1 3 2 9 4 7 8.

ZipList

The dual Applicative instance of [] is the one defined by ZipList, which doesn't have anything to do with parallelism.

instance ParDual ZipList [] where
  parallel   = ZipList
  sequential = getZipList

Let's have a look at the examples shown below.

((+) <$> [1..5] <*> [6..10]) == [7,8,9,10,11,8,9,10,11,12,9,10,11,12,13,10,11,12,13,14,11,12,13,14,15]

parMap2 [1..5] [6..10] (+) == [7,9,11,13,15]

The standard version iterates over both lists "sequentially". That is, it iterates over the first one, and then over the second one, returning the cartesian product of both lists.

Conversely, ZipLists only return the sum of the current elements of both lists such as 1 + 6, 2 + 7, and so on. It iterates over both lists in "parallel", effectively traversing both at once.

parBitraverse

Operates over any Bitraversable such as Either or (,,).

res1 = [("ba",'2','T'),("ba",'2','r'),("ba",'2','u'),("ba",'2','e'),("ba",'4','T'),("ba",'4','r'),("ba",'4','u'),("ba",'4','e')]

(bitraverse show show ("ba", 24, True)) == res1

The standard bitraverse for (String, Int, Bool) traverses over the second value and then over the third value, combining the results on each iteration.

(parBitraverse show show ("ba", 24, True)) == [("ba",'2','T'),("ba",'4','r')]

The dual variant traverses all the values at the same time, terminating as soon as either value is empty.

Test suite

The test suite property-checks the functions defined in ParDual applied to different types of values.

$ nix-shell --run 'cabal new-run par-dual-tests'
━━━ Main ━━━
  ✓ prop_parMap2_on_success passed 100 tests.
  ✓ prop_parMap2_accumulates_errors passed 100 tests.
  ✓ prop_parTraverse_accumulates_errors passed 100 tests.
  ✓ prop_parTraverse_io_is_concurrent passed 10 tests.
  ✓ prop_parMap2_on_lists passed 100 tests.
  ✓ prop_parBitraverse passed 100 tests.
  ✓ 6 succeeded.

Publishing

Generating documentation and tarball file to upload.

$ cabal new-haddock --haddock-for-hackage --enable-doc
$ cabal upload -d dist-newstyle/par-dual-0.1.0.0-docs.tar.gz
$ cabal new-sdist