This file contains a tutorial for using this package. The first part explains
through examples how to do logic programming in Haskell using `MonadPlus`. The
second part explains how to take a logic program in the form of a `Tree` (which
is an instance of `MonadPlus`) and use the infrastructure in this package to run
it in parallel.


Logic programming
=================

In this part we shall show how to write logic programs through the use of three
example problems:  generating ordered pairs of integers, finding valid map
colorings, and finding ways to place n queens on an n x n chess board.


Ordered pairs of integers
-------------------------

Logic programming in Haskell is about making choices and applying constraints.
A simple example is the following:


```haskell
pairs :: MonadPlus m => Int -> Int -> m (Int,Int)
pairs max_x max_y = do
    x <- between 1 max_x
    y <- between 1 max_y
    guard $ x < y
    return (x,y)
```

This program generates all pairs of integers within the given ranges such that
the first element of the pair is less than the second element of the pair. The
first line in the body of `pairs`,

```haskell
x <- between 1 max_x
```

makes a non-deterministic choice for `x` that is between `1` and `max_x`
(inclusive), and likewise for `y`; the `between` function is part of the
`LogicGrowsOnTrees` module. The third line in the function body is a guard that
succeeds if `x < y` and fails otherwise; failure results in backtracking to try
another choice of `x` and/or `y`.

`pairs` returns a value that can be an instance of an arbitrary type --- i.e.,
one the caller can choose --- so long as it is an instance of `MonadPlus`. For
example, if let you let `m` be `Maybe` then `pairs` will return either nothing
if no choices of `x` and `y` satisfy the guard and otherwise it will return a
`Just` value with the first found solution. If you let `m` be the List type then
the function will return the list of all solutions.

This function illustrates the basic functionality but it is not a good example
of how you would actually generate such pairs; a better implementation is given
by:

```haskell
pairs max_x max_y = do
    x <- between 1 max_x
    y <- between (x+1) max_y
    return (x,y)
```

This is more efficient because it restricts the choice of `y` to only those
values that will satisfy the constraint. This is actually an important
optimization that one should always try to make when working on non-trivial
problems: when possible, one should enforce a constraint by reducing the set of
available choices to those that meet the constraint rather than by generating a
larger set of choices and then applying a filter to eliminate those that don't
meet the constraint.


Map coloring
------------

For our next example, we consider the problem of coloring a map. That is, we are
given a list of countries, a relation that tells us which are adjacent, and a
list of colors, and our goal is to find a way to choose a color for each country
such that no two adjacent countries have the same color. To keep things simple,
we will assume that the colors are numbered from `1` to `number_of_colors` and
the countries are numbered from `1` to `number_of_countries`. A function that
generates solutions to this problem is as follows:

```haskell
coloringSolutions :: MonadPlus m => Word -> Word -> (Word -> Word -> Bool) -> m [(Word,Word)]
coloringSolutions number_of_colors number_of_countries isAdjacentTo =
    foldM addCountryToColoring [] [1..number_of_countries]
  where
    addCountryToColoring coloring country = do
        color <- between 1 number_of_colors
        forM_ coloring $ \(other_country, other_color) ->
            when (country `isAdjacentTo` other_country) $
                guard (color /= other_color)
        return $ (country,color):coloring
```

This function works by calling `foldM` (in `Control.Monad`) which in turn calls
`addCountryToColoring` once for each country (i.e., it *folds* over the list
`[1..number_of_countries]`), carrying along the current coloring. The function
`addCountryToColoring` does the following:

1. First, it makes a non-deterministic choice for the color of the country:

    ```haskell
    color <- between 1 number_of_colors
    ```

2. Second, it checks that all other countries that have been colored and which
   are adjacent to the current country are a different color:

    ```haskell
    forM_ coloring $ \(other_country, other_color) ->
        when (country `isAdjacentTo` other_country) $
            guard (color /= other_color)
    ```

    The first line of this snippet loops over the current coloring. The second
    line checks to see whether the current country (in the loop) is adjacent to
    the country we just colored, and if so then the third line checks that the
    two adjacent countries have different colors and fails if this is not the
    case.

3. Finally, it adds this country's color to the coloring, and returns the
   updated coloring (which will then be passed to `addCountryToColoring` at the
   next call, if any).

    ```haskell
    return $ (country,color):coloring
    ```

A major inefficiency in the code above is that a large number of the solutions
generated are equivalent in the sense that they only differ by a permutation of
the colors and the selection of the colors used (when this is less than the
total number of colors). In particular, if all `n` colors are used in a given
solution then there are `n!` equivalent solutions, if `n-1` of the `n` colors
are used then there are `n!/2` equivalent solutions, etc; in general if `m`
colors are used out of `n` for a given solution then there are `n!/(n-m)!`
equivalent solutions.

The solution to this is a trick I like to call "symmetry breaking", where you
take a symmetry (in this case, the fact that permuting the colors and/or
changing the choice of colors results in an equivalent solution) and factor it
out by forcing a particular ordering. The following code does this:

```haskell
coloringUniqueSolutions number_of_colors number_of_countries isAdjacentTo =
    liftM snd $ foldM addCountryToColoring (0,[]) [1..number_of_countries]
  where
    addCountryToColoring (number_of_colors_used,coloring) country = do
        color <- between 1 ((number_of_colors_used + 1) `min` number_of_colors)
        forM_ coloring $ \(other_country, other_color) ->
            when (country `isAdjacentTo` other_country) $
                guard (color /= other_color)
        return (number_of_colors_used `max` color,(country,color):coloring)
```

The above is a modified version of `coloringSolutions` where we *force* the
first color chosen to be 1, the second color chosen (if not the same as the
first) to be 2, and so on. Specifically, we keep track of the number of colors
used so far, and when the next color is chosen we restrict ourselves to these
colors plus the next greatest color if we have not already used all the
available colors; if our choice involves a new color then we bump up the number
of colors used for the next call to `addCountryToColoring`, otherwise we use the
same set of colors.


N-queens problem
----------------

Our final example is the n-queens problem, which is the problem of placing n
queens on an n x n board such that none of the queens conflict;  recall that in
chess two queens conflict if they share the same row, column, or diagonal.  A
function that generates solutions to this problem is as follows:

```haskell
import qualified Data.IntSet as IntSet
...
nqueensUsingSetsSolutions :: MonadPlus m => Word -> m [(Word,Word)]
nqueensUsingSetsSolutions n =
    go (fromIntegral n)
       0
       (IntSet.fromDistinctAscList [0..fromIntegral n-1])
       IntSet.empty
       IntSet.empty
       []
  where
    go 0 _ _ _ _ !value = return . reverse $ value
    go !n
       !row
       !available_columns
       !occupied_negative_diagonals
       !occupied_positive_diagonals
       !value
     = do
        column <- allFrom $ IntSet.toList available_columns
        let negative_diagonal = row + column
        guard $ IntSet.notMember negative_diagonal occupied_negative_diagonals
        let positive_diagonal = row - column
        guard $ IntSet.notMember positive_diagonal occupied_positive_diagonals
        go (n-1)
           (row+1)
           (IntSet.delete column available_columns)
           (IntSet.insert negative_diagonal occupied_negative_diagonals)
           (IntSet.insert positive_diagonal occupied_positive_diagonals)
           ((fromIntegral row,fromIntegral column):value)
```

(The use of `fromIntegral` comes from the fact that the input board size and
output board positions are naturally `Word`s as they cannot be negative but the
`IntSet` type only stores `Int`s, which means that we need to work internally
using `Int` and use `fromIntegral` to convert from the input `Word` and to the
output `Word`s.)

The function `go` is where most of the work happens.  For each row in the board,
it does the following:

1. First, it makes a non-deterministic choice from the available columns;  here
   we use the function `allFrom` (part of the `LogicGrowsOnTrees` module) to
   convert the list of available columns to a `MonadPlus` that generates it:

    ```haskell
    column <- allFrom $ IntSet.toList available_columns
    ```

2. Next, it checks if this choice of column conflicts with the positive and
   negative diagonals, and if so it backtracks and tries a different column:

    ```haskell
    let negative_diagonal = row + column
    guard $ IntSet.notMember negative_diagonal occupied_negative_diagonals
    let positive_diagonal = row - column
    guard $ IntSet.notMember positive_diagonal occupied_positive_diagonals
    ```

3. Finally, it recursively calls `go` for the next row with updated values for
   the updated available columns and occupied diagonals, as well as the chosen
   board position added to the (partial) solution:

    ```haskell
    go (n-1)
       (row+1)
       (IntSet.delete column available_columns)
       (IntSet.insert negative_diagonal occupied_negative_diagonals)
       (IntSet.insert positive_diagonal occupied_positive_diagonals)
       ((fromIntegral row,fromIntegral column):value)
    ```

When we are done, we reverse the solution as it currently has the last row
first and the first row last.

While `IntSet` is very efficient, it is even more efficient to use a bit field
for a set --- i.e., a field such that bits that are 1 correspond to being
occupied and those that are 0 correspond to being available. A solution using
this approach is as follows:

```haskell
nqueensUsingBitsSolutions n =
    go n 0 (0::Word64) (0::Word64) (0::Word64) []
  where
    go 0 _ _ _ _ !value = return . reverse $ value
    go !n
       !row
       !occupied_columns
       !occupied_negative_diagonals
       !occupied_positive_diagonals
       !value
     = do
        column <- allFrom . goGetOpenings 0 $
            occupied_columns .|. 
            occupied_negative_diagonals .|.
            occupied_positive_diagonals
        let column_bit = bit (fromIntegral column)
        go (n-1)
           (row+1)
           (occupied_columns .|. column_bit)
           ((occupied_negative_diagonals .|. column_bit) `shiftR` 1)
           ((occupied_positive_diagonals .|. column_bit) `shiftL` 1)
           ((row,column):value)

    goGetOpenings column bits
      | column >= n     = []
      | bits .&. 1 == 0 = column:next
      | otherwise       = next
      where
        next = goGetOpenings (column + 1) (bits `shiftR` 1)
```

Now the function `go` does the following:

1. It makes a non-deterministic choice within the current row for the column,
   excluding those spaces which are either occupied columns or diagonals:

    ```haskell
    column <- allFrom . goGetOpenings 0 $
        occupied_columns .|.
        occupied_negative_diagonals .|.
        occupied_positive_diagonals
    let column_bit = bit (fromIntegral column)
    ```

2. It marks the column and diagonals as being occupied, and also shifts the
   occupied diagonals so that they correspond with the columns in the next row.
   That is, if a given positive diagonal intersects with column `i` for a given
   row then it intersects with column `i+1` in the next row, and if a given
   negative diagonal intersects with column `i` for a given row then it
   intersects with `i-1` in the next row. Finally, it adds the board position to
   the partial solution:

    ```haskell
    go (n-1)
       (row+1)
       (occupied_columns .|. column_bit)
       ((occupied_negative_diagonals .|. column_bit) `shiftR` 1)
       ((occupied_positive_diagonals .|. column_bit) `shiftL` 1)
       ((row,column):value)
   ```

The new nested function `getOpenings` scans through the input bits and builds a
list of columns where a queen may be placed without conflict.  If there are no
such columns, then the list will be empty and the code will backtrack.

It is possible to use symmetry breaking gain a significant speed-up, though the
solution I came up with that does this ended up being quite complicated.  You
can see it for yourself in the `LogicGrowsOnTrees.Examples.Queens.Advanced`
module.


Using LogicGrowsOnTrees
=======================

In the preceding part, we gave some examples of how to write a logic program
using `MonadPlus`. In this part we will talk about how to use this package to
write such programs, and in particular how to run them in parallel. First we
will show how to use this package to run a logic program serially, which is done
for two reasons: first, to introduce the functionality of this package in a
simpler setting, and second, because it is useful when one is testing a program.
Next we will show how to run a logic program in parallel using the
`LogicGrowsOnTrees.Parallel.Adapter.Threads` module. Finally, we will show how
to run a logic program in parallel using the `LogicGrowsOnTrees.Parallel.Main`
module, which provides a universal interface to all adapters as well as
automating the work required to specify a command line interface.

Note that the examples in the preceding part have all been implemented in the
`LogicGrowsOnTrees.Examples.*` modules, so in the examples to follow we will
reference them by importing them for their corresponding module rather than by
copying and pasting their source code.

Serial
------

The central data structure in `LogicGrowsOnTrees` is the `Tree` type, which, as
the name suggests, is a tree data structure where each branch corresponds to a
binary choice made in a logic program by the `mplus` function, which takes two
trees and returns a new tree whose root is a choice between the two arguments.
We have not been using this function directly because we have wanted to make
choices from a list of elements, and so we used the `allFrom` function to
essentially convert this list to a tree using `mplus`.

Because `Tree` is an instance of `MonadPlus`, all of the logic programs we have
written are automatically available in `Tree` form. Thus, to run a logic program
in serial, you can use the `exploreTree*` family of functions in
`LogicGrowsOnTree`.

The `exploreTree` function explores the entire tree and sums over all the
results; for this reason, the result type of the tree needs to be a `Monoid`.
This means that in particular if you, say, want to get a list of all solutions,
then your logic program needs to put each result in a list singleton so that the
sum builds up the list of all solutions. The following program is an example of
applying this to prints a list of all the n-queens solutions for n = 5 (also
given in `tutorial/tutorial-1.hs`):

```haskell
import LogicGrowsOnTrees (exploreTree)
import LogicGrowsOnTrees.Examples.Queens (nqueensUsingBitsSolutions)

main = print . exploreTree . fmap (:[]) . nqueensUsingBitsSolutions $ 5
```

Note the use of `fmap (:[])` to replace every result generated by the logic
program with a singleton list containing the result. When there are a lot of
results it is better to use the `Seq` type in `Data.Sequence` as it has
(amortized) asymptotically faster concatenation operations; this is done in the
following (also given in `tutorial/tutorial-2.hs`):

```haskell
import qualified Data.Sequence as Seq
import LogicGrowsOnTrees (exploreTree)
import LogicGrowsOnTrees.Examples.Queens (nqueensUsingBitsSolutions)

main = print . exploreTree . fmap Seq.singleton . nqueensUsingBitsSolutions $ 5
```

Alternatively, if you are only interested in the *number* of solutions rather
than what they are, then you should replace every solution with `WordSum 1`,
where `WordSum` is a `Monoid` (included as part of this package in the module
`LogicGrowsTrees.Utils.WordSum`) with the property that the sum of two
`WordSum`s is a `WordSum` containing the sum of the two contained values; this
is done in the following (also given in `tutorial/tutorial-3.hs`):

```haskell
import LogicGrowsOnTrees (exploreTree)
import LogicGrowsOnTrees.Utils.WordSum (WordSum(..))
import LogicGrowsOnTrees.Examples.Queens (nqueensUsingBitsSolutions)

main = print . exploreTree . fmap (const $ WordSum 1) . nqueensUsingBitsSolutions $ 5
```

Note that the only change is that we replaced `fmap Seq.singleton` with
`fmap (const $ WordSum 1)`.

If you only want the first result then you should use `exploreTreeUntilFirst`,
which return the first result found wrapped in `Just` if any results are
present, and `Nothing` if no results were found; for example, see the following
(also given in `tutorial/tutorial-4.hs`):

```haskell
import LogicGrowsOnTrees (exploreTreeUntilFirst)
import LogicGrowsOnTrees.Examples.Queens (nqueensUsingBitsSolutions)

main = print . exploreTreeUntilFirst . nqueensUsingBitsSolutions $ 10
```

Finally, if you only want a few of the results, then use
`exploreTreeUntilFound`, which takes a condition function and will stop finding
new results when it is met; the result is a pair where the first component
contains the results that were found and the second contains a `Bool` indicating
whether the condition was met. Note that the returned results might be less than
those requested if there weren't enough found to meet your condition function,
and it also might be *more* than those requested because results are not found
one at a time but rather are merged from the bottom up, meaning that there might
be a choice point where the two branches separately did not meet the condition
but their merged results did. An example of using this to find at least three
results is illustrated as follows (also given in `tutorial/tutorial-5.hs`):

```haskell
import LogicGrowsOnTrees (exploreTreeUntilFound)
import LogicGrowsOnTrees.Examples.Queens (nqueensUsingBitsSolutions)

main =
    print
    .
    exploreTreeUntilFound ((>= 3) . length)
    .
    fmap (:[])
    .
    nqueensUsingBitsSolutions
    $
    10
```

The above will print a pair where the first component has *five* solutions and
the second component is true. Note the condition function `((>= 3) . length)`
which computes the length of the list and checks whether it is at least three.

NOTE: If for some reason you really don't want more than, say, three solutions
--- perhaps because the solutions are very large and you never want to keep
around more than three --- then your best bet is to create a custom `Monoid`
type that, say, contains a list and never allows concatenation to let it grow
bigger than three elements.


Parallelization using the Threads module
----------------------------------------

The `LogicGrowsOnTrees.Parallel.Adapter.Threads` module provides functions that
let you run a logic program in parallel using multiple threads. You have a
couple of options for how to do this this.

First, you can use one of the many specialized functions which roughly follow
the same pattern as the `exploreTree*` functions in `LogicGrowsOnTrees`, such as
the following (also given in `tutorial/tutorial-6.hs`):

```haskell
import GHC.Conc (setNumCapabilities)

import LogicGrowsOnTrees.Parallel.Adapter.Threads
    (RunOutcome(..)
    ,TerminationReason(..)
    ,exploreTree
    ,setNumberOfWorkers
    )
import LogicGrowsOnTrees.Utils.WordSum (WordSum(..))
import LogicGrowsOnTrees.Examples.Queens (nqueensUsingBitsSolutions)

main = do
    setNumCapabilities 2
    RunOutcome statistics termination_reason <-
        exploreTree (setNumberOfWorkers 2)
        .
        fmap (const $ WordSum 1)
        .
        nqueensUsingBitsSolutions
        $
        10
    case termination_reason of
        Aborted progress -> putStrLn "Count aborted."
        Completed (WordSum count) -> putStrLn $ "Found " ++ show count ++ " solutions."
        Failure progress message -> putStrLn $ "Failed: " ++ message
```

First, observe that `exploreTree` now has an additional argument,
`setNumberOfWorkers 2`. This argument is called the *controller*, and is a loop
that is run that lets you issue commands to the supervisor such as aborting,
requesting a progress update, and changing the number of workers (which can be
done at any time in the run and can even bring the number down to zero).
`setNumberOfWorkers` is a function that changes the number of workers to be
equal to its argument, spawning or killing workers as necessary.

In order for the two worker threads to run in parallel, two things need to
happen. First, you need to compile with the `-threaded` option, and second, you
need to set the number of capabilities to two so that up to two threads can run in
parallel, as is done in the first line of the body of `main`:

```haskell
setNumCapabilities 2
```

(Alternatively, you could also use the `+RTS -N#` command-line option to set the
number of capabilities to `#`.)

Now observe that in the first line of the body of the main function, we have

```haskell
RunOutcome statistics termination_reason <-
```

Now that we are running many workers in parallel, the result of the exploration
is a bit more complicated. The result type is a `RunOutcome`, which contains the
run statistics and the termination reason. The run statistics contain a lot of
information whose primary purpose is to help one diagnose why one is not getting
the appropriate speedup as the number of workers increases (should this happen).
The termination reason contains information about why the run terminated. As you
can see at the end of the code, there are three possibilities:

```haskell
case termination_reason of
    Aborted progress -> putStrLn "Count aborted."
    Completed (WordSum count) -> putStrLn $ "Found " ++ show count ++ " solutions."
    Failure progress message -> putStrLn $ "Failed: " ++ message
```

First, we have `Aborted`, which means that a request was made to abort the run;
it contains the `progress` that had been made up to that point, which can be
used to resume the run at the same point later.

Second, we have `Completed`, which means that the run terminated normally; it
contains the final result in the run, which in this case is a `Word` wrapped in
a `WordSum`.

Finally, we have `Failure`, which indicates that something went horribly wrong
during the run, such as an exception being thrown; it contains both the
`progress` that had been made up to that point and also a `message` that
describes what happened. If your logic program is pure, then this most likely
means that there is a bug somewhere in your program. (If it is not pure, which
we have not covered in this tutorial, a `Failure` might just mean that, say, an
external resource that is needed by the program was not available.)

In the case of both `Aborted` and `Failure`, there is an argument which
represents the current progress of the exploration, which you can use as the
starting point by calling the `exploreTreeStartingFrom` function.

There is an important caveat, however: it only makes sense to resume an
exploration using a checkpoint *if you have not changed the program*, because in
general if you change the program, then you change the tree, which means that
the checkpoint is no longer a valid; in particular, if the explored part of the
tree changes, then in general your current sum over results will no longer
correct, and if the shape of the tree changes, then in general the checkpoint
will not line up with it and will raise an error --- in fact, if you make a
mistake and change the parts of the tree that have been explored and resume from
a checkpoint anyway then you should *hope* that an error is raised as the
alternative is for your results to be silently corrupted!

Because of this, it will rarely make sense to resume from a `Failure` if your
program is pure, because an exception will almost always signal the presence of
a bug. The main reason for including the `progress` with the `Failure` is
because, although we have not discussed this, it is possible to write logic
programs that run in the I/O monad and require access to, say, a database
server; if the database server goes down, it makes perfect sense to restart it
and resume the run from the last `progress`.

In the following code, we show a (somewhat contrived) example of resuming after
aborting, as well as of a non-trivial controller (also given in
`tutorial/tutorial-7.hs`):

```haskell
import Control.Monad.IO.Class (liftIO)
import Data.Monoid (mempty)
import GHC.Conc (setNumCapabilities)
import System.Exit (exitFailure,exitSuccess)

import LogicGrowsOnTrees.Checkpoint (Progress(..))
import LogicGrowsOnTrees.Parallel.Adapter.Threads
    (RunOutcome(..)
    ,TerminationReason(..)
    ,abort
    ,exploreTreeStartingFrom
    ,requestProgressUpdate
    ,setNumberOfWorkers
    )
import LogicGrowsOnTrees.Utils.WordSum (WordSum(..))
import LogicGrowsOnTrees.Examples.Queens (nqueensUsingBitsSolutions)

main = setNumCapabilities 2 >> go mempty
 where
  go progress@(Progress _ (WordSum count)) = do
    putStrLn $ "Counting... (starting with " ++ show count ++ " solutions); press <Enter> to abort"
    RunOutcome statistics termination_reason <-
        exploreTreeStartingFrom
            progress
            (do setNumberOfWorkers 2
                _ <- liftIO $ getLine
                _ <- requestProgressUpdate
                abort
            )
        .
        fmap (const $ WordSum 1)
        .
        nqueensUsingBitsSolutions
        $
        14
    case termination_reason of
        Aborted progress -> do
            putStrLn $ "Count aborted; will try again."
            go progress
        Completed (WordSum count) -> do
            putStrLn $ "Found " ++ show count ++ " solutions."
            exitSuccess
        Failure _ message -> do
            putStrLn $ "Failed: " ++ message
            exitFailure
```

(Note:  If this code takes too much or too little time to finish on your
computer then you can adjust the problem size, which is currently set to 14.)

This code features a number of differences from the previous example.  First we
note that the controller is non-trivial:

```haskell
(do setNumberOfWorkers 2
    _ <- liftIO $ getLine
    _ <- requestProgressUpdate
    abort
)
```

Whereas previously the controller just set the number of workers and quit, it
now instead waits for the user to press enter, and if the user does so, then the
controller tells the supervisor that it should perform a progress update ---
that is, that it should contact all the workers and fetch their current results
and checkpoints --- and then finally the controller tells the supervisor to
abort.

Next, note that the `main` function is a loop:

```haskell
main = setNumCapabilities 2 >> go mempty
 where
  go progress@(Progress _ (WordSum count)) = do
    ...
    case termination_reason of
        Aborted progress -> do
            putStrLn $ "Count aborted; will try again."
            go progress
        Completed (WordSum count) -> do
            putStrLn $ "Found " ++ show count ++ " solutions."
            exitSuccess
        Failure _ message -> do
            putStrLn $ "Failed: " ++ message
            exitFailure
```

Specifically, as long as the user keeps aborting by pressing enter, the code
will loop and immediately resume starting from its progress at the time the run
was aborted.  If the run finishes by either terminating successfully or with a
failure, then the program exits.  You don't have to worry about making sure that
the controller terminates because if the run terminates for any reason then it
kills the controller thread rather than leaving it hanging.

Obviously this is not a particularly useful controller, although it does
demonstrate that the run can be arbitrarily aborted and restarted from a
checkpoint without suffering any problems. For an example of a more useful
controller, see the following (also given in `tutorial/tutorial-8.hs`):

```haskell
import Control.Monad (forever)
import Control.Monad.IO.Class (liftIO)
import System.IO (hFlush,stdout)

import LogicGrowsOnTrees.Parallel.Adapter.Threads
    (RunOutcome(..)
    ,TerminationReason(..)
    ,exploreTree
    ,setNumberOfWorkers
    )
import LogicGrowsOnTrees.Utils.WordSum (WordSum(..))
import LogicGrowsOnTrees.Examples.Queens (nqueensUsingBitsSolutions)

main = do
    RunOutcome _ termination_reason <-
        exploreTree (forever $
            liftIO (do
                putStr "Enter the desired number of workers: "
                hFlush stdout
                readLn
            )
            >>=
            setNumberOfWorkers
        )
        .
        fmap (const $ WordSum 1)
        .
        nqueensUsingBitsSolutions
        $
        14
    case termination_reason of
        Aborted progress -> putStrLn "Count aborted."
        Completed (WordSum count) -> putStrLn $ "Found " ++ show count ++ " solutions."
        Failure _ message -> putStrLn $ "Failed: " ++ message
```

Now the controller continually polls the user for the desired number of workers,
and then changes the number of workers to be equal to it, which, for example,
you could use to adjust the number of processors being used by the run to be
larger or smaller depending on how many processors you want to use for other
tasks at that moment.

(Unfortunately, calling `setNumCapabilities` many times in succession can
destabilize the GHC runtime, so for this example you will need to use `+RTS -N#`
on the command-line to set the number of capabilities to be equal to the largest
number of workers that you will want to run in parallel; it is worth mentioning
that the `Processes` adapter, provided in the `LogicGrowsOnTrees-procesess`
package, also features a controller that can change the number of workers, but
does not suffer from this problem because it creates and destroyers workers by
spawning and destroying single-threaded processes rather than telling the GHC
threaded runtime to grow and shrink the number of capabilities.)

As in the serial case, there are multiple modes in which an exploration can be
run.  The following code only looks for the first result (also given in
`tutorials/tutorial-9.hs`):

```haskell
import GHC.Conc (setNumCapabilities)

import LogicGrowsOnTrees.Parallel.Adapter.Threads
    (RunOutcome(..)
    ,TerminationReason(..)
    ,exploreTreeUntilFirst
    ,setNumberOfWorkers
    )
import LogicGrowsOnTrees.Checkpoint (Progress(..))
import LogicGrowsOnTrees.Examples.Queens (nqueensUsingBitsSolutions)

main = do
    setNumCapabilities 2
    RunOutcome statistics termination_reason <-
        exploreTreeUntilFirst (setNumberOfWorkers 2)
        .
        nqueensUsingBitsSolutions
        $
        10
    case termination_reason of
        Aborted _ -> putStrLn "Search aborted."
        Completed Nothing -> putStrLn "No result found."
        Completed (Just (Progress checkpoint result)) -> putStrLn $ "Found " ++ show result
        Failure _ message -> putStrLn $ "Failed: " ++ message
```

Note how now `Completed` contains a `Maybe` value:

```haskell
Completed Nothing -> putStrLn "No solution found."
Completed (Just (Progress checkpoint result)) -> putStrLn $ "Found " ++ show result
```

If the run finds no solution, then it returns `Nothing`. If it does return a
solution, then it returns a `Progress` value, which contains not only the
solution but also the `checkpoint`; the reason for returning the `checkpoint` is
that it allows you to resume from it if you decide that you want to find more
solutions at some point in the future.

As in the serial case, you can also request that only some of the results be
found, as in the following code which looks for at least five solutions (also
given in `tutorials/tutorial-10.hs`):

```haskell
import GHC.Conc (setNumCapabilities)

import LogicGrowsOnTrees.Parallel.Adapter.Threads
    (RunOutcome(..)
    ,TerminationReason(..)
    ,exploreTreeUntilFoundUsingPush
    ,setNumberOfWorkers
    )
import LogicGrowsOnTrees.Checkpoint (Progress(..))
import LogicGrowsOnTrees.Examples.Queens (nqueensUsingBitsSolutions)

main = do
    setNumCapabilities 2
    RunOutcome statistics termination_reason <-
        exploreTreeUntilFoundUsingPush
            ((>= 5) . length)
            (setNumberOfWorkers 2)
        .
        fmap (:[])
        .
        nqueensUsingBitsSolutions
        $
        10
    case termination_reason of
        Aborted _ -> putStrLn "Search aborted."
        Completed (Left results) -> putStrLn $ "Only found:" ++ show results
        Completed (Right (Progress checkpoint results)) -> putStrLn $ "Found: " ++ show results
        Failure _ message -> putStrLn $ "Failed: " ++ message
```

Now the result in `Completed` is `Left results` if the run ended before the
condition function was satisfed and `Right (Progress checkpoint results)`
otherwise, where again the `checkpoint` allows you to resume the search at some
point in the future if you wish.

There is also an `exploreTreeUntilFoundUsingPull` function, which is similar to
this function except that it gathers the results in a different way. The
difference between them is that `exploreTreeUntilFoundUsingPull` sums results
locally on each worker until either a progress update is requested or the
condition is satisfied, whereas `exploreTreeUntilFoundUsingPush` pushes each
result to the supervisor immediately as soon as it is found. If you are only
looking for a few results then `exploreTreeUntilFoundUsingPush` is preferable
because the whole system will know right away when the desired results have been
found. If you are looking for a large number of results then the overhead of
sending each result to the supervisor may add up to the point where it is better
to sum locally and only send results to the supervisor periodically; note that
if you take the latter approach then it is your responsibility to have the
controller periodically request progress updates. (Note that the
`LogicGrowsOnTrees.Parallel.Common.RequestQueue` module has a `fork` function
that you can use to spawn another controller thread if this would make your life
easier;  like the main controller thread, it will be killed when the run is
over.)


Parallelization using the Main framework
----------------------------------------

`Threads` is one of the *adapters* provided by `LogicGrowsOnTrees` and its
siblings. Each of these adapters provides a way of adapting the
supervisor/worker parallelization model to a particular means of running
computations in parallel. The current adapters are as follows:

* `Threads`

    This adapter provides parallelism by spawning multiple threads; the number
    of workers can be changed arbitrarily at runtime (though you need to make
    sure that the number of capabilities is also high enough for all of them to
    run in parallel). This adapter is the only one that requires the threaded
    runtime, which adds additional overhead.

* `Processes`

    This adapter provides parallelism by spawning a child process for each
    worker;  the number of workers can be changed arbitrarily at runtime.

    Install `LogicGrowsOnTrees-processes` to use this adapter.

* Network

    This adapter provides parallelism by allowing multiple workers to connect to
    a supervisor over a network; the number of workers is then equal to the
    number that are are connected to the supervisor. (It is possible for the
    same process to be both a supervisor and one or more workers.)

    Install `LogicGrowsOnTrees-network` to use this adapter.

* `MPI`

    This adapter provides parallelism using the Message Passing Interface (MPI),
    which is the standard communication system used in supercomputers, allowing
    you to use a very large number of nodes in your run. One of the nodes (#0)
    will act entirely as the supervisor, and the rest will act as workers.

    Install `LogicGrowsOnTrees-MPI` to use this adapter; note that you will need
    to have an MPI implementation installed (such as
    [OpenMPI](http://www.open-mpi.org/)).

All of these adapters provide low-level means of accessing their functionality
directly if you wish (though they are much more complicated to use than the
`exploreTree` functions in `Threads`), but there is also a universal high-level
interface that works for *all* of the adapters, which we will now discuss.

The `Main` module provides a framework that automates a lot of the work of
setting up and running an exploration in parallel, and the interface it provides
is completely agnostic about the adapter that is used; the `simpleMainFor*` and
`mainFor*` functions all take an argument which is the `driver` of the adapter
that you are using, and so switching to a different adapter is as simple as
switching the `driver` argument.

Here is a simple example of using the `Main` framework (also given in
`tutorial/tutorial-11.hs`):

```haskell
import LogicGrowsOnTrees.Parallel.Adapter.Threads (driver)
import LogicGrowsOnTrees.Parallel.Main (RunOutcome(..),TerminationReason(..),simpleMainForExploreTree)
import LogicGrowsOnTrees.Utils.WordSum (WordSum(..))

import LogicGrowsOnTrees.Examples.Queens (nqueensUsingBitsSolutions)

main =
    simpleMainForExploreTree
        driver
        (\RunOutcome _ termination_reason -> do
            case termination_reason of
                Aborted _ -> error "search aborted"
                Completed (WordSum count) -> putStrLn $ show count ++ " solutions were found"
                Failure _ message -> error $ "error: " ++ message
        )
        (fmap (const $ WordSum 1) (nqueensUsingBitsSolutions 10))
```

This program comes with an automatically generated help screen (via. `--help`),
and it already includes options to specify the location of the checkpoint file
(if it exists, then the run will be resumed from it), how often a checkpoint
should be written, at what level to print logging messages, and whether various
server statistics should be printed to the screen (possibly useful if your
computation is not scaling well). Because we are using the `Threads` driver,
there will also be a `-n` option to set the number of threads.

The reason why the function that processes the result of the run has be
specified as an argument rather than having the result be returned by
`simpleMainForExploreTree` is twofold: first, because in general the supervisor
and workers will be in separate processes and the result is only processed by
the supervisor, and second, because this allows `simpleMainForExploreTree` to be
sure that your code has successfully processes the result (as opposed to, say,
throwing an exception) before deleting the checkpoint file.

The following is a more complicated example that uses a more general function
in `Main` that lets one specify the board size using a command-line argument
(also given in `tutorial/tutorial-12.hs`):

```haskell
import System.Console.CmdTheLine (PosInfo(..),TermInfo(..),defTI,pos,posInfo,required)

import LogicGrowsOnTrees.Parallel.Adapter.Threads (driver)
import LogicGrowsOnTrees.Parallel.Main (RunOutcome(..),TerminationReason(..),mainForExploreTree)
import LogicGrowsOnTrees.Utils.WordSum (WordSum(..))

import LogicGrowsOnTrees.Examples.Queens (nqueensUsingBitsSolutions)

main =
    mainForExploreTree
        driver
        (required $
            pos 0
                (Nothing :: Maybe Int)
                posInfo
                  { posName = "BOARD_SIZE"
                  , posDoc = "the size of the board"
                  }
        )
        (defTI
            { termName = "tutorial-11"
            , termDoc = "count the number of n-queens solutions for a given board size"
            }
        )
        (\board_size (RunOutcome _ termination_reason) -> do
            case termination_reason of
                Aborted _ -> error "search aborted"
                Completed (WordSum count) -> putStrLn $ show count ++ " solutions found for board size " ++ show board_size
                Failure _ message -> error $ "error: " ++ message
        )
        (fmap (const $ WordSum 1) . nqueensUsingBitsSolutions . fromIntegral)
```

This program calls `mainForExploreTree` with the following arguments:

1. the `driver`, which in this case was imported from `Threads`

2.  a `Term` which specifies that our program takes a single required positional
    argument for the board size:

    ```haskell
    (required $
        pos 0
            (Nothing :: Maybe Int)
            posInfo
              { posName = "BOARD_SIZE"
              , posDoc = "the size of the board"
              }
    )
    ```

    Most of the functions above are part of
    [`cmdtheline`](http://hackage.haskell.org/package/cmdtheline), an
    applicative command-line parsing library. This library was used because it
    makes it easy to compose options together; your argument value here will
    essentially be merged in with the adapter options and some generic options
    (such as the checkpointing options).

    Specifically, `pos` here is a function that takes a position, a default
    value, and a `PosInfo` data structure that contains information about the
    name of the option and a brief description of it; the result is a value of
    type `Arg (Maybe Int)`. `required` then takes this term and maps it to a
    value of type `Term Int` with the property that an error is raised if this
    positional argument is not present.

3. a `TermInfo` which specifies the name and a short description of this
   program:
    
    ```haskell
            (defTI
                { termName = "tutorial-11"
                , termDoc = "count the number of n-queens solutions for a given board size"
                }
            )
    ```

4. an action to be executed with the final result:

    ```haskell
    (\board_size (RunOutcome _ termination_reason) -> do
        case termination_reason of
            Aborted _ -> error "search aborted"
            Completed (WordSum count) -> putStrLn $ show count ++ " solutions found for board size " ++ show board_size
            Failure _ message -> error $ "error: " ++ message
    )
    ```

    The first argument to this function is equal to the value supplied by the
    user for the first command line argument.

    NOTE: When `Completed`, any existing checkpoint file will be deleted after
    this function returns *unless* an exception is thrown, in which case it is
    kept around.

5. a function that constructs the logic program:

    ```haskell
    (fmap (const $ WordSum 1) . nqueensUsingBitsSolutions . fromIntegral)
    ```

    The argument to this function is equal to the value supplied by the user
    for the first command line argument.

Again, there is additional complexity in this interface because, as the
supervisor and workers will in general be in different processes, the
configuration information needs to be sent from the supervisor to the worker
processes so that they can locally construct the tree. The `driver` automates
the mechanism for this.

Finally, it is worth noting that all that it takes to use multiple processes
instead of multiple threads is to install `LogicGrowsOnTrees-processes` and then
replace `Threads` with `Processes` in the imports.

For a slightly more sophisticated example, consider the following (also given in
`tutorial/tutorial-13.hs`):

```haskell
import Control.Applicative (liftA2)
import System.Console.CmdTheLine (PosInfo(..),TermInfo(..),defTI,pos,posInfo,required)

import LogicGrowsOnTrees.Checkpoint (Progress(..))
import LogicGrowsOnTrees.Parallel.Adapter.Threads (driver)
import LogicGrowsOnTrees.Parallel.Main (RunOutcome(..),TerminationReason(..),mainForExploreTreeUntilFoundUsingPush)
import LogicGrowsOnTrees.Utils.WordSum (WordSum(..))

import LogicGrowsOnTrees.Examples.Queens (nqueensUsingBitsSolutions)

main =
    mainForExploreTreeUntilFoundUsingPush
        (\(board_size,number_to_find) -> (>= number_to_find) . length)
        driver
        (liftA2 (,)
            (required $
                pos 0
                    (Nothing :: Maybe Int)
                    posInfo
                      { posName = "BOARD_SIZE"
                      , posDoc = "the size of the board"
                      }
            )
            (required $
                pos 1
                    (Nothing :: Maybe Int)
                    posInfo
                      { posName = "#"
                      , posDoc = "the number of solutions to find"
                      }
            )
        )
        (defTI
            { termName = "tutorial-12"
            , termDoc = "find some of the solutions to the n-queens problem for a given board size"
            }
        )
        (\(board_size,number_to_find) (RunOutcome _ termination_reason) -> do
            case termination_reason of
                Aborted _ -> error "search aborted"
                Completed (Left found) -> do
                    putStrLn $ "For board size " ++ show board_size ++ ", only found " ++ show (length found) ++ "/" ++ show number_to_find ++ " solutions:"
                    mapM_ print found
                Completed (Right (Progress checkpoint found)) -> do
                    putStrLn $ "Found all " ++ show number_to_find ++ " requested solutions for board size " ++ show board_size ++ ":"
                    mapM_ print found
                Failure _ message -> error $ "error: " ++ message
        )
        (fmap (:[]) . nqueensUsingBitsSolutions . fromIntegral . fst)
```

This differs from the previous example in two main respects: first, it prints
out solutions rather than just their count, and second there is a second
argument to the program that specifies how many should be found.

To understand what is going on, let us first look at the third argument to
`mainForExploreTreeUntilFoundUsingPush`:

```haskell
(liftA2 (,)
    (required $
        pos 0
            (Nothing :: Maybe Int)
            posInfo
              { posName = "BOARD_SIZE"
              , posDoc = "the size of the board"
              }
    )
    (required $
        pos 1
            (Nothing :: Maybe Int)
            posInfo
              { posName = "#"
              , posDoc = "the number of solutions to find"
              }
    )
)
```

This argument essentially takes the argument from the previous example,
duplicates it to create a second argument (the number of solutions to find), and
then merges the two terms together in `Applicative` style via a call to
`liftA2`. After parsing is complete, the result will be a pair where the first
value is the board size and the second value is the number of solutions to find.

Now we look a the first argument to `mainForExploreTreeUntilFoundUsingPush`:

```haskell
(\(board_size,number_to_find) -> (>= number_to_find) . length)
```

This argument is a function that takes the configuration information and returns
a condition function that indicates where enough results have been accumulated.
In this case, the condition function checks whether the number of solutions
found (obtained via. `length`) is at least as many as were requested
(`number_to_find`).  Note that `board_size` is ignored and would probably
normally be replaced by `_` (or possibly by use of higher-order functions to
make the expression entirely point-free);  we include it here purely for
pedagogical reasons.

Next we look at the last argument of `mainForExploreTreeUntilFoundUsingPush`:

```haskell
(fmap (:[]) . nqueensUsingBitsSolutions . fromIntegral . fst)
```

This is just like in the previous example, save that now at the end there is
`fst`, which takes the first value in the configuration pair (which is the board
size) and instead of replacing each solution with a `WordSum`, it turns it into
a singleton list.

Finally, we look at the second-to-last argument:

```haskell
(\(board_size,number_to_find) (RunOutcome _ termination_reason) -> do
    case termination_reason of
        Aborted _ -> error "search aborted"
        Completed (Left found) -> do
            putStrLn $ "For board size " ++ show board_size ++ ", only found " ++ show (length found) ++ "/" ++ show number_to_find ++ " solutions:"
            mapM_ print found
        Completed (Right (Progress checkpoint found)) -> do
            putStrLn $ "Found all " ++ show number_to_find ++ " requested solutions for board size " ++ show board_size ++ ":"
            mapM_ print found
        Failure _ message -> error $ "error: " ++ message
)
```

The difference compared to the previous example is that there are two cases for
a `Completed` run.  In the first case, the run fully completed before it was
able to find all of the requested number of solutions;  the solutions it did
find are returned in a `Left` value.  In the second case, the run found all of
the requested solutions and then stopped;  the result is a `Progress` value
whose `checkpoint` value allows you to resume the search later to find more
solutions if you wish and whose result value is the requested solutions.


Conclusion
==========

At this point you have learned how to write logic programs and how to use this
package to write them in parallel. For more information, see
[USERS_GUIDE.md](USERS_GUIDE.md) for a more detailed discussion of some aspects
of this package as well as the package haddock documentation for reference.