Introduction ============ `LogicGrowsOnTrees` revolves around the `Tree` datatype, which is an instance of `MonadPlus` with the additional ability for one to cache values that are expensive to compute (as resuming from a checkpoint and parallelizing both can cause paths in the tree to be explored multiple times); see the documentation for `MonadExplorable` for more information on this. In the sections to follow we will present the various functionalities in this package for exploring trees. First we will discuss how to explore trees serially, and then we will discuss how to explore them in parallel; for the parallel case there are multiple options that will be discussed separately. Finally, we will discuss how to create new adapters. Serial exploration ================== If you want to explore a `Tree` serially (and without using checkpointing) then you can use the `exploreTree` family of functions in the `LogicGrowsOnTrees` module. There are 6 functions, which is the result of there being three different exploration modes, each of which in turn has with pure and impure variants. The default exploration mode, `All`, explores the entire tree and sums over all of the results at the leaves of the tree; to use this mode your `Tree` needs to generate results that are an instance of the `Monoid` class. The reason for this is that sometimes you want to do things like counting the number of solutions rather than generating the list of them, and it is much more efficient to be able to do this directly rather than to generate a (possibly very large) list which you only use by taking its length. If you do want to generate a list of solutions, then you can use `fmap (:[])` on the `Tree` to turn each result into a singleton list, though if you have a large number of results then you should instead create singletons of the `Seq` type in `Data.Sequence` as this type has (amortized) asymptotically faster concatenation operations. The second mode, `First`, explores the tree until it has found the first result, at which point it returns this result wrapped in `Just`; if no result is found, then it returns `Nothing`. Note that in this case the result does not have to be an instance of `Monoid`, as there is no sum being performed. The third mode, `Found`, explores the tree summing over results until a given condition function is satisfied; like `All` mode, the results have to be an instance of `Monoid`. The returned value is all the results that were found, as well as a flag indicating whether the condition function was ever satisfied. For each of the modes above, there is a variant for when you have a pure `Tree`, and a variant for when have a `TreeT`, which is a monad transformer; in the latter case the exploration functions will return their value inside the monad nested in the `TreeT`. Parallel exploration ==================== Although this package can be used to explore trees serially, it really shines when you want to perform an exploration in parallel. In the first subsection we will describe the worker/supervisor model used by this package for parallelization. Following that, we will briefly cover the various parallel exploration modes and tree purities. Next, we will discuss the various adapters that are available, and in particular how to use the `Threads` adapter. Finally, we will discuss how to use the `Main` infrastructure, including run outcomes. Supervisors and workers ----------------------- `LogicGrowsOnTrees` uses a supervisor/worker model for parallelization. That is, at any given time there is a supervisor that keeps track of the global state of the exploration and zero or more workers that are exploring the tree in parallel --- zero is a valid number because some adapters allow for the number of workers to change during the run. The supervisor functionality is given in the `LogicGrowsOnTrees.Parallel.Common.Supervisor` module. Normally you will not be using it directly but rather you will be using a provided adapter which builds on top of it to provide a simplified, specialized interface. The `LogicGrowsOnTrees` package provides a `Threads` adapter (parallelization via threads); other packages provide a `Processes` adapter (parallelization via processes), a `Network` adapter (parallelization via zero or more processes connecting over the network to the supervisor), and an `MPI` (Message Passing Interface) adapter. The worker functionality is given in the `LogicGrowsOnTrees.Parallel.Common.Worker` module. As with the `Supervisor` module, you will not normally need to use this module directly. Exploration modes ----------------- The modes in which a parallel exploration can be run are given by the `ExplorationMode` type in the `LogicGrowsOnTrees.Parallel.Common.ExplorationMode` module. Many of the functions in the parallelization infrastructure come in specialized families where there is a function for each mode (and purity), and if you use these then you might never need to deal with `ExplorationMode` directly; nonetheless it is good to know what values it can take anyway, as the constructor names are used as suffixes in specialized functions. The first mode is `AllMode`; functions specialized to this mode have no suffix. It acts just like the `All` mode discussed in the serial exploration section, i.e. it sums over all results (which must therefore be an instance of `Monoid`). The second mode is `FirstMode`; functions specialized to this mode have the suffix `UntilFirst`. It acts just like the `First` mode discussed in the serial exploration section, i.e. the exploration terminates when the first result has been found (which need not be an instance of `Monoid`). The third mode is `FoundModeUsingPull`; functions specialized to this mode have the suffix `UntilFoundUsingPull`. It acts like the `Found` mode discussed in the serial exploration section, i.e. it sums over all results found until a criterion specified by the argument to the `FoundModeUsingPull` constructor is satisfied. All found results are kept locally at the various workers running in parallel and only merged when the supervisor sends out a global progress update request that "pulls" all of the results in the system to it. Because of this, it is possible that the workers will have collectively found enough results to satisfy the criteria, but the system as a whole will not know this until a progress update has been performed; for this reason, if you are using this mode, then you need to perform a progress update on a regular basis. The last mode is `FoundModeUsingPush`; functions specialized to this mode have the suffix `UntilFoundUsingPush`. This mode functions exactly like `FoundModeUsingPull` except that every result found is sent straight to the supervisor, which means that the very instant that the system finds the desired results, it will know about it. The potential downside to this mode is that there is a small amount of overhead incurred in sending the result to the supervisor, and so if there are a large number of results it might be more efficient to accumulate them locally with the occasional pull rather than to send each result to the supervisor as it is found. Purity ------ The `Purity` of a tree indicates whether it is `Pure` or `ImpureAtopIO`. If a tree is `Pure` then it has no side effects (more precisely, the nested monad is the `Identity`). If a tree is `ImpureAtopIO`, then it has side-effects, and furthermore it has `IO` as the base monad in the stack, where this latter restriction comes from the fact that the worker needs to run in a monad that has `IO` at the base; `ImpureAtopIO` takes a single parameter that indicates how to run the given action in the `IO` monad. The families of specialized functions actually have *three* cases: `Pure`, `Impure`, and `IO`, where `IO` is a special case of `Impure` provided for convenience. The functions accepting an impure tree also have an additional parameter to specify how to run it in the `IO` monad. Adapters -------- An adapter module provides a way of adapting the supervisor/worker parallelization model to a particular means of running computations in parallel. The current adapters are as follows: * `Threads` This adapter provides parallelism by spawning multiple threads; the number of workers can be changed arbitrarily at runtime (though you need to make sure that the number of capabilities is also high enough for all of them to run in parallel). This adapter is the only one that requires the threaded runtime, which adds additional overhead. * `Processes` This adapter provides parallelism by spawning a child process for each worker; the number of workers can be changed arbitrarily at runtime. Install `LogicGrowsOnTrees-processes` to use this adapter. * Network This adapter provides parallelism by allowing multiple workers to connect to a supervisor over a network; the number of workers is then equal to the number that are are connected to the supervisor. (It is possible for the same process to be both a supervisor and one or more workers.) Install `LogicGrowsOnTrees-network` to use this adapter. * `MPI` This adapter provides parallelism using the Message Passing Interface (MPI), which is the standard communication system used in supercomputers, allowing you to use a very large number of nodes in your run. One of the nodes (#0) will act entirely as the supervisor, and the rest will act as workers. Install `LogicGrowsOnTrees-MPI` to use this adapter; note that you will need to have an MPI implementation installed (such as [OpenMPI](http://www.open-mpi.org/)). All of these modules offer 'low-level' interfaces that are more complicated to use but which give you more control. To use these interfaces, look for the functions named `runExplorer`, `runSupervisor`, and `runWorker` (except for `Threads`, which only has `runExplorer`). Through the `Main` framework, the above also offer high-level interfaces, but before we discuss this it is worth discussing the `Threads` module, which offers a higher-level direct interface than the others. Threads ------- The `Threads` adapter offers a higher-level interface than the others, mainly because it does everything within a single process and so does not have to worry about things like whether the current process is worker of the supervisor. In the Threads module, there are `3 x 4 x 2 = 24` specialized functions, which have the naming convention `exploreTreeXYZ` where: * `X` is empty for `AllMode` and takes the form `UntilM` for any other mode `M`; * `Y` is empty for `Pure` trees, `IO` for trees running in `IO`, and `Impure` for general impure trees; and * `Z` is `StartingFrom` if starting from an initial progress and empty otherwise. All of these functions take a *controller*, which is a function that has the ability to make requests to the supervisor. You would use this, for example, if you wanted to write out checkpoint file on a regular basis; specifically, you would do this by setting up a timer that regularly calls `requestProgressUpdateAsync` (which tells the supervisor to request a progress update from all active workers and then returns the current progress), and then writing this progress to a file. All of the functions you can call in the controller are exported by the `Threads`, and they include the ability to fork new threads which can make life easier. Main ---- The `Main` framework is designed to make your life easier by automating things like checkpointing. It provides a universal interface to all adapters through a driver system whereby all of the `mainForExploreTree` functions have a `driver` parameter which you import from the adapter that you want to use. This is the *only* parameter that depends on the adapter, so if you want to switch to using a different adapter you only need to change your adapter module import so that the driver is pulled from the desired adapter. The main functions that you will be interested in are the `simpleMainForExploreTreeXY` and `mainForExploreTreeXY` functions, where `X` corresponds to the purity (empty for `Pure`) and `Y` corresponds to the exploration mode (empty for `All`). Unlike threads, there is not argument to specify a starting progress because instead this will be derived from a command-line option specifying the checkpoint file; if it exists, then it is used as the starting point. The `simpleMainForExploreTreeXY` functions provide a relatively simple interface that only requires the driver, a function to process the result when the run has completed, and the logic program. The `mainForExploreTreeXY` functions provide a more complicated interface that allows you to add command-line arguments and options whose values you can use to define the tree being explored, how many solutions to find, etc. All of these functions take a `Term` argument and a `TermInfo` argument. The latter just specifies a brief name and description for your program; it should suffice to glance at [TUTORIAL.md](TUTORIAL.md) to see what to do. The former specifies the command-line arguments and/or options that your particular logic program needs to, for example, specify the size of the problem (such as the board size in the n-queens problem); for simple cases it should again suffice to glance at the tutorials, but in general you may need to learn how to use [`cmdtheline`](http://hackage.haskell.org/package/cmdtheline). [`cmdtheline`](http://hackage.haskell.org/package/cmdtheline) is used is because it provides a means of combining arguments and options from several sources, which in this case includes the configuration for your logic program, configuration for the adapter (such as the number of threads or processes), and configuration that is used by the `Main` module itself (such as the checkpoint file and frequency, and whether to print various server statistics). The compiled program will also have a nice `--help` page. After the command line arguments have been parsed, the configuration information for the term you provided will be passed as an argument to the functions that you provide in other arguments. (Again, see the tutorial for examples of this.) Outcomes -------- Once the run has finished, a `RunOutcome` will be returned that is one of the following three possibilities: * `Aborted` means that a request was made to abort the run. * `Completed` means that the run terminated normally. * `Failure` means that an unexpected error occurred. It is worth noting that `Completed` doesn't necessarily mean that the run was successful; if you asked for the first solution and no solutions were found, then the result will be `Nothing`. Likewise if you (using one of the `Found` modes) ask for `k` results but fewer than `k` were found, then the result will be `Left` with the results that were found. If only one or a few results are requested and they were all found, then the result will also have the progress so you can resume the run at a later time to find more results. If the outcome is `Completed`, then it is your responsibility to do what needs to be done with the results, as the checkpoint file will be deleted unless your code throws an exception. **WARNING:** You should almost never resume from a checkpoint if you change the tree! This is only safe if the only parts of the tree that have been changed are those that have not yet been explored. If you do change parts of the tree that have been explored, then if you are lucky an exception will be thrown (if the branching structure has changed) and if you are unlucky, then your results will be silently corrupted. Writing an adapter ================== `LogicGrowsOnTrees` contains a lot of functionality that automates much of the generalizable work in writing an adapter. Most of the modules providing this functionality live in `LogicGrowsOnTrees.Parallel.Common`, so henceforth any module brought up in this discussion should be assumed to live there unless stated otherwise. The main module is `Supervisor`, which is essentially a big state machine that keeps track of all the workers and the progress that has been made. Your adapter essentially acts as an intermediary between the supervisor and the workers, relaying information from the workers to the supervisors and vice versa. The way that this works is that you run your main loop in the `SupervisorMonad`; you communicate to the supervisor by calling the appropriate functions that act in this monad, and it communicates with you by calling functions from a set of callbacks that you give it when starting the run. So for example, at the beginning of the run you might call `addWorker` to tell the supervisor that a worker has just been added to the system, and immediately after this the supervisor will call your `sendWorkloadToWorker` callback function to ask you to send it the workload. If your adapter allows the number of workers to be adjusted arbitrarily at runtime, then you should look at the `Workgroup` module which is designed exactly for that case. You use it by calling `runWorkgroup` and passing in an argument containing callbacks that the `Workgroup` uses to do things like creating and destroying workers; your argument is a function that takes a data structure that tells you how to send messages to the supervisor. See the `Threads` and `Processes` adaptors for examples of how to do this. If you are not using `Workgroup`, then you will need to write the main loop yourself. Part of your job is to provide is for the controller to communicate to the supervisor. Functionality for doing this is contained within the `RequestQueue` monad. You use it by first calling `newRequestQueue` to create a new request queue, and then processing any requests that are sent to it. Your main loop can work in one of two ways: it can either continuously poll for communications from workers, or it can block waiting for a communication. Each of these modes corresponds to a `SupervisorProgram`; the former is a `PollingProgram` and the latter is a `BlockingProgram`. You are expected to use one of these two patterns rather than writing your own loop explicitly because they allow the server to keep track of what fraction of the time it is busy; in principle you can always use `UnrestrictedProgram`, but in this case it is your responsibility to call `beginSupervisorOccupied` when the supervisor becomes active (i.e., after having waited for an incoming communication) and to call `endSupervisorOccupied` when the supervisor becomes inactive (i.e., as it waits for another incoming communication). If your main loop is polling, then you also need to regularly poll the request queue and send any request there to the supervisor. If your main loop is blocking, then an easy option is for you to use `requestQueueProgram` instead of your own loop, and then to have incoming communication be handled by sending the actions you want to run in response to the request queue. (The owner of the request queue can send any `SupervisorMonad` action to it, but the controller has to go through the `RequestQueueMonad`, which restricts the actions that they can run to a subset so that the controller can't do things like adding and removing workers.) Your adapter also has responsibility for running the controller. You should do this using the `forkControllerThread` function in `RequestQueue`, because this automatically adds the forked `ThreadId` to the list of controller thread ids. The controller is free to fork additional threads, and these will automatically be added to this list as well. When you are done, you should kill all of these threads by calling `killControllerThreads`. Thus far we have only talked about the supervisor side, but you will also be responsible for starting and communicating with workers. Most of the time you will probably want to use either the `runWorker` or `runWorkerUsingHandles` functions in the `Process` module, with the former taking actions that send and receive messages and the latter taking handles for sending and receiving (the latter calls the former). These functions provide a loop that listens for and responds to messages from the supervisor. When a workload is received, they spawn a new worker thread; the original thread then forwards requests for stolen workloads and progress updates to the worker thread. When the worker thread completes, it sends the result back to the supervisor. The only adapter that does not use the `Process` module is the `Threads` adapter; this is because, unlike the other adapters, the worker threads in this case run in the same process as the supervisor. Thus, it is possible to talk directly to the worker threads rather than creating an intermediary. In such cases, you will want to use the `Worker` module; call `forkWorkerThread` to spawn a worker thread, and then use `sendAbortRequest`, `sendProgressUpdateRequest`, and `sendWorkloadStealRequest` on the request queue returned by `forkWorkerThread` to communicate directly with the worker thread; it is designed to poll the queue on a regular basis, though it does so merely by reading an `IORef`, which might mean that it will take a while to receive the request as the CPU caches synchronize. (This was a deliberate design decision to minimize the overhead of polling the request queue.) It is good practice to package the functionality offered by your adapter under a `runSupervisor` function to be run on the supervisor process, a `runWorker` function to be run on worker processes, and a `runExplorer` function that is run on *both* kinds of nodes and automatically figures out if it this node is a supervisor or a worker. Once you have written these functions, you should also write a driver for your adapter. This is a function that takes a `DriverParamters` and runs the exploration. These parameters are as follows: * `purity` This gives the purity of the tree being explored. * `shared_configuration_term` This is a configuration term whose information is shared by both the supervisor and the workers; you may require that its type be serializable. * `supervisor_configuration_term` This is a configuration term whose information is only available to the supervisor. * `program_info` This is a `TermInfo` that the users use to customize the name and brief description of their program in the help screen. * `initializeGlobalState` This function is called on *all* processes with the shared configuration; its job is to initialize process-specific settings such as the logging level. * `constructExplorationMode` This function is called on both the supervisor and the workers; it takes the shared configuration and uses it to compute the exploration mode. (This is needed because the `Found` modes take a function argument which, for example, checks that enough results have been generated, which will in general be based on a command line argument.) * `constructTree` This function is only called on workers; it takes the shared configuration and constructs the tree to be explored. * `getStartingProgress` This function is only called on the supervisor; it takes *both* the shared and supervisor configuration and returns the starting progress of the run, which could be the result of reading a checkpoint file. * `notifyTerminated` This function is only called on the supervisor; it takes *both* the shared configuration and the supervisor configuration, and processes the outcome of the run. * `constructController` This function is only called on the supervisor; it takes *both* the shared configuration and the supervisor configuration, and constructs the controller that you are expected to run on the supervisor. There is not a single pattern to follow from here because different adapters can be very different how they glue the various bits together. For example, the `Processes` adapter's driver looks at its command line arguments for a sentinel value in order to determine whether it is the supervisor or a worker, the `Network` adapter's driver uses uses the first command line argument to specify whether it is a supervisor or a worker that needs to connect over the network to a supervisor, and the MPI adapter checks whether it is process number 0, in which case it is the supervisor, or some other process number, in which case it is a worker. In most cases you will likely need to send the shared configuration out to the workers. For examples of how to write your own adapter, your best bet is to look at the source code of the four adapters that have been provided. Conclusion ========== This concludes the Users's Guide. For more information, see [TUTORIAL.md](TUTORIAL.md) for lots of examples of how to write logic programs and run them in parallel, as well as the package haddock documentation for reference.