harg: Haskell program configuration using higher kinded data

Please see the README on GitHub at https://github.com/alexpeits/harg#readme

Properties

Versions	0.1.0.0, 0.1.0.1, 0.1.3.0, 0.2.0.0, 0.2.0.0, 0.3.0.0, 0.4.0.0, 0.4.1.0, 0.4.2.0, 0.4.2.1, 0.5.0.0
Change log	CHANGELOG.md
Dependencies	aeson (>=1.4.2 && <1.5), barbies (>=1.1.0 && <1.2), base (>=4.7 && <5), bytestring (>=0.10.8 && <0.11), directory (>=1.3.3 && <1.4), higgledy (>=0.3.0 && <0.4), optparse-applicative (>=0.14.3 && <0.15), split (>=0.2.3 && <0.3), text (>=1.2.3 && <1.3), yaml (>=0.11.0 && <0.12) [details]
License	BSD-3-Clause
Copyright	Copyright (c) 2019 Alex Peitsinis
Author	Alex Peitsinis
Maintainer	alexpeitsinis@gmail.com
Category	System, CLI, Options, Parsing, HKD
Home page	https://github.com/alexpeits/harg
Bug tracker	https://github.com/alexpeits/harg/issues
Source repo	head: git clone https://github.com/alexpeits/harg
Uploaded	by alexpeits at 2019-09-06T16:18:55Z

Downloads

harg-0.2.0.0.tar.gz [browse] (Cabal source package)
Package description (as included in the package)

Readme for harg-0.2.0.0

harg

harg is a library for configuring programs by scanning command line arguments, environment variables, default values and more. Under the hood, it uses a subset of optparse-applicative to expose regular arguments, switch arguments and subcommands. The library relies heavily on the use of higher kinded data (HKD) thanks to the barbies library. Using higgledy also significantly reduces the amount of boilerplate code.

The main goal while developing harg was to not have to go through the usual pattern of manually mappending the results of command line parsing, env vars and defaults.
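
For context, the manual pattern mentioned above can be sketched as follows. This is not harg code: PartialConfig and resolve are hypothetical names, and the precedence (command line, then environment, then defaults) is an assumption made for this sketch only.

```haskell
import Data.Monoid (First (..))

-- Hypothetical partial configuration: a source may leave any field unset.
data PartialConfig = PartialConfig
  { pcHost :: First String
  , pcPort :: First Int
  }

-- Field-wise merge; for each field the leftmost value that is set wins.
instance Semigroup PartialConfig where
  PartialConfig h1 p1 <> PartialConfig h2 p2
    = PartialConfig (h1 <> h2) (p1 <> p2)

instance Monoid PartialConfig where
  mempty = PartialConfig mempty mempty

-- Merge three sources by hand. The order used here is an assumption
-- of this sketch, not a statement about harg's behaviour.
resolve :: PartialConfig -> PartialConfig -> PartialConfig -> Maybe (String, Int)
resolve cli env defaults
  = (,) <$> getFirst (pcHost merged) <*> getFirst (pcPort merged)
  where
    merged = cli <> env <> defaults
```

Every new field needs a change in the record, in the Semigroup instance and in resolve, which is the kind of repetition the library tries to remove.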

Usage

tl;dr: Take a look at the example.

Here are some different usage scenarios. Let's first enable some language extensions and add some imports:

{-# LANGUAGE DataKinds          #-}
{-# LANGUAGE DeriveAnyClass     #-}
{-# LANGUAGE DeriveGeneric      #-}
{-# LANGUAGE FlexibleInstances  #-}
{-# LANGUAGE GADTs              #-}
{-# LANGUAGE KindSignatures     #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE TypeApplications   #-}
{-# LANGUAGE TypeOperators      #-}

import           Data.Function         ((&))
import           Data.Functor.Identity (Identity (..))
import           Data.Kind             (Type)
import           GHC.Generics          (Generic)

import qualified Data.Barbie           as B
import           Data.Aeson            (FromJSON)
import           Data.Generic.HKD      (HKD, build, construct)

import           Options.Harg

main :: IO ()
main = putStrLn "this is a literate haskell file"

One flat (non-nested) datatype

The easiest scenario is when the target configuration type is one single record with no levels of nesting:

data FlatConfig
  = FlatConfig
      { _fcDbHost :: String
      , _fcDbPort :: Int
      , _fcDir    :: String
      , _fcLog    :: Bool  -- whether to log or not
      }
  deriving (Show, Generic)

(The Generic instance is required for section 3 later on)

Let's first create the Opts for each value in FlatConfig. Opt is the description for each component of the configuration.

dbHostOpt :: Opt String
dbHostOpt
  = toOpt ( option strParser
          & optLong "host"
          & optShort 'h'
          & optMetavar "DB_HOST"
          & optHelp "The database host"
          )

dbPortOpt :: Opt Int
dbPortOpt
  = toOpt ( option readParser
          & optLong "port"
          & optHelp "The database port"
          & optEnvVar "DB_PORT"
          & optDefault 5432
          )

dirOpt :: Opt String
dirOpt
  = toOpt ( argument strParser
          & optHelp "Some directory"
          & optDefault "/home/user/something"
          )

logOpt :: Opt Bool
logOpt
  = toOpt ( switch
          & optLong "log"
          & optHelp "Whether to log or not"
          )

Here, we use option to define a command line argument that expects a value after it, argument to define a standalone argument, not prefixed by a long or short indicator, and switch to define a boolean command line flag that, if present, sets the target value to True. The opt* functions (here applied using & to make things look more declarative) modify the option configuration. optHelp adds help text, optDefault adds a default value, optShort adds a short command line option as an alternative to the long one (the string after option or switch), optEnvVar sets the associated environment variable and optMetavar sets the metavariable to be shown in the help text generated by optparse-applicative.
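
The modifier style can be illustrated without harg itself. The sketch below uses a hypothetical OptSketch record and hypothetical with* functions (none of them part of the library) to show that each modifier is just a function from an option description to an option description, so chaining with & and composing with . give the same result.

```haskell
import Data.Function ((&))

-- Hypothetical stand-in for an option description; not harg's Opt type.
data OptSketch a = OptSketch
  { osLong    :: Maybe String
  , osShort   :: Maybe Char
  , osHelp    :: Maybe String
  , osDefault :: Maybe a
  }
  deriving (Show, Eq)

-- An option with nothing configured yet.
emptyOpt :: OptSketch a
emptyOpt = OptSketch Nothing Nothing Nothing Nothing

-- Each modifier sets exactly one field and leaves the rest untouched.
withLong :: String -> OptSketch a -> OptSketch a
withLong s o = o { osLong = Just s }

withShort :: Char -> OptSketch a -> OptSketch a
withShort c o = o { osShort = Just c }

withHelp :: String -> OptSketch a -> OptSketch a
withHelp h o = o { osHelp = Just h }

withDefault :: a -> OptSketch a -> OptSketch a
withDefault d o = o { osDefault = Just d }

-- Modifiers applied left to right with (&).
portViaAmp :: OptSketch Int
portViaAmp
  = emptyOpt
  & withLong "port"
  & withHelp "The database port"
  & withDefault 5432

-- The same modifiers composed into a single function with (.).
portViaDot :: OptSketch Int
portViaDot
  = (withLong "port" . withHelp "The database port" . withDefault 5432) emptyOpt
```

Because these modifiers touch different fields, the order in which they are applied does not matter here.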

toOpt turns any kind of option into the internal Opt type. The reason for doing this is that different types of options can have different capabilities, e.g. long and short cannot be set for an argument. Another shorthand is to use the With variants (such as optionWith), which take the modifiers as a composed function. For example, dbHostOpt could also be defined like this:

dbHostOpt' :: Opt String
dbHostOpt'
  = optionWith strParser
      ( optLong "host"
      . optShort 'h'
      . optMetavar "DB_HOST"
      . optHelp "The database host"
      )

The first argument (strParser or readParser) is the parser for the argument, be it from the command line or from an environment variable. The type of this function should be String -> Either String a, which produces an error message or the parsed value. strParser is equivalent to pure and always succeeds. readParser requires the type to have a Read constraint. In order to use it with newtypes that wrap a type that has a Read constraint, using the Functor instance for Opt should be sufficient. E.g. for the newtype:

newtype Port = Port Int

we can define the following option:

dbPortOpt' :: Opt Port
dbPortOpt'
  = Port <$> dbPortOpt

Of course, any user-defined function works as well. In addition, to use a function of type String -> Maybe a use parseWith, which runs the parser and in case of failure uses a default error message. For example, readParser is defined as parseWith readMaybe.
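
As an illustration, here is a user-defined parser of the required shape String -> Either String a. parseWithSketch is a hypothetical re-implementation of the idea behind parseWith (the error message text is an assumption), and portParser is a made-up example that adds a range check on top of it.

```haskell
import Text.Read (readMaybe)

-- Hypothetical sketch of the idea behind parseWith: lift a Maybe-returning
-- parser into one that reports a generic error message on failure.
parseWithSketch :: (String -> Maybe a) -> String -> Either String a
parseWithSketch f s = maybe (Left ("Unable to parse: " ++ s)) Right (f s)

-- A custom parser: read an Int, then validate that it is a usable port.
portParser :: String -> Either String Int
portParser s = do
  n <- parseWithSketch readMaybe s
  if n >= 1 && n <= 65535
    then Right n
    else Left ("Port out of range: " ++ show n)
```

A function like portParser could then be passed wherever strParser or readParser is used above.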

Finally, an optional option of type a can be specified by setting its type to Maybe a. The declaration is exactly the same as it would be for a; adding optOptional to the modifiers turns the parser from String -> Either String a into String -> Either String (Maybe a), without using the Read instance for Maybe:

someOpt :: Opt (Maybe Int)
someOpt
  = toOpt ( option readParser
          & optLong "something"
          & optOptional
          )

Note that optOptional can't be used with optDefault. Using them together raises a type error at compile time, to ensure there's no ambiguous behaviour (e.g. the order of declaration of modifiers should not influence the resulting option).
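
To see why avoiding the Read instance for Maybe matters, consider the following sketch. optionalSketch is a hypothetical helper (not harg's implementation) that wraps a successful parse in Just; the comparison shows that reading a Maybe Int directly would force the user to type Just 5 on the command line.

```haskell
import Text.Read (readEither, readMaybe)

-- Hypothetical helper: keep the parser for a, wrap successful results in Just.
optionalSketch :: (String -> Either String a) -> String -> Either String (Maybe a)
optionalSketch parser = fmap Just . parser

-- Parsing "5" with the wrapped Int parser succeeds.
viaWrapper :: Either String (Maybe Int)
viaWrapper = optionalSketch readEither "5"

-- Parsing "5" with the Read instance for Maybe fails, because that
-- instance expects the input to be spelled "Just 5" or "Nothing".
viaReadInstance :: Maybe (Maybe Int)
viaReadInstance = readMaybe "5"
```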

There are 3 ways to configure this datatype.

1. Using a `barbie` type

barbie types are types of kind (Type -> Type) -> Type. The barbie type for FlatConfig looks like this:

data FlatConfigB f
  = FlatConfigB
      { _fcDbHostB :: f String
      , _fcDbPortB :: f Int
      , _fcDirB    :: f String
      , _fcLogB    :: f Bool
      }
  deriving (Generic, B.FunctorB, B.TraversableB, B.ProductB)

I also derived some required instances that come from the barbies package. These instances allow us to change the f (bmap from FunctorB) and traverse all types in the record producing side effects (btraverse from TraversableB).
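
To make these two operations concrete, here is a hand-written sketch for a small hypothetical record PairB. The functions bmapPair and btraversePair only mirror what bmap and btraverse do for one specific type; in real code the derived barbies instances provide this for free.

```haskell
{-# LANGUAGE RankNTypes #-}

import Data.Functor.Identity (Identity (..))

-- Hypothetical two-field barbie type, used only for this sketch.
data PairB f = PairB
  { pbHost :: f String
  , pbPort :: f Int
  }

-- Analogue of bmap: change the wrapper f into g for every field.
bmapPair :: (forall a. f a -> g a) -> PairB f -> PairB g
bmapPair nt (PairB h p) = PairB (nt h) (nt p)

-- Analogue of btraverse: change the wrapper while running effects in e.
btraversePair
  :: Applicative e
  => (forall a. f a -> e (g a)) -> PairB f -> e (PairB g)
btraversePair nt (PairB h p) = PairB <$> nt h <*> nt p

-- Turn a partially filled record into a complete one, if every field is set.
complete :: PairB Maybe -> Maybe (PairB Identity)
complete = btraversePair (fmap Identity)
```

For example, complete (PairB (Just "localhost") Nothing) is Nothing, because one field is missing.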

Now let's define the value of this datatype, which holds our option configuration. The type constructor needed for the options is Opt:

flatConfigOpt1 :: FlatConfigB Opt
flatConfigOpt1
  = FlatConfigB dbHostOpt dbPortOpt dirOpt logOpt

Because dbHostOpt, dbPortOpt, dirOpt and logOpt all have type Opt <actual type>, flatConfigOpt1 has the correct type according to FlatConfigB Opt.

Now to actually run things:

getFlatConfig1 :: IO ()
getFlatConfig1 = do
  FlatConfigB host port dir log <- execOptDef flatConfigOpt1
  print $ runIdentity (FlatConfig <$> host <*> port <*> dir <*> log)

execOptDef returns the options type with every field wrapped in Identity, in this case FlatConfigB Identity. Here, we pattern match on the barbie type, and then use the Applicative instance of Identity to get back an Identity FlatConfig.

This is still a bit boilerplate-y. Let's look at another way.

2. Using a product type

Looking at FlatConfigB, it's only used because of its barbie-like capabilities. Other than that it's just a simple product type with the additional f before all its sub-types.

harg defines a type similar to Product (from Data.Functor.Product), which works in a similar fashion to servant's :<|> type. This type is defined in Options.Harg.Het.Prod and is called :* (the * stands for product). This type stores barbie-like types and also keeps the f handy: data (a :* b) f = a f :* b f. This is also easily made an instance of Generic, FunctorB, TraversableB and ProductB. With all that, let's rewrite the options value and the function to get the configuration:

flatConfigOpt2 :: (Single String :* Single Int :* Single String :* Single Bool) Opt
flatConfigOpt2
  = single dbHostOpt :* single dbPortOpt :* single dirOpt :* single logOpt

getFlatConfig2 :: IO ()
getFlatConfig2 = do
  host :* port :* dir :* log <- execOptDef flatConfigOpt2
  print $ runIdentity
    (FlatConfig <$> getSingle host <*> getSingle port <*> getSingle dir <*> getSingle log)

This looks awfully similar to the previous version, but without having to write another datatype and derive all the instances. :* is both a type-level constructor and a value-level function that acts like list's :. It is also right-associative, so for example a :* b :* c is the same as a :* (b :* c).

The Single type constructor is used when talking about a single value, rather than a nested datatype. Single a f is a simple newtype over f a. The reason for using that is simply to switch the order of application, so that we can later apply the f (here Opt) to the compound type (:*). This makes type definitions look more similar to datatype definitions:

type FlatConfigOpt2
  =  Single String
  :* Single Int
  :* Single String
  :* Single Bool

In addition, single is used to wrap an f a into a Single a f, and getSingle is used to unwrap it. Later on we'll see how to construct nested configurations using Nested.
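
The pieces described above can be put together in a self-contained sketch. The definitions below follow the shapes given in the text (data (a :* b) f = a f :* b f, and Single a f as a newtype over f a), but they are re-declared here for illustration only; the fixity level 4 and the names Example, example and describe are assumptions of this sketch.

```haskell
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeOperators  #-}

import Data.Functor.Identity (Identity (..))
import Data.Kind             (Type)

-- Sketch of the product type: both sides receive the same f.
data (a :* b) (f :: Type -> Type) = a f :* b f

-- Right-associative, as described in the text; the level is an assumption.
infixr 4 :*

-- Sketch of Single: swaps the order of application from f a to Single a f.
newtype Single a (f :: Type -> Type) = Single { getSingle :: f a }

single :: f a -> Single a f
single = Single

-- The f is applied once to the whole compound type.
type Example = Single String :* Single Int :* Single Bool

example :: Example Identity
example
  = single (Identity "localhost") :* single (Identity 5432) :* single (Identity True)

-- Pattern matching mirrors construction; each component is unwrapped separately.
describe :: Example Identity -> String
describe (host :* port :* flag)
  =  runIdentity (getSingle host)
  ++ ":"
  ++ show (runIdentity (getSingle port))
  ++ (if runIdentity (getSingle flag) then " (logging)" else "")
```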

However, the real value when having flat datatypes comes from the ability to use higgledy.

3. Using `HKD` from `higgledy`

flatConfigOpt3 :: HKD FlatConfig Opt
flatConfigOpt3
  = build @FlatConfig dbHostOpt dbPortOpt dirOpt logOpt

getFlatConfig3 :: IO ()
getFlatConfig3 = do
  result <- execOptDef flatConfigOpt3
  print $ runIdentity (construct result)

This is the most straightforward way to work with flat configuration types. The build function takes as arguments the options (Opt a where a is each type in FlatConfig) in the order they appear in the datatype, and returns the generic representation of a type that's exactly the same as FlatConfigB. This means that we get all the barbie instances for free.

To go back from the HKD representation of a datatype to the base one, we use construct. construct uses the applicative instance of the f which wraps each type in FlatConfig to give back an f FlatConfig (in our case an Identity FlatConfig).

Nested datatypes

Let's say now that we have these two datatypes:

data DbConfig
  = DbConfig
      { _dcHost :: String
      , _dcPort :: Int
      }
  deriving (Show, Generic)

data ServiceConfig
  = ServiceConfig
      { _scPort :: Int
      , _scLog  :: Bool
      }
  deriving (Show, Generic)

And the datatype to be configured is this:

data Config
  = Config
      { _cDb      :: DbConfig
      , _cService :: ServiceConfig
      , _cDir     :: String
      }
  deriving (Show, Generic)

And a new option required for the service port:

portOpt :: Opt Int
portOpt
  = toOpt ( option readParser
          & optLong "port"
          & optHelp "The service port"
          & optDefault 8080
          )

Again, there are several ways to configure these options.

1. Using `barbie` types

Since we now have 3 types, there's a bit more boilerplate to write:

data ConfigB f
  = ConfigB
      { _cDbB      :: DbConfigB f
      , _cServiceB :: ServiceConfigB f
      , _cDirB     :: f String
      }
  deriving (Generic, B.FunctorB, B.TraversableB, B.ProductB)

data DbConfigB f
  = DbConfigB
      { _dcHostB :: f String
      , _dcPortB :: f Int
      }
  deriving (Generic, B.FunctorB, B.TraversableB, B.ProductB)

data ServiceConfigB f
  = ServiceConfigB
      { _scPortB :: f Int
      , _scLogB  :: f Bool
      }
  deriving (Generic, B.FunctorB, B.TraversableB, B.ProductB)

To define the option parser, we need option parsers for every type inside it. This was true for flat configs too, but we have to manually construct a DbConfigB Opt and ServiceConfigB Opt:

configOpt1 :: ConfigB Opt
configOpt1
  = ConfigB dbOpt serviceOpt dirOpt

dbOpt :: DbConfigB Opt
dbOpt
  = DbConfigB dbHostOpt dbPortOpt

serviceOpt :: ServiceConfigB Opt
serviceOpt
  = ServiceConfigB portOpt logOpt

And to run the parser:

getConfig1 :: IO ()
getConfig1 = do
  ConfigB (DbConfigB dbHost dbPort) (ServiceConfigB port log) dir <- execOptDef configOpt1
  let
    db      = DbConfig <$> dbHost <*> dbPort
    service = ServiceConfig <$> port <*> log
  print $ runIdentity (Config <$> db <*> service <*> dir)

2. Using `higgledy`

higgledy puts an f before every type, so doing something like HKD Config f doesn't make sense: looking at ConfigB it seems like the f needs to go to the right hand side of the nested types. We can, however, avoid the boilerplate of defining barbie types for the nested datatypes:

data ConfigH f
  = ConfigH
      { _cDbH      :: HKD DbConfig f
      , _cServiceH :: HKD ServiceConfig f
      , _cDirH     :: f String
      }
  deriving (Generic, B.FunctorB, B.TraversableB, B.ProductB)

configOpt2 :: ConfigH Opt
configOpt2
  = ConfigH dbOptH serviceOptH dirOpt

dbOptH :: HKD DbConfig Opt
dbOptH
  = build @DbConfig dbHostOpt dbPortOpt

serviceOptH :: HKD ServiceConfig Opt
serviceOptH
  = build @ServiceConfig portOpt logOpt

And to run the parser:

getConfig2 :: IO ()
getConfig2 = do
  ConfigH db service dir <- execOptDef configOpt2
  print $ runIdentity (Config <$> construct db <*> construct service <*> dir)

3. Using products

Recall from previously that there's the Single type which in general turns f b into b f. This means that, by using Single for the directory option, all fs are after their types, so we can just use :* instead of having to declare a new datatype:

type ConfigP
  =  HKD DbConfig
  :* HKD ServiceConfig
  :* Single String

configOpt3 :: ConfigP Opt
configOpt3
  = dbOptH :* serviceOptH :* single dirOpt

getConfig3 :: IO ()
getConfig3 = do
  db :* service :* dir <- execOptDef configOpt3
  print $ runIdentity (Config <$> construct db <*> construct service <*> getSingle dir)

And, to make things look more orthogonal, harg defines a type called Nested, which is exactly the same as HKD. There are functions that correspond to build and construct, too:

Nested    <-> HKD
nested    <-> build
getNested <-> construct

This means that the previous code block might as well be:

type ConfigP'
  =  Nested DbConfig
  :* Nested ServiceConfig
  :* Single String

configOpt4 :: ConfigP' Opt
configOpt4
  = dbOptN :* serviceOptN :* single dirOpt
  where
    dbOptN
      = nested @DbConfig dbHostOpt dbPortOpt
    serviceOptN
      = nested @ServiceConfig portOpt logOpt

getConfig4 :: IO ()
getConfig4 = do
  db :* service :* dir <- execOptDef configOpt4
  print $ runIdentity (Config <$> getNested db <*> getNested service <*> getSingle dir)

Pretty cool.

Subcommands

harg also supports (somewhat limited) subcommands, again by using optparse-applicative underneath.

Because of limitations with higher kinded data when it comes to sum types, harg uses a different way to define subcommands. optparse-applicative allows defining subcommands that result in the same type, which means the user needs to define a sum type, and each subcommand results in a different constructor. In contrast, harg defines subcommands that can return completely different types. Instead of the result being a sum type, where the user has to pattern match on constructors, the result is a Variant, which is defined (almost) like this:

data Variant (xs :: [Type]) where
  Here :: x -> Variant (x ': xs)
  There :: Variant xs -> Variant (y ': xs)

Variant is like a sum type which holds all the summands in a type-level list. Instead of pattern matching on Left or Right like when using Either, we pattern match on Here x, There (Here x) etc. For a pretty thorough introduction to Variant and more heterogeneous types, check out this repo by i-am-tom.

x :: Variant '[Int, Bool, Char]
x = There (Here True)

run :: Variant '[Int, Bool, Char] -> Maybe Bool
run (Here _)                 = Nothing
run (There (Here b))         = Just b
run (There (There (Here _))) = Nothing

-- > run x
-- Just True

harg defines another kind of variant called VariantF:

data VariantF (xs :: [(Type -> Type) -> Type]) (f :: Type -> Type) where

to hold a type-level list of barbie types and the f to wrap every type with.
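
A plausible shape for such a type, by analogy with Variant, is sketched below. The constructor names HereF and ThereF are the ones used in the pattern matches later in this document; the constructor signatures, the local Single newtype and the helper position are assumptions made for this sketch.

```haskell
{-# LANGUAGE DataKinds      #-}
{-# LANGUAGE GADTs          #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeOperators  #-}

import Data.Kind (Type)

-- Sketch: like Variant, but every summand is a barbie-like type applied to f.
data VariantF (xs :: [(Type -> Type) -> Type]) (f :: Type -> Type) where
  HereF  :: x f -> VariantF (x ': xs) f
  ThereF :: VariantF xs f -> VariantF (y ': xs) f

-- Local stand-in for a barbie-like wrapper around a single value.
newtype Single a f = Single { getSingle :: f a }

-- Two possible summands, sharing the same wrapper f.
type Cmds = '[Single Int, Single Bool]

firstCmd :: VariantF Cmds Maybe
firstCmd = HereF (Single (Just 1))

secondCmd :: VariantF Cmds Maybe
secondCmd = ThereF (HereF (Single (Just True)))

-- Index of the summand that is actually present.
position :: VariantF xs f -> Int
position (HereF _)  = 0
position (ThereF v) = 1 + position v
```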

To define a type to be used in a subcommand parser we need the target type and the subcommand name, which is encoded as a type-level string (Symbol). There's a handy way to define this. Suppose that the Config type above is the configuration type when the command is app and another type, e.g. TestConfig, is the configuration when the command is test:

data TestConfig
  = TestConfig
      { _tcFoo :: String
      , _tcBar :: Int
      }
  deriving Show

fooOpt :: Opt String
fooOpt
  = toOpt ( option strParser
          & optShort 'f'
          & optHelp "Something foo"
          & optDefault "this is the default foo"
          )

barOpt :: Opt Int
barOpt
  = toOpt ( option readParser
          & optShort 'b'
          & optHelp "Something bar"
          & optDefault 42
          )

type TestConfigP
  = Single String :* Single Int

testConfigOpt :: TestConfigP Opt
testConfigOpt
  = single fooOpt :* single barOpt

The subcommand type looks like this:

type SubcommandConfig
  =  "app" :-> ConfigP'
  :+ "test" :-> TestConfigP

The + here stands for sum. The associated option type is:

subcommandOpt :: SubcommandConfig Opt
subcommandOpt
  = configOpt4 :+ testConfigOpt :+ ANil

The ANil here marks the end of the association list (which is a heterogeneous list that associates symbols with types).

Here's how to run this parser:

getSubcommand :: IO ()
getSubcommand = do
  result <- execCommandsDef subcommandOpt
  case result of
    HereF (db :* service :* dir)
      -> print $ runIdentity
       $ Config <$> getNested db <*> getNested service <*> getSingle dir
    ThereF (HereF (foo :* bar))
      -> print $ runIdentity
       $ TestConfig <$> getSingle foo <*> getSingle bar

Or use fromVariantF, which is similar to the either function:

getSubcommand' :: IO ()
getSubcommand' = do
  result <- execCommandsDef subcommandOpt
  fromVariantF result
    (\(db :* service :* dir)
       -> print $ runIdentity
       $ Config <$> getNested db <*> getNested service <*> getSingle dir
    )
    (\(foo :* bar)
       -> print $ runIdentity
       $ TestConfig <$> getSingle foo <*> getSingle bar
    )

The type of fromVariantF can be thought of as being:

fromVariantF
  :: VariantF '[a, b, c, ...] f
  -> (a f -> r)
  -> (b f -> r)
  -> (c f -> r)
  -> ...
  -> r

The signature will accept the appropriate number of functions depending on the length of the type level list.
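
For a fixed list length, the idea reduces to something very close to either. The sketch below shows the two-element case for the plain Variant type defined earlier, not for VariantF; fromVariant2 and describe are hypothetical names, and the variadic version in harg is not reproduced here.

```haskell
{-# LANGUAGE DataKinds      #-}
{-# LANGUAGE GADTs          #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeOperators  #-}

import Data.Kind (Type)

data Variant (xs :: [Type]) where
  Here  :: x -> Variant (x ': xs)
  There :: Variant xs -> Variant (y ': xs)

-- Two-element eliminator: one handler per summand, like either.
-- There (There _) cannot occur for a two-element list, so two clauses suffice.
fromVariant2 :: Variant '[a, b] -> (a -> r) -> (b -> r) -> r
fromVariant2 (Here a)         f _ = f a
fromVariant2 (There (Here b)) _ g = g b

-- Each handler receives the payload of its own summand.
describe :: Variant '[Int, Bool] -> String
describe v
  = fromVariant2 v
      (\n -> "number " ++ show n)
      (\b -> "flag " ++ show b)
```

With this, describe (Here 3) selects the first handler and describe (There (Here True)) selects the second.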

More than just environment variables

You may have noticed the use of execOptDef and execCommandsDef in all of the examples up to now. There are actually more configurable versions of these, called execOpt and execCommands respectively. With these functions the user can select where to get options from. For example, execOptDef is a shorthand for execOpt EnvSource, which means that options will be fetched from environment variables only (along with the command line, which is always required, and defaults, which can be optionally provided by the user).

The sources currently supported are environment variables, json and yaml files.

Configuring using a json file

First of all, let's use FlatConfig from the first example:

data FlatConfig
  = FlatConfig
      { _fcDbHost :: String
      , _fcDbPort :: Int
      , _fcDir    :: String
      , _fcLog    :: Bool  -- whether to log or not
      }
  deriving (Show, Generic)

dbHostOpt :: Opt String
dbHostOpt
  = toOpt ( option strParser
          & optLong "host"
          & optShort 'h'
          & optMetavar "DB_HOST"
          & optHelp "The database host"
          )

dbPortOpt :: Opt Int
dbPortOpt
  = toOpt ( option readParser
          & optLong "port"
          & optHelp "The database port"
          & optEnvVar "DB_PORT"
          & optDefault 5432
          )

dirOpt :: Opt String
dirOpt
  = toOpt ( argument strParser
          & optHelp "Some directory"
          & optDefault "/home/user/something"
          )

logOpt :: Opt Bool
logOpt
  = toOpt ( switch
          & optLong "log"
          & optHelp "Whether to log or not"
          )

flatConfigOpt3 :: HKD FlatConfig Opt
flatConfigOpt3
  = build @FlatConfig dbHostOpt dbPortOpt dirOpt logOpt

To use the JSON source, a FromJSON instance is required. Thankfully that's easy, since FlatConfig has a Generic instance:

instance FromJSON FlatConfig

In harg, sources are defined as products (using :*) of options, which means that the definition of the sources is not very different from defining options! If we only needed the environment variable source, the options would be:

envSource :: EnvSource Opt
envSource = EnvSource

There's no need to actually define an option for the environment because there's no meaningful configuration for this. To use the EnvSource along with a json config, we use the following option:

sourceOpt :: (EnvSource :* JSONSource) Opt
sourceOpt
  = EnvSource :* JSONSource jsonOpt
  where
    jsonOpt :: Opt ConfigFile
    jsonOpt
      = toOpt ( option strParser
              & optLong "json"
              & optShort 'j'
              & optHelp "JSON config filepath"
              )

Here, the type of the option for the JSON source is ConfigFile. This type is a wrapper around FilePath, which looks like this:

data ConfigFile
  = ConfigFile FilePath
  | NoConfigFile

This has the advantage that, if the user wants to specify an optional configuration file, they can simply say:

jsonOpt :: Opt ConfigFile
jsonOpt
  = toOpt ( option strParser
          & optLong "json"
          & optDefault NoConfigFile
          )

Also, because ConfigFile has an IsString instance, there's no need to say optLong (ConfigFile "json") (if OverloadedStrings is enabled).
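
A minimal sketch of how such an instance can work is shown below. The ConfigFile type is re-declared locally from the definition above, the instance body is an assumption about how the library might implement it, and defaultConfig with its file name is made up for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.String (IsString (..))

-- Local copy of the type shown above, with instances added for the sketch.
data ConfigFile
  = ConfigFile FilePath
  | NoConfigFile
  deriving (Show, Eq)

-- Assumed instance: a string literal denotes a concrete file path.
instance IsString ConfigFile where
  fromString = ConfigFile

-- With OverloadedStrings, the literal is converted through fromString.
defaultConfig :: ConfigFile
defaultConfig = "config.json"
```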

There's a bit of a disconnect between ConfigFile and the ability to make optional options using Maybe and optOptional. The reason for it is that the type that JSONSource wraps is not polymorphic, since it needs to be a filepath specifically.

Roadmap

Better errors using optparse-applicative's internals
Allow user to pass optparse-applicative preferences
Write tests
~~Be able to provide and get back the same type for multiple subcommands~~
~~Integrate config files (e.g. JSON using aeson)~~
