##What is this?
This is a massively breaking revision of the application interface of
configurator. The configuration file syntax is backward compatible,
and mostly forward compatible as well. This fork is not (yet?) intended
for widespread public consumption. Rather, this repo is being used as
a stopgap measure in some of my own projects as well as a playground
and laboratory for a new configurator-like package that may be
released sometime in the future.
Manifesto
(Note that this section is mildly aspirational at a few points,
and/or contains errors.)
The application interface of configurator
has numerous problems:
- it makes it easy to introduce race conditions
- it makes it difficult to write an application that is relatively
robust to misconfiguration errors
- does not scale well to moderately complex configuration scenarios
- the configuration change notifications are particularly difficult
to use beyond the most trivial of use cases.
The aim of configurator-ng
is to improve these issues, with the
initial efforts focused on the first three. I hope to make more
correct solutions easier, and less correct solutions harder, all
wrapped up in a more expressive interface.
Race conditions
The interface of configurator
basically is:
data Config = Config (IORef (HashMap Text Value))
lookup :: Configured a => Config -> Text -> IO (Maybe a)
The IORef
is there to support configuration file reloading, which
is often done automatically. So this results in the race condition:
do
key0 <- lookup config "key0"
reload config {- in another thread -}
key1 <- lookup config "key1"
return (key0, key1)
Thus, we have taken key0
and key1
from two versions of the
configuration files, with a overall result that is not necessarily
consistent with either version.
There is a way to solve this race condition*, though it is by no
means convenient and it provides even less support for turning the
result into configuration parameters:
getMap :: Config -> IO (HashMap Text Value)
This obtains a consistent* snapshot of the configuration, from which
you can pull out multiple values. But in addition to being
less obvious and inconvenient, the fact that the HashMap
returned
is not an abstract type makes means that changing the representation
breaks client code that uses this approach.
configurator-ng
makes the latter mode of use much more convenient by
introducing ConfigParser
s, a applicative/monadic high-level parsing
interface to read configuration info from a single snapshot. See the
module Data.Configurator.Parser
. The basic ideas behind the revised
interface is as follows:
data ConfigCache = ConfigCache (IORef Config)
readConfig :: ConfigCache -> IO Config
runParser :: ConfigParser m => m a -> Config -> (Maybe a, [ConfigError])
(Here, ConfigError
could be an error condition, or it might be more
analogous to a warning or informational message; thus a parser can
return a result and some ConfigError
s.)
Finally, we could define a ConfigParser
to read from key1 and key2
by writing:
getKeys :: ConfigParser m => m (Text, Int)
getKeys = (,) <$> key "key0" <*> key "key1"
(*It's important to point out that getMap
only avoids introducing
additional race conditions; commonly used filesystems are racey
software artifacts, so this is only consistent relative to filesystem
reads. For a complete solution, one would have to take care in the
precise filesystem calls used to manipulate the configuration file(s).
Most popular text editors should be ok as far as the consistency of a
single file, consistent reads of multiple files is trickier.)
Configuration validation
Another advantage of the ConfigParser
interface is that it makes it
easier and more convenient to validate a (sub-)configuration as an
entirety, and thus also make more intelligent decisions about what to do
in cases of misconfigurations. For example, one might want to continue
running on the last known good configuration, and raise a big red
flag in a monitoring solution. The goal is to provide mechanism,
not policy.
Greater Expressive Power
Consider the following use case: you have an event processor, that
watches several named sources for events. You might like your
configuration file to look something like this:
event-sources {
amazon-cloud {
postgres {
host = "cloudevents.mydomain.com"
port = 5433
dbname = "eventdb"
sslmode = "verify-full"
sslcert = "${HOME}/credentials/pgclient.crt"
sslkey = "${HOME}/credentials/pgclient.key"
}
heartbeat-interval = 15
heartbeat-timeout = 15
}
chicago-service-center {
postgres {
host = "pgevents.customerdomain.com"
port = 5433
dbname = "eventdb"
sslmode = "verify-full"
sslcert = "${HOME}/credentials/pgclient.crt"
sslkey = "${HOME}/credentials/pgclient.key"
}
heartbeat-interval = 15
heartbeat-timeout = 15
}
}
Now, amazon-cloud
and chicago-service-center
are names of the source
useful for whatever purposes (logging, API endpoints, etc), that the
event processor doesn't know about in advance. Since configurator
is tied down to HashMap
, the data structure offers no support for
efficiently discovering these names. In order to fix this,
configurator-ng
moved to critbit
. which allows us to
efficiently iterate over these keys (in alphabetical order). So
configurator-ng
offers the following operator:
subgroups :: ConfigParser m => Text -> m [Text]
subgroups
returns the non-empty value groupings of it's argument,
so for example when evaluated in the context of the configuration above:
subgroups "" ==> [ "event-sources" ]
subgroups "event-sources" ==> [ "event-sources.amazon-cloud"
, "event-sources.chicago-service-center" ]
Another issue is that there's a lot of redundancy here, so maybe we'd like to
refactor the configuration file into something like this:
event-sources {
amazon-cloud {
postgres.host = "cloudevents.mydomain.com"
}
chicago-service-center {
postgres.host = "pgevents.customerdomain.com"
}
default {
postgres {
port = 5433
dbname = "eventdb"
sslmode = "verify-full"
sslcert = "${HOME}/credentials/pgclient.crt"
sslkey = "${HOME}/credentials/pgclient.key"
}
heartbeat-interval = 15
heartbeat-timeout = 15
}
}
So now the problem is that we want to turn this configuration into a
list of EventSource
s:
data EventSource = EventSource {
name :: !Text,
libpqConnParams :: [(Text,Value)],
heartbeatInterval :: !Micro,
heartbeatTimeout :: !Micro,
}
Now, even ignoring the issue of the names mentioned above, handling
this sort of customizable defaulting in configurator
would be rather
painful. But it's actually quite easy with configurator-ng
:
{-# LANGUAGE ApplicativeDo, RecordWildCards #-}
mapA :: Applicative f => (a -> f b) -> [a] -> f [b]
mapA f = foldr (liftA2 (:)) (pure []) . map f
eventSources :: ConfigParserA [EventSource]
eventSources = do
localConfig (subconfig "event-sources") $ do
mapA eventSource . filter (/= "default") <$> subgroups ""
eventSource :: Text -> ConfigParserA EventSource
eventSource name = do
localConfig (union (subconfig name )
(subconfig "default")) $ do
libpqConnParams <- localConfig (subconfig "postgres") (subassocs "")
heartbeatInterval <- key "heartbeat-interval"
heartbeatTimeout <- key "heartbeat-timeout"
pure $! EventSource{..}
This example uses the ConfigParserA
variant of ConfigParser
, so
that the parser continues to run after encountering an error in order
to generate more error messages. It also uses localConfig
operator
to run a subparser in a different configuration context. There are
a few operators for modifying the configuration context:
localConfig :: ConfigParser m => ConfigTransform -> m a -> m a
data ConfigTransform -- Conceptually, type ConfigTransform = Config -> Config
instance Monoid ConfigTransform
-- mempty is identity transformation
-- mappend is composition of transformations
-- | Left-biased union of two configurations
union :: ConfigTransform -> ConfigTransform -> ConfigTransform
-- | Restrict a configuration to a given group, and remove that group
-- prefix from all key names.
subconfig :: Text -> ConfigTransform
-- | Add a group name as a prefix to all key names
superconfig :: Text -> ConfigTransform
Note that these operators are implemented "symbolically", so that
they run in sub-linear (Possibly O(1)
?) time. Instead, the cost of
these are paid on each (key,value)
lookup.
Syntactic extensions
Datum comments have been implemented, not unlike Scheme and Clojure.
The configurator-ng
parser will ignore any binding preceded by a #;
token; the binding following #;
must be begin on the same line, and
must be syntactically correct, but will otherwise be ignored.
This is a significant convenience for use cases like the event source
example above: for example one could disable chicago-service-center
by putting #;
before the name. One can also use this as a slightly
restricted means of block comments, by writing #; comment {
(the
name doesn't matter) to begin the block comment, and a matching }
to end the comment. Of course, the intervening bindings must be
syntactically correct, so this isn't an exact substitute for block
comments.
Also, configurator-ng
also allows group names to be inlined into other
group and key names, separated by a dot character. For example, these
configuration snippets are all equivalent:
foo {
bar {
x = "Hello"
y = "World"
}
}
foo.bar {
x = "Hello"
y = "World"
}
foo {
bar.x = "Hello"
bar.y = "World"
}
foo.bar.x = "Hello"
foo.bar.y = "World"
With the original configurator
, only the first snippet is
syntactically legal.
Finally, configurator-ng
supports scientific notation for numerical
values, via the
scientific
package, which corresponds closely to typical floating point syntax.
Configuration Change Subscriptions
Configurator
's change notification system is also painful to
use except in the most trivial of cases, not least because
the callback is called for a single changed (key,value)
pair at a
time. Determining how that impacts a given configuration record (like
EventSource
above) is up to the user.
Soon, configurator-ng will offer something along the lines of the
following function:
subscribe :: ConfigParser m => ConfigCache -> m a -> (a -> IO ()) -> IO ()
When the configuration files are reloaded, every subscribed
ConfigParser
is rerun, and the result is passed on to the
callback. Now, of course, many callbacks won't want to be called
unless their configuration actually changes. However, this is
actually a reasonable thing to punt to the callback, because
we can write a generic callback wrapper to handle this issue:
debounce :: (a -> a -> Bool) -> (a -> IO ()) -> IO (a -> IO ())
debounce notEq callback = do
last_seen <- newIORef Nothing
return $ \new -> do
m_old <- readIORef last_seen
if case m_old of
Nothing -> True
Just old -> notEq old new
then do
writeIORef last_seen (Just new)
callback new
else do
return ()
Optimizing subscribe
It would be more efficient to run only those ConfigParser
s that have
the possibility of changing. If we design the configurator-ng
interface carefully, we can determine all the keys that a parser
depends on. We can then use this information to rerun only those
parsers whose result might possibly change. (Though, debounce
could
still be useful, as ConfigParser
s aren't guaranteed to be 1-1
functions.)
However, once we have dependency tracking that works, there are
further applications this could enable, such as:
- providing tools to sysadmins to understand which parts of
the configuration files affect which parts of the system.
- finding values in configuration files that have no effect at all
- more speculatively, using this information to generate sample
configuration files.