Safe Haskell | Safe-Inferred |
---|---|
Language | Haskell2010 |
Synopsis
- decode :: SExprParser atom carrier -> Text -> Either String [carrier]
- decodeOne :: SExprParser atom carrier -> Text -> Either String carrier
- encode :: SExprPrinter atom carrier -> [carrier] -> Text
- encodeOne :: SExprPrinter atom carrier -> carrier -> Text
- data SExprParser atom carrier
- type Reader atom = Parser (SExpr atom) -> Parser (SExpr atom)
- type Comment = Parser ()
- mkParser :: Parser atom -> SExprParser atom (SExpr atom)
- setCarrier :: (b -> Either String c) -> SExprParser a b -> SExprParser a c
- addReader :: Char -> Reader a -> SExprParser a c -> SExprParser a c
- setComment :: Comment -> SExprParser a c -> SExprParser a c
- asRich :: SExprParser a (SExpr b) -> SExprParser a (RichSExpr b)
- asWellFormed :: SExprParser a (SExpr b) -> SExprParser a (WellFormedSExpr b)
- withQuote :: IsString t => SExprParser t (SExpr t) -> SExprParser t (SExpr t)
- data SExprPrinter atom carrier
- data Indent
- = Swing
- | SwingAfter Int
- | Align
- basicPrint :: (atom -> Text) -> SExprPrinter atom (SExpr atom)
- flatPrint :: (atom -> Text) -> SExprPrinter atom (SExpr atom)
- unconstrainedPrint :: (atom -> Text) -> SExprPrinter atom (SExpr atom)
- setFromCarrier :: (c -> b) -> SExprPrinter a b -> SExprPrinter a c
- setMaxWidth :: Int -> SExprPrinter atom carrier -> SExprPrinter atom carrier
- removeMaxWidth :: SExprPrinter atom carrier -> SExprPrinter atom carrier
- setIndentAmount :: Int -> SExprPrinter atom carrier -> SExprPrinter atom carrier
- setIndentStrategy :: (SExpr atom -> Indent) -> SExprPrinter atom carrier -> SExprPrinter atom carrier
SCargot Basics
The S-Cargot library is a library for parsing and emitting s-expressions, designed to be as flexible as possible. Despite some efforts at standardization, s-expressions are a general approach to describing a data format that can very often differ in subtle, incompatible ways: the s-expressions understood by Common Lisp are different from the s-expressions understood by Scheme, and even the different revisions of the Scheme language understand s-expressions in a slightly different way. To accomodate this, the S-Cargot library provides a toolbox for defining variations on s-expressions, complete with the ability to select various comment syntaxes, reader macros, and atom types.
If all you want is to read some s-expressions and don't care about the edge cases of the format, or all you want is a new configuration format, try the Data.SCargot.Language.Basic or Data.SCargot.Language.HaskLike modules, which define an s-expression language whose atoms are plain strings and Haskell literals, respectively.
The S-Cargot library works by specifying values which contain all
the information needed to either parse or print an s-expression.
The actual s-expression structure is parsed as a structure of
cons cells as represented
by the SExpr
type, but can alternately be exposed as the
isomorphic RichSExpr
type or the less expressive but
easier-to-work-with WellFormedSExpr
type. Modules devoted
to each representation type (in Data.SCargot.Repr.Basic,
Data.SCargot.Repr.Rich, and Data.SCargot.Repr.WellFormed)
provide helper functions, lenses, and pattern synonyms to make
creating and processing these values easier.
The details of how to parse a given structure are represented
by building up a SExprParser
value, which is defined in
Data.SCargot.Parse and re-exported here. A minimal
SExprParser
defines only how to parse the atoms of the
language; helper functions can define comment syntaxes,
reader macros, and transformations over the parsed structure.
The details of how to print a given structure are represented
by building up a SExprPrinter
value, which is defined in
Data.SCargot.Print and re-exported here. A minimal
SExprPrinter
defines only how to print the atoms of the
language; helper functions help with the layout of the
pretty-printed s-expression in terms of how to indent the
surrounding expression.
Other helper modules define useful primitives for building up s-expression languages: the Data.SCargot.Common module provides parsers for common literals, while the Data.SCargot.Comments module provides parsers for comment syntaxes borrowed from various other languages.
Parsing and Printing
decode :: SExprParser atom carrier -> Text -> Either String [carrier] Source #
Decode several S-expressions according to a given SExprParser
. This
will return a list of every S-expression that appears at the top-level
of the document.
decodeOne :: SExprParser atom carrier -> Text -> Either String carrier Source #
Decode a single S-expression. If any trailing input is left after
the S-expression (ignoring comments or whitespace) then this
will fail: for those cases, use decode
, which returns a list of
all the S-expressions found at the top level.
encode :: SExprPrinter atom carrier -> [carrier] -> Text Source #
Turn a list of s-expressions into a single string according to
a given SExprPrinter
.
encodeOne :: SExprPrinter atom carrier -> carrier -> Text Source #
Turn a single s-expression into a string according to a given
SExprPrinter
.
Parser Construction
Specifying a Parser
data SExprParser atom carrier Source #
A SExprParser
describes a parser for a particular value
that has been serialized as an s-expression. The atom
parameter
corresponds to a Haskell type used to represent the atoms,
and the carrier
parameter corresponds to the parsed S-Expression
structure.
type Reader atom = Parser (SExpr atom) -> Parser (SExpr atom) Source #
A Reader
represents a reader macro: it takes a parser for
the S-Expression type and performs as much or as little
parsing as it would like, and then returns an S-expression.
type Comment = Parser () Source #
A Comment
represents any kind of skippable comment. This
parser must be able to fail if a comment is not being
recognized, and it must not consume any input in case
of failure.
mkParser :: Parser atom -> SExprParser atom (SExpr atom) Source #
Create a basic SExprParser
when given a parser
for an atom type.
>>>
import Text.Parsec (alphaNum, many1)
>>>
let parser = mkParser (many1 alphaNum)
>>>
decode parser "(ele phant)"
Right [SCons (SAtom "ele") (SCons (SAtom "phant") SNil)]
setCarrier :: (b -> Either String c) -> SExprParser a b -> SExprParser a c Source #
Modify the carrier type for a SExprParser
. This is
used internally to convert between various SExpr
representations,
but could also be used externally to add an extra conversion layer
onto a SExprParser
.
>>>
import Text.Parsec (alphaNum, many1)
>>>
import Data.SCargot.Repr (toRich)
>>>
let parser = setCarrier (return . toRich) (mkParser (many1 alphaNum))
>>>
decode parser "(ele phant)"
Right [RSlist [RSAtom "ele",RSAtom "phant"]]
addReader :: Char -> Reader a -> SExprParser a c -> SExprParser a c Source #
Add the ability to execute some particular reader macro, as
defined by its initial character and the Parser
which returns
the parsed S-Expression. The Reader
is passed a Parser
which
can be recursively called to parse more S-Expressions, and begins
parsing after the reader character has been removed from the
stream.
>>>
import Text.Parsec (alphaNum, char, many1)
>>>
let vecReader p = (char ']' *> pure SNil) <|> (SCons <$> p <*> vecReader p)
>>>
let parser = addReader '[' vecReader (mkParser (many1 alphaNum))
>>>
decode parser "(an [ele phant])"
Right [SCons (SAtom "an") (SCons (SCons (SAtom "ele") (SCons (SAtom "phant") SNil)) SNil)]
setComment :: Comment -> SExprParser a c -> SExprParser a c Source #
Add the ability to ignore some kind of comment. This gets
factored into whitespace parsing, and it's very important that
the parser supplied be able to fail (as otherwise it will
cause an infinite loop), and also that it not consume any input
(which may require it to be wrapped in try
.)
>>>
import Text.Parsec (alphaNum, anyChar, manyTill, many1, string)
>>>
let comment = string "//" *> manyTill anyChar newline *> pure ()
>>>
let parser = setComment comment (mkParser (many1 alphaNum))
>>>
decode parser "(ele //a comment\n phant)"
Right [SCons (SAtom "ele") (SCons (SAtom "phant") SNil)]
asRich :: SExprParser a (SExpr b) -> SExprParser a (RichSExpr b) Source #
asWellFormed :: SExprParser a (SExpr b) -> SExprParser a (WellFormedSExpr b) Source #
Convert the final output representation from the SExpr
type
to the WellFormedSExpr
type.
>>>
import Text.Parsec (alphaNum, many1)
>>>
let parser = asWellFormed (mkParser (many1 alphaNum))
>>>
decode parser "(ele phant)"
Right [WFSList [WFSAtom "ele",WFSAtom "phant"]]
withQuote :: IsString t => SExprParser t (SExpr t) -> SExprParser t (SExpr t) Source #
Add the ability to understand a quoted S-Expression.
Many Lisps use 'sexpr
as sugar for (quote sexpr)
. This
assumes that the underlying atom type implements the IsString
class, and will create the quote
atom using fromString "quote"
.
>>>
import Text.Parsec (alphaNum, many1)
>>>
let parser = withQuote (mkParser (many1 alphaNum))
>>>
decode parser "'elephant"
Right [SCons (SAtom "quote") (SCons (SAtom "foo") SNil)]
Printer Construction
Specifying a Pretty-Printer
data SExprPrinter atom carrier Source #
A SExprPrinter
value describes how to print a given value as an
s-expression. The carrier
type parameter indicates the value
that will be printed, and the atom
parameter indicates the type
that will represent tokens in an s-expression structure.
The Indent
type is used to determine how to indent subsequent
s-expressions in a list, after printing the head of the list. This only
applies if the entire list cannot be printed within the allowable
maxWidth
; a sub-maxWidth printing will not add newlines or
indentation.
Swing | A (foo bar baz quux) Any |
SwingAfter Int | A (foo bar baz quux) The |
Align | An (foo bar baz quux) |
basicPrint :: (atom -> Text) -> SExprPrinter atom (SExpr atom) Source #
A default SExprPrinter
struct that will always swing subsequent
expressions onto later lines if they're too long, indenting them
by two spaces, and uses a soft maximum width of 80 characters
flatPrint :: (atom -> Text) -> SExprPrinter atom (SExpr atom) Source #
A default SExprPrinter
struct that will always print a SExpr
as a single line.
unconstrainedPrint :: (atom -> Text) -> SExprPrinter atom (SExpr atom) Source #
A default SExprPrinter
struct that will always swing subsequent
expressions onto later lines if they're too long, indenting them by
two spaces, but makes no effort to keep the pretty-printed sources
inside a maximum width. In the case that we want indented printing
but don't care about a "maximum" width, we can print more
efficiently than in other situations.
setFromCarrier :: (c -> b) -> SExprPrinter a b -> SExprPrinter a c Source #
Modify the carrier type of a SExprPrinter
by describing how
to convert the new type back to the previous type. For example,
to pretty-print a well-formed s-expression, we can modify the
SExprPrinter
value as follows:
>>>
let printer = setFromCarrier fromWellFormed (basicPrint id)
>>>
encodeOne printer (WFSList [WFSAtom "ele", WFSAtom "phant"])
"(ele phant)"
setMaxWidth :: Int -> SExprPrinter atom carrier -> SExprPrinter atom carrier Source #
Dictate a maximum width for pretty-printed s-expressions.
>>>
let printer = setMaxWidth 8 (basicPrint id)
>>>
encodeOne printer (L [A "one", A "two", A "three"])
"(one \n two\n three)"
removeMaxWidth :: SExprPrinter atom carrier -> SExprPrinter atom carrier Source #
Allow the serialized s-expression to be arbitrarily wide. This makes all pretty-printing happen on a single line.
>>>
let printer = removeMaxWidth (basicPrint id)
>>>
encodeOne printer (L [A "one", A "two", A "three"])
"(one two three)"
setIndentAmount :: Int -> SExprPrinter atom carrier -> SExprPrinter atom carrier Source #
Set the number of spaces that a subsequent line will be indented after a swing indentation.
>>>
let printer = setMaxWidth 12 (basicPrint id)
>>>
encodeOne printer (L [A "elephant", A "pachyderm"])
"(elephant \n pachyderm)">>>
encodeOne (setIndentAmount 4) (L [A "elephant", A "pachyderm"])
"(elephant \n pachyderm)"
setIndentStrategy :: (SExpr atom -> Indent) -> SExprPrinter atom carrier -> SExprPrinter atom carrier Source #
Dictate how to indent subsequent lines based on the leading
subexpression in an s-expression. For details on how this works,
consult the documentation of the Indent
type.
>>>
let indent (A "def") = SwingAfter 1; indent _ = Swing
>>>
let printer = setIndentStrategy indent (setMaxWidth 8 (basicPrint id))
>>>
encodeOne printer (L [ A "def", L [ A "func", A "arg" ], A "body" ])
"(def (func arg)\n body)">>>
encodeOne printer (L [ A "elephant", A "among", A "pachyderms" ])
"(elephant \n among\n pachyderms)"