s-cargot-0.1.3.0: A flexible, extensible s-expression library.

Safe HaskellNone
LanguageHaskell2010

Data.SCargot

Contents

Synopsis

SCargot Basics

The S-Cargot library is a library for parsing and emitting s-expressions, designed to be as flexible as possible. Despite some efforts at standardization, s-expressions are a general approach to describing a data format that can very often differ in subtle, incompatible ways: the s-expressions understood by Common Lisp are different from the s-expressions understood by Scheme, and even the different revisions of the Scheme language understand s-expressions in a slightly different way. To accomodate this, the S-Cargot library provides a toolbox for defining variations on s-expressions, complete with the ability to select various comment syntaxes, reader macros, and atom types.

If all you want is to read some s-expressions and don't care about the edge cases of the format, or all you want is a new configuration format, try the Data.SCargot.Language.Basic or Data.SCargot.Language.HaskLike modules, which define an s-expression language whose atoms are plain strings and Haskell literals, respectively.

The S-Cargot library works by specifying values which contain all the information needed to either parse or print an s-expression. The actual s-expression structure is parsed as a structure of cons cells as represented by the SExpr type, but can alternately be exposed as the isomorphic RichSExpr type or the less expressive but easier-to-work-with WellFormedSExpr type. Modules devoted to each representation type (in Data.SCargot.Repr.Basic, Data.SCargot.Repr.Rich, and Data.SCargot.Repr.WellFormed) provide helper functions, lenses, and pattern synonyms to make creating and processing these values easier.

The details of how to parse a given structure are represented by building up a SExprParser value, which is defined in Data.SCargot.Parse and re-exported here. A minimal SExprParser defines only how to parse the atoms of the language; helper functions can define comment syntaxes, reader macros, and transformations over the parsed structure.

The details of how to print a given structure are represented by building up a SExprPrinter value, which is defined in Data.SCargot.Print and re-exported here. A minimal SExprPrinter defines only how to print the atoms of the language; helper functions help with the layout of the pretty-printed s-expression in terms of how to indent the surrounding expression.

Other helper modules define useful primitives for building up s-expression languages: the Data.SCargot.Common module provides parsers for common literals, while the Data.SCargot.Comments module provides parsers for comment syntaxes borrowed from various other languages.

Parsing and Printing

decode :: SExprParser atom carrier -> Text -> Either String [carrier] Source #

Decode several S-expressions according to a given SExprParser. This will return a list of every S-expression that appears at the top-level of the document.

decodeOne :: SExprParser atom carrier -> Text -> Either String carrier Source #

Decode a single S-expression. If any trailing input is left after the S-expression (ignoring comments or whitespace) then this will fail: for those cases, use decode, which returns a list of all the S-expressions found at the top level.

encode :: SExprPrinter atom carrier -> [carrier] -> Text Source #

Turn a list of s-expressions into a single string according to a given SExprPrinter.

encodeOne :: SExprPrinter atom carrier -> carrier -> Text Source #

Turn a single s-expression into a string according to a given SExprPrinter.

Parser Construction

Specifying a Parser

data SExprParser atom carrier Source #

A SExprParser describes a parser for a particular value that has been serialized as an s-expression. The atom parameter corresponds to a Haskell type used to represent the atoms, and the carrier parameter corresponds to the parsed S-Expression structure.

type Reader atom = Parser (SExpr atom) -> Parser (SExpr atom) Source #

A Reader represents a reader macro: it takes a parser for the S-Expression type and performs as much or as little parsing as it would like, and then returns an S-expression.

type Comment = Parser () Source #

A Comment represents any kind of skippable comment. This parser must be able to fail if a comment is not being recognized, and it must not consume any input in case of failure.

mkParser :: Parser atom -> SExprParser atom (SExpr atom) Source #

Create a basic SExprParser when given a parser for an atom type.

>>> import Text.Parsec (alphaNum, many1)
>>> let parser = mkParser (many1 alphaNum)
>>> decode parser "(ele phant)"
Right [SCons (SAtom "ele") (SCons (SAtom "phant") SNil)]

setCarrier :: (b -> Either String c) -> SExprParser a b -> SExprParser a c Source #

Modify the carrier type for a SExprParser. This is used internally to convert between various SExpr representations, but could also be used externally to add an extra conversion layer onto a SExprParser.

>>> import Text.Parsec (alphaNum, many1)
>>> import Data.SCargot.Repr (toRich)
>>> let parser = setCarrier (return . toRich) (mkParser (many1 alphaNum))
>>> decode parser "(ele phant)"
Right [RSlist [RSAtom "ele",RSAtom "phant"]]

addReader :: Char -> Reader a -> SExprParser a c -> SExprParser a c Source #

Add the ability to execute some particular reader macro, as defined by its initial character and the Parser which returns the parsed S-Expression. The Reader is passed a Parser which can be recursively called to parse more S-Expressions, and begins parsing after the reader character has been removed from the stream.

>>> import Text.Parsec (alphaNum, char, many1)
>>> let vecReader p = (char ']' *> pure SNil) <|> (SCons <$> p <*> vecReader p)
>>> let parser = addReader '[' vecReader (mkParser (many1 alphaNum))
>>> decode parser "(an [ele phant])"
Right [SCons (SAtom "an") (SCons (SCons (SAtom "ele") (SCons (SAtom "phant") SNil)) SNil)]

setComment :: Comment -> SExprParser a c -> SExprParser a c Source #

Add the ability to ignore some kind of comment. This gets factored into whitespace parsing, and it's very important that the parser supplied be able to fail (as otherwise it will cause an infinite loop), and also that it not consume any input (which may require it to be wrapped in try.)

>>> import Text.Parsec (alphaNum, anyChar, manyTill, many1, string)
>>> let comment = string "//" *> manyTill anyChar newline *> pure ()
>>> let parser = setComment comment (mkParser (many1 alphaNum))
>>> decode parser "(ele //a comment\n  phant)"
Right [SCons (SAtom "ele") (SCons (SAtom "phant") SNil)]

asRich :: SExprParser a (SExpr b) -> SExprParser a (RichSExpr b) Source #

Convert the final output representation from the SExpr type to the RichSExpr type.

>>> import Text.Parsec (alphaNum, many1)
>>> let parser = asRich (mkParser (many1 alphaNum))
>>> decode parser "(ele phant)"
Right [RSlist [RSAtom "ele",RSAtom "phant"]]

asWellFormed :: SExprParser a (SExpr b) -> SExprParser a (WellFormedSExpr b) Source #

Convert the final output representation from the SExpr type to the WellFormedSExpr type.

>>> import Text.Parsec (alphaNum, many1)
>>> let parser = asWellFormed (mkParser (many1 alphaNum))
>>> decode parser "(ele phant)"
Right [WFSList [WFSAtom "ele",WFSAtom "phant"]]

withQuote :: IsString t => SExprParser t (SExpr t) -> SExprParser t (SExpr t) Source #

Add the ability to understand a quoted S-Expression. Many Lisps use 'sexpr as sugar for (quote sexpr). This assumes that the underlying atom type implements the IsString class, and will create the quote atom using fromString "quote".

>>> import Text.Parsec (alphaNum, many1)
>>> let parser = withQuote (mkParser (many1 alphaNum))
>>> decode parser "'elephant"
Right [SCons (SAtom "quote") (SCons (SAtom "foo") SNil)]

Printer Construction

Specifying a Pretty-Printer

data SExprPrinter atom carrier Source #

A SExprPrinter value describes how to print a given value as an s-expression. The carrier type parameter indicates the value that will be printed, and the atom parameter indicates the type that will represent tokens in an s-expression structure.

data Indent Source #

The Indent type is used to determine how to indent subsequent s-expressions in a list, after printing the head of the list.

Constructors

Swing

A Swing indent will indent subsequent exprs some fixed amount more than the current line.

(foo
  bar
  baz
  quux)
SwingAfter Int

A SwingAfter n indent will try to print the first n expressions after the head on the same line as the head, and all after will be swung. SwingAfter 0 is equivalent to Swing.

(foo bar
  baz
  quux)
Align

An Align indent will print the first expression after the head on the same line, and subsequent expressions will be aligned with that one.

(foo bar
     baz
     quux)

Instances

basicPrint :: (atom -> Text) -> SExprPrinter atom (SExpr atom) Source #

A default SExprPrinter struct that will always swing subsequent expressions onto later lines if they're too long, indenting them by two spaces, and uses a soft maximum width of 80 characters

flatPrint :: (atom -> Text) -> SExprPrinter atom (SExpr atom) Source #

A default SExprPrinter struct that will always print a SExpr as a single line.

unconstrainedPrint :: (atom -> Text) -> SExprPrinter atom (SExpr atom) Source #

A default SExprPrinter struct that will always swing subsequent expressions onto later lines if they're too long, indenting them by two spaces, but makes no effort to keep the pretty-printed sources inside a maximum width. In the case that we want indented printing but don't care about a "maximum" width, we can print more efficiently than in other situations.

setFromCarrier :: (c -> b) -> SExprPrinter a b -> SExprPrinter a c Source #

Modify the carrier type of a SExprPrinter by describing how to convert the new type back to the previous type. For example, to pretty-print a well-formed s-expression, we can modify the SExprPrinter value as follows:

>>> let printer = setFromCarrier fromWellFormed (basicPrint id)
>>> encodeOne printer (WFSList [WFSAtom "ele", WFSAtom "phant"])
"(ele phant)"

setMaxWidth :: Int -> SExprPrinter atom carrier -> SExprPrinter atom carrier Source #

Dictate a maximum width for pretty-printed s-expressions.

>>> let printer = setMaxWidth 8 (basicPrint id)
>>> encodeOne printer (L [A "one", A "two", A "three"])
"(one \n  two\n  three)"

removeMaxWidth :: SExprPrinter atom carrier -> SExprPrinter atom carrier Source #

Allow the serialized s-expression to be arbitrarily wide. This makes all pretty-printing happen on a single line.

>>> let printer = removeMaxWidth (basicPrint id)
>>> encodeOne printer (L [A "one", A "two", A "three"])
"(one two three)"

setIndentAmount :: Int -> SExprPrinter atom carrier -> SExprPrinter atom carrier Source #

Set the number of spaces that a subsequent line will be indented after a swing indentation.

>>> let printer = setMaxWidth 12 (basicPrint id)
>>> encodeOne printer (L [A "elephant", A "pachyderm"])
"(elephant \n  pachyderm)"
>>> encodeOne (setIndentAmount 4) (L [A "elephant", A "pachyderm"])
"(elephant \n    pachyderm)"

setIndentStrategy :: (SExpr atom -> Indent) -> SExprPrinter atom carrier -> SExprPrinter atom carrier Source #

Dictate how to indent subsequent lines based on the leading subexpression in an s-expression. For details on how this works, consult the documentation of the Indent type.

>>> let indent (A "def") = SwingAfter 1; indent _ = Swing
>>> let printer = setIndentStrategy indent (setMaxWidth 8 (basicPrint id))
>>> encodeOne printer (L [ A "def", L [ A "func", A "arg" ], A "body" ])
"(def (func arg)\n  body)"
>>> encodeOne printer (L [ A "elephant", A "among", A "pachyderms" ])
"(elephant \n  among\n  pachyderms)"