superdoc- Additional documentation markup and Unicode support

Safe HaskellSafe




This module provides conversions between various different markup formats. In principle, it provides two different conversions:

  1. Converting the Superdoc markup language to HTML.
  2. Converting Unicode streams to HTML.

Conversion 1 is exposed by the function markup. This is used by the post-Haddock hook.

Conversion 2 is provided by the to_html function. It is used by the post-HsColour hook.


Format definitions

The Superdoc markup language provides tags for superscripts, subscripts, and more. The following markup is recognized:


  • [super text]: superscript.
  • [sup text]: superscript. A synonym for [super text].
  • [sub text]: subscript.
  • [exp text]: exponential function.
  • [bold text]: bold.
  • [center text]: centered.
  • [nobr text]: inhibit line breaks.
  • [image filename]: insert image.
  • [uni nnnn]: Unicode character.
  • [literal text]: literal text. Brackets '[' and ']' may only occur in nested pairs.


type Filter a = String -> (String, a) Source #

A filter is basically a function from strings to strings. Ideally a filter is lazy, so that the input string is consumed incrementally; however, this is not strictly necessary. A filter may also return another result in addition to a string.

filter_id :: Filter () Source #

The identity filter.

filter_handles :: Filter a -> Handle -> Handle -> IO a Source #

Run a filter by reading from one handle and writing to another. The handles are set to binary mode.

filter_file :: Filter a -> FilePath -> FilePath -> IO a Source #

Run a filter by reading from a file and writing to another file. We do not assume that the two files are necessarily distinct, so special care is taken not to overwrite the output file until after the input file has been read.

filter_files :: Filter a -> [FilePath] -> IO [a] Source #

Run a filter on a number of files, overwriting each file in place.

Markup parser

This section defines a simple grammar and parser for the Superdoc markup language, translating it to HTML. This is used to post-process Haddock's output.

Top-level function

markup :: Filter (Set FilePath) Source #

The top-level parser for Superdoc markup, expressed as a filter. In addition to producing HTML output, this filter also returns the set of all image files that were linked to.

Grammar definition

markup_top :: ReadP (String, Set FilePath) Source #

Top-level parser for Superdoc markup.

top ::= (other | tag | char)*.

lift :: ReadP String -> ReadP (String, Set FilePath) Source #

Lift a parser returning a string to a parser returning a string and an empty set.

markup_nested :: ReadP (String, Set FilePath) Source #

Like markup, but only permit brackets "[" and "]" to occur in nested pairs.

nested ::= (other | tag | bracketed | underscore)*.

markup_bracketed :: ReadP (String, Set FilePath) Source #

Parse bracketed text.

bracketed ::= "[" nested "]".

markup_nonbracket :: ReadP String Source #

Parse any single character except '[' and ']'.

nonbracket ::= any character besides '[', ']'.

markup_other :: ReadP String Source #

Parse any sequence of non-special characters: anything but '[' and ']'.

other ::= (any character besides '[', ']')+.

markup_char :: ReadP String Source #

Parse any one character.

char ::= any character.

markup_tag :: ReadP (String, Set FilePath) Source #

Parse a tag.

tag ::= "[" keyword body "]".

markup_keyword :: ReadP String Source #

Parse a keyword.

keyword ::= "sup" | "super" | "sub" | "exp" | "bold" | "center" | "nobr" | "image" | "uni" | "literal".

markup_literal :: ReadP String Source #

Parse any text with balanced brackets.

literal ::= (nonbracket | bracketed_literal)*.

markup_bracketed_literal :: ReadP String Source #

Parse any bracketed text with balanced brackets.

bracketed_literal ::= "[" literal "]".

markup_body :: String -> ReadP (String, Set FilePath) Source #

Parse a tag's body. What to do depends on the tag name.

body ::= nested (for keyword = sup, super, sub, exp, bold, center, nobr),

body ::= filename (for keyword = image),

body ::= digit+ (for keyword = uni).

body ::= literal (for keyword = literal).

Unicode to HTML conversion

to_html :: [Token] -> String Source #

Convert a tokenized Unicode stream into HTML entities. Non-ASCII characters are represented as HTML entities of the form &#nnnn;. Any invalid characters are simply copied to the output.