brassica-0.2.0: Featureful sound change applier
Safe Haskell: Safe-Inferred
Language: Haskell2010

Brassica.MDF

Description

This module contains types and functions for working with the MDF dictionary format, used by programs such as SIL Toolbox. For more on the MDF format, refer to e.g. Coward & Grimes (2000), Making Dictionaries: A guide to lexicography and the Multi-Dictionary Formatter.

MDF files

newtype MDF v Source #

An MDF (Multi-Dictionary Formatter) file, represented as a list of (field marker, whitespace, field value) tuples. The field marker is stored without its initial backslash; the whitespace after the field marker is also stored, allowing the original MDF file to be recovered exactly. Field values should include all whitespace up to the next marker. All field values are stored as Strings, with the exception of Vernacular fields, which have type v.

For instance, the following MDF file:

\lx kapa
\ps n
\ge parent
\se sakapa
\ge father

Could be stored as follows, with Vernacular values wrapped in Right and all other values in Left:

MDF [ ("lx", " ", Right "kapa\n")
    , ("ps", " ", Left "n\n")
    , ("ge", " ", Left "parent\n")
    , ("se", " ", Right "sakapa\n")
    , ("ge", " ", Left "father")
    ]
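
The claim that the original file can be recovered exactly can be illustrated with a minimal sketch. It uses a local stand-in for the MDF type rather than the real Brassica definition; the field name unMDF and the function renderMDF are hypothetical names introduced only for this illustration.

```haskell
-- Local stand-in for the MDF type described above (not imported from Brassica).
-- The field name 'unMDF' is a hypothetical choice for this sketch.
newtype MDF v = MDF { unMDF :: [(String, String, Either String v)] }

-- Render a stand-in MDF whose vernacular values are plain Strings back to text:
-- a backslash, the marker, the stored whitespace, then the value as stored.
renderMDF :: MDF String -> String
renderMDF = concatMap field . unMDF
  where
    field (marker, ws, value) = '\\' : marker ++ ws ++ either id id value

-- The example file from above, in its stored form.
example :: MDF String
example = MDF
    [ ("lx", " ", Right "kapa\n")
    , ("ps", " ", Left "n\n")
    , ("ge", " ", Left "parent\n")
    , ("se", " ", Right "sakapa\n")
    , ("ge", " ", Left "father")
    ]
```

Because each tuple keeps the marker, the whitespace and the untrimmed value, concatenating the pieces in order gives back the input text character for character.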

Constructors

MDF 

Fields

Instances

Functor MDF Source # 
Defined in Brassica.MDF

Methods

fmap :: (a -> b) -> MDF a -> MDF b #

(<$) :: a -> MDF b -> MDF a #

Show v => Show (MDF v) Source # 
Defined in Brassica.MDF

Methods

showsPrec :: Int -> MDF v -> ShowS #

show :: MDF v -> String #

showList :: [MDF v] -> ShowS #
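
Since only Vernacular values have type v, mapping over an MDF can change only those values. The following sketch shows this with a local stand-in type and a hand-written Functor instance; it is an illustration under that assumption, not the library's source, and mdfFields is a hypothetical helper.

```haskell
import Data.Char (toUpper)

-- Local stand-in for the MDF type (not imported from Brassica).
newtype MDF v = MDF [(String, String, Either String v)]

-- Only Right (vernacular) values are mapped; markers, whitespace and
-- Left (String) values pass through unchanged.
instance Functor MDF where
    fmap f (MDF fs) = MDF [ (m, ws, fmap f v) | (m, ws, v) <- fs ]

-- Hypothetical accessor, used here only to inspect the result.
mdfFields :: MDF v -> [(String, String, Either String v)]
mdfFields (MDF fs) = fs

-- Upper-case the vernacular values; the gloss is left alone.
shouted :: MDF String
shouted = fmap (map toUpper)
    (MDF [("lx", " ", Right "kapa\n"), ("ge", " ", Left "parent\n")])
```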

data MDFLanguage Source #

The designated language of an MDF field.

Instances

Show MDFLanguage Source # 
Defined in Brassica.MDF

Eq MDFLanguage Source # 
Defined in Brassica.MDF

fieldLangs :: Map String MDFLanguage Source #

A Map from the most common field markers to the language of their values.

(Note: This is currently hardcoded in the source code, based on the values in the MDF definitions from SIL Toolbox. There’s probably a more principled way of defining this, but hardcoding should suffice for now.)
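
As a rough illustration of how such a map can decide whether a field value is stored as a vernacular value, the sketch below uses a small hand-written table. The type Lang, its two constructors, the four entries and the function classify are all stand-ins assumed for this example (the entries follow the lx/ps/ge/se example earlier on this page); they are not the contents of fieldLangs. Treating unknown markers as non-vernacular is likewise an assumption.

```haskell
import qualified Data.Map.Strict as Map

-- Illustrative stand-in for MDFLanguage; the real constructors are not listed here.
data Lang = Vernacular | English
    deriving (Eq, Show)

-- A tiny hypothetical table in the spirit of fieldLangs.
sampleLangs :: Map.Map String Lang
sampleLangs = Map.fromList
    [ ("lx", Vernacular)
    , ("se", Vernacular)
    , ("ps", English)
    , ("ge", English)
    ]

-- Wrap a field value in Right if its marker is vernacular, Left otherwise.
-- Markers missing from the table are treated as non-vernacular (an assumption).
classify :: String -> String -> Either String String
classify marker value = case Map.lookup marker sampleLangs of
    Just Vernacular -> Right value
    _               -> Left value
```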

Parsing

parseMDFRaw :: String -> Either (ParseErrorBundle String Void) (MDF String) Source #

Parse an MDF file to an MDF, storing the Vernacular fields as Strings.

parseMDFWithTokenisation :: [String] -> String -> Either (ParseErrorBundle String Void) (MDF [Component PWord]) Source #

Parse an MDF file to an MDF, parsing the Vernacular fields into Components in the process.

Re-export

errorBundlePretty #

Arguments

:: (VisualStream s, TraversableStream s, ShowErrorComponent e) 
=> ParseErrorBundle s e

Parse error bundle to display

-> String

Textual rendition of the bundle

Pretty-print a ParseErrorBundle. All ParseErrors in the bundle will be pretty-printed in order together with the corresponding offending lines by doing a single pass over the input stream. The rendered String always ends with a newline.

Since: megaparsec-7.0.0

Conversion

componentiseMDF :: MDF [Component a] -> [Component a] Source #

Convert an MDF to a list of Components representing the same textual content. Vernacular field values are left as is; everything else is treated as a Separator, so that it is not disturbed by operations such as rule application or rendering to text.

componentiseMDFWordsOnly :: MDF [Component a] -> [Component a] Source #

As with componentiseMDF, but the resulting Components contain the contents of Vernacular fields only; all else is discarded. Note that, as the signature above shows, the function takes the MDF as its only argument, so no separate Separator parameter is supplied.
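
The difference between the two conversions can be sketched as follows. The Component type here is a reduced local stand-in: Separator is named in the text above, while the constructor Word and the function names componentiseSketch and wordsOnlySketch are assumptions made for this illustration, not the library's definitions.

```haskell
-- Reduced stand-in for Brassica's Component type ('Word' is an assumed name).
data Component a = Word a | Separator String
    deriving (Eq, Show)

-- Local stand-in for the MDF type (not imported from Brassica).
newtype MDF v = MDF [(String, String, Either String v)]

-- Keep vernacular components as they are; wrap markers, whitespace and
-- non-vernacular values in Separators so the full text is preserved.
componentiseSketch :: MDF [Component a] -> [Component a]
componentiseSketch (MDF fs) = concatMap field fs
  where
    field (m, ws, Left text)   = [Separator ('\\' : m ++ ws ++ text)]
    field (m, ws, Right comps) = Separator ('\\' : m ++ ws) : comps

-- Keep only the components of vernacular fields; drop everything else.
wordsOnlySketch :: MDF [Component a] -> [Component a]
wordsOnlySketch (MDF fs) = concat [ comps | (_, _, Right comps) <- fs ]

example :: MDF [Component String]
example = MDF
    [ ("lx", " ", Right [Word "kapa", Separator "\n"])
    , ("ge", " ", Left "parent\n")
    ]
```

On this input the first function yields the marker text, the word, and the gloss line as a sequence of Separators around a single Word, while the second yields only the contents of the lx field.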

duplicateEtymologies Source #

Arguments

:: (v -> String)

Function to convert from vernacular field values to strings. Can also be used to preprocess the value of the resulting et fields, e.g. by prepending * or similar.

-> MDF v 
-> MDF v 

Add etymological fields (et and eg) to an MDF by duplicating the values of its lx, se and ge fields. For example, the file:

\lx kapa
\ps n
\ge parent
\se sakapa
\ge father

Would become:

\lx kapa
\ps n
\ge parent
\et kapa
\eg parent
\se sakapa
\ge father
\et sakapa
\eg father

This can be helpful when applying sound changes to an MDF file: the vernacular words can be copied as etymologies, and then the sound changes can be applied leaving the etymologies as is.
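
One plausible reading of this behaviour is sketched below on a local stand-in type. It is not the library's implementation: the names duplicateSketch and mdfFields are hypothetical, and the sketch assumes that every field value ends in a newline and that only the first ge field after each lx or se field triggers a copy.

```haskell
-- Local stand-in for the MDF type (not imported from Brassica).
newtype MDF v = MDF [(String, String, Either String v)]

-- Hypothetical accessor, used here only to inspect results.
mdfFields :: MDF v -> [(String, String, Either String v)]
mdfFields (MDF fs) = fs

-- Remember the latest lx/se value; after the next ge field, emit an et field
-- holding the converted word and an eg field holding the gloss.
duplicateSketch :: (v -> String) -> MDF v -> MDF v
duplicateSketch toStr (MDF fs) = MDF (go Nothing fs)
  where
    go _ [] = []
    go current (f@(marker, _, value) : rest) =
        case (marker, value, current) of
            ("lx", Right w, _) -> f : go (Just w) rest
            ("se", Right w, _) -> f : go (Just w) rest
            ("ge", Left gloss, Just w) ->
                f : ("et", " ", Left (toStr w))
                  : ("eg", " ", Left gloss)
                  : go Nothing rest   -- reset so later glosses are not copied again
            _ -> f : go current rest

-- The example file, with a trailing newline on every value.
example :: MDF String
example = MDF
    [ ("lx", " ", Right "kapa\n")
    , ("ps", " ", Left "n\n")
    , ("ge", " ", Left "parent\n")
    , ("se", " ", Right "sakapa\n")
    , ("ge", " ", Left "father\n")
    ]

-- Prepend '*' to each copied word, as suggested for the conversion function.
withEtymologies :: MDF String
withEtymologies = duplicateSketch ('*' :) example
```

The new et and eg values are stored as plain Strings (Left), so a later fmap over the MDF changes the lx and se values but leaves the copied etymologies untouched.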