hs-conllu-0.1.5: Conllu validating parser and utils.
Copyright: © 2018 bruno cuconato
License: LGPL-3
Maintainer: bruno cuconato <bcclaro+hackage@gmail.com>
Stability: experimental
Portability: non-portable
Safe Haskell: Safe-Inferred
Language: Haskell2010

Conllu.Diff

Description

Build a diff of CoNLL-U elements (documents, sentences, words). The diff* functions return the differing word pairs for further processing; the print* functions return, for each such word pair, the pairs of field values that differ. The module expects paired sentences as input, and a default pairing function is provided.

This module is useful for visualizing or debugging the processing of CoNLL-U corpora. Make sure that the sentences are well-paired; otherwise it is, as always, garbage in, garbage out.

type synonyms

type FDiff = StringPair

CoNLL-U field diff: the two differing values of one field.

type WDiff a = (CW a, CW a)

A pair of differing words, one from each sentence.

type SDiff a = [WDiff a]

List of the differing word pairs in a sentence.

type DDiff a = [SDiff a]

List, per sentence, of the differing word pairs.

diffing functions

diffW :: WDiff a -> Bool

True if any pair of corresponding word fields is mismatched.

diffWs :: [CW a] -> [CW a] -> [WDiff a]

Keeps only the word pairs that differ.
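
As a rough illustration of the filtering idea, the sketch below uses only base and does not depend on hs-conllu. The Tok record (three fields instead of the full CoNLL-U set), isDiff and diffWords are hypothetical stand-ins, and pairing the words positionally with zip is an assumption rather than a statement about how diffWs is implemented.

```haskell
module Main where

-- Hypothetical stand-in for a CoNLL-U word: only three of its fields.
data Tok = Tok { form :: String, lemma :: String, upos :: String }
  deriving (Eq, Show)

type TokDiff = (Tok, Tok)

-- True when the two words of a pair differ in at least one field.
isDiff :: TokDiff -> Bool
isDiff (a, b) = a /= b

-- Assumption: words are paired by position; only mismatched pairs are kept.
diffWords :: [Tok] -> [Tok] -> [TokDiff]
diffWords ws1 ws2 = filter isDiff (zip ws1 ws2)

main :: IO ()
main = mapM_ print (diffWords gold predicted)
  where
    gold      = [Tok "dogs" "dog" "NOUN", Tok "ran" "run" "VERB"]
    predicted = [Tok "dogs" "dog" "NOUN", Tok "ran" "ran" "NOUN"]
```

With positional pairing, a single inserted or dropped token shifts every later pair, so in this sketch the two word lists need to have the same tokenization.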

diffS :: (Sent, Sent) -> SDiff AW

Diffs the words of a sentence pair.

diffSs :: [(Sent, Sent)] -> DDiff AW

Diffs each sentence pair in the list.

auxiliary functions

showM :: Show a => Maybe a -> String

Shows a word field.
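
A function with this signature could be sketched as follows. The name showField is hypothetical, and rendering a missing field as "_" is an assumption borrowed from the CoNLL-U convention for empty fields, not something this page confirms about showM.

```haskell
module Main where

-- Assumption: a missing field is rendered as "_", as in CoNLL-U files.
showField :: Show a => Maybe a -> String
showField = maybe "_" show

main :: IO ()
main = do
  putStrLn (showField (Just (3 :: Int)))
  putStrLn (showField (Nothing :: Maybe Int))
```

Because the sketch goes through show, a String field comes out with surrounding quotes.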

pairing functions

pairSentsBy :: (Sent -> Sent -> Ordering) -> [Sent] -> [Sent] -> [(Sent, Sent)]

Pairs sentences according to some ordering of Sent.
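
One plausible reading of "pairing by an ordering" is a merge-style walk over two sorted lists, sketched below with base only. The generic pairBy is a hypothetical stand-in; sorting both inputs first and silently dropping unmatched elements are assumptions, not documented behaviour of pairSentsBy.

```haskell
module Main where

import Data.List (sortBy)
import Data.Ord (comparing)

-- Sort both lists with the comparison, then walk them together:
-- EQ yields a pair, otherwise the smaller head is skipped.
pairBy :: (a -> a -> Ordering) -> [a] -> [a] -> [(a, a)]
pairBy cmp xs ys = go (sortBy cmp xs) (sortBy cmp ys)
  where
    go (a:as) (b:bs) = case cmp a b of
      EQ -> (a, b) : go as bs
      LT -> go as (b:bs)   -- a has no partner on the right
      GT -> go (a:as) bs   -- b has no partner on the left
    go _ _ = []

main :: IO ()
main = mapM_ print (pairBy (comparing fst) left right)
  where
    left, right :: [(Int, String)]
    left  = [(2, "b"), (1, "a")]
    right = [(1, "x"), (3, "z"), (2, "y")]
```

In this sketch an element with no counterpart on the other side is skipped rather than reported.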

sentId :: Sent -> Maybe Index

Tries to find an index in a sentence's metadata by looking for 'sent_id = n'.
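
The lookup can be sketched with base only. Here the metadata is modelled as a list of key/value string pairs and the index as an Int; both representations, and the names Comment, trim and sentIndex, are assumptions made for illustration rather than the library's definitions.

```haskell
module Main where

import Data.Char (isSpace)
import Text.Read (readMaybe)

-- Hypothetical metadata entry: "# sent_id = 12" becomes ("sent_id ", " 12").
type Comment = (String, String)

-- Strip leading and trailing whitespace.
trim :: String -> String
trim = f . f
  where f = reverse . dropWhile isSpace

-- Find the sent_id entry and read its value as a number, if possible.
sentIndex :: [Comment] -> Maybe Int
sentIndex meta =
  lookup "sent_id" [(trim k, v) | (k, v) <- meta] >>= readMaybe . trim

main :: IO ()
main = do
  print (sentIndex [("text ", " Dogs ran."), ("sent_id ", " 12")])
  print (sentIndex [("sent_id", "train-s1")])
```

In this sketch a non-numeric identifier such as train-s1 yields Nothing, so corpora with such identifiers would need a different comparison when pairing.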

pairSents :: [Sent] -> [Sent] -> [(Sent, Sent)]

Pairs sentences by their sent_id, found in their metadata.

printing functions

printFieldDiffs :: WDiff a -> [Maybe StringPair]

List with one entry per field in a pair of words: the pair of differing values, or Nothing when the field matches.

printWDiff :: WDiff a -> [StringPair]

List of the differing fields in a pair of words.
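
The relation between the two signatures above can be sketched as a per-field comparison followed by catMaybes. The Tok record and the names fieldDiffs and wordDiff are hypothetical stand-ins using only base; the real functions work on CW values with the full set of CoNLL-U fields.

```haskell
module Main where

import Data.Maybe (catMaybes)

type StringPair = (String, String)

-- Hypothetical stand-in for a CoNLL-U word: only three of its fields.
data Tok = Tok { form :: String, lemma :: String, upos :: String }
  deriving (Eq, Show)

-- One entry per field: Just the two values when they differ, else Nothing.
fieldDiffs :: (Tok, Tok) -> [Maybe StringPair]
fieldDiffs (a, b) = [cmpField form, cmpField lemma, cmpField upos]
  where
    cmpField f
      | f a == f b = Nothing
      | otherwise  = Just (f a, f b)

-- Keep only the fields that actually differ.
wordDiff :: (Tok, Tok) -> [StringPair]
wordDiff = catMaybes . fieldDiffs

main :: IO ()
main = print (wordDiff (Tok "ran" "run" "VERB", Tok "ran" "ran" "NOUN"))
```

Keeping the Maybe version around preserves field positions, which helps when the caller wants to know which field a given difference came from.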

printSDiff :: SDiff a -> [[StringPair]]

List of the differing fields of each differing word in a sentence.

printDDiff :: DDiff a -> [[[StringPair]]]

List, per sentence, of the differing fields of each differing word.