hs-conllu-0.1.5: Conllu validating parser and utils.
Copyright: © 2018 bruno cuconato
License: LGPL-3
Maintainer: bruno cuconato <bcclaro+hackage@gmail.com>
Stability: experimental
Portability: non-portable
Safe Haskell: Safe-Inferred
Language: Haskell2010

Conllu.Type

Description

defines types for handling CoNLL-U data.

type and data declarations

Documents and Sentences

type Doc = [Sent]

data Sent

Constructors

Sent 

Fields

Instances

Eq Sent

Defined in Conllu.Type

Methods

(==) :: Sent -> Sent -> Bool

(/=) :: Sent -> Sent -> Bool

Show Sent

Defined in Conllu.Type

Methods

showsPrec :: Int -> Sent -> ShowS

show :: Sent -> String

showList :: [Sent] -> ShowS

type Comment = StringPair

most comments are (key, value) pairs.
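
As an illustration of the (key, value) shape, here is a minimal sketch of how a comment line could be split into such a pair. It assumes that StringPair is a pair of strings; the helpers trim and toComment are hypothetical names and are not part of Conllu.Type.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- Assumption: StringPair is a plain pair of strings.
type StringPair = (String, String)
type Comment = StringPair

-- Hypothetical helper: strip surrounding whitespace.
trim :: String -> String
trim = dropWhileEnd isSpace . dropWhile isSpace

-- Hypothetical helper: split a comment line such as "# sent_id = 1"
-- at the first '='. A comment without '=' keeps its whole text as the
-- key and gets an empty value.
toComment :: String -> Comment
toComment line =
  let body        = dropWhile (== '#') line
      (key, rest) = break (== '=') body
  in (trim key, trim (drop 1 rest))
```

With these definitions, toComment "# sent_id = 1" evaluates to ("sent_id", "1"), and a free-text comment such as "# just a note" becomes ("just a note", "").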

Words

data CW a

represents a word line in a CoNLL-U file. note that we have collapsed some fields together: HEAD and DEPREL have been combined as a relation type Rel accessible by the _rel function; the DEPS field is merely a list of Rel.

a C(oNLL-U)W(ord) may be a simple word, a multi-word token, or an empty node. this is captured by the phantom type (the a in the declaration), which can be parametrized by one of the data types below in order to build functions that only operate on one of these word types (see mkSW on how to do this). see also the _dep function, which only operates on simple words, the only ones that have a DEPREL field.
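
The phantom-type idea can be sketched with a cut-down stand-in for CW. Everything below is an assumption made for illustration: the record has only two hypothetical fields (cwForm, cwDeprel), and mkAny, asSimple and deprelOf are hypothetical names that play roughly the roles of mkAW, mkSW and _dep.

```haskell
-- Phantom tags: no constructors, they exist only at the type level.
data AW
data SW

-- Simplified stand-in for CW. The parameter 'a' appears in no field,
-- so it only labels which kind of word this value is.
data CW a = CW
  { cwForm   :: String
  , cwDeprel :: Maybe String
  } deriving (Eq, Show)

-- Build a word with the most general tag.
mkAny :: String -> Maybe String -> CW AW
mkAny = CW

-- Retag a word as a simple word. The fields are copied unchanged;
-- only the phantom parameter differs.
asSimple :: CW AW -> CW SW
asSimple (CW f d) = CW f d

-- Accepts simple words only; passing a CW AW is rejected by the
-- type checker.
deprelOf :: CW SW -> Maybe String
deprelOf = cwDeprel
```

In this sketch, deprelOf (asSimple (mkAny "dog" (Just "nsubj"))) type-checks, while deprelOf (mkAny "dog" Nothing) does not, because the tag AW does not match SW.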

Constructors

CW 

Fields

Instances

Eq (CW a)

Defined in Conllu.Type

Methods

(==) :: CW a -> CW a -> Bool

(/=) :: CW a -> CW a -> Bool

Ord (CW a)

Defined in Conllu.Type

Methods

compare :: CW a -> CW a -> Ordering

(<) :: CW a -> CW a -> Bool

(<=) :: CW a -> CW a -> Bool

(>) :: CW a -> CW a -> Bool

(>=) :: CW a -> CW a -> Bool

max :: CW a -> CW a -> CW a

min :: CW a -> CW a -> CW a

Show (CW a)

Defined in Conllu.Type

Methods

showsPrec :: Int -> CW a -> ShowS

show :: CW a -> String

showList :: [CW a] -> ShowS

Word types

data AW

phantom type for any kind of word.

data SW

phantom type for a simple word.

data MT

phantom type for multiword tokens. do note that in MWTs only the ID, FORM and MISC fields may be non-empty.

data EN

phantom type for an empty node.

Word Fields

data ID

Constructors

SID Index

word ID is an integer

MID Index Index

multi-word token ID is a range

EID Index Index

empty node ID is a decimal
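
To make the three ID shapes concrete, the following sketch mirrors the constructors listed above and renders each one the way it appears in the first column of a CoNLL-U line. It assumes that the two indices of MID are the start and end of the range, and that the two indices of EID are the integer and decimal parts; showID is a hypothetical helper, not a function of this module.

```haskell
type Index = Int

-- Local copy of the constructors documented above.
data ID
  = SID Index        -- simple word, e.g. 3
  | MID Index Index  -- multi-word token range, e.g. 1-2
  | EID Index Index  -- empty node, e.g. 5.1
  deriving (Eq, Show)

-- Hypothetical helper: render an ID as CoNLL-U text.
showID :: ID -> String
showID (SID i)   = show i
showID (MID s e) = show s ++ "-" ++ show e
showID (EID i d) = show i ++ "." ++ show d
```

Under these assumptions, map showID [SID 3, MID 1 2, EID 5 1] evaluates to ["3", "1-2", "5.1"].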

Instances

Eq ID

Defined in Conllu.Type

Methods

(==) :: ID -> ID -> Bool

(/=) :: ID -> ID -> Bool

Ord ID

Defined in Conllu.Type

Methods

compare :: ID -> ID -> Ordering

(<) :: ID -> ID -> Bool

(<=) :: ID -> ID -> Bool

(>) :: ID -> ID -> Bool

(>=) :: ID -> ID -> Bool

max :: ID -> ID -> ID

min :: ID -> ID -> ID

Show ID

Defined in Conllu.Type

Methods

showsPrec :: Int -> ID -> ShowS

show :: ID -> String

showList :: [ID] -> ShowS

type FEATS = [Feat]

type HEAD = ID

type DEPS = [Rel]

data Feat

feature representation

Constructors

Feat 

Fields

Instances

Eq Feat

Defined in Conllu.Type

Methods

(==) :: Feat -> Feat -> Bool

(/=) :: Feat -> Feat -> Bool

Show Feat

Defined in Conllu.Type

Methods

showsPrec :: Int -> Feat -> ShowS

show :: Feat -> String

showList :: [Feat] -> ShowS

data Rel

dependency relation representation.

Constructors

Rel 

Fields

Instances

Eq Rel

Defined in Conllu.Type

Methods

(==) :: Rel -> Rel -> Bool

(/=) :: Rel -> Rel -> Bool

Show Rel

Defined in Conllu.Type

Methods

showsPrec :: Int -> Rel -> ShowS

show :: Rel -> String

showList :: [Rel] -> ShowS

type Index = Int

type IxSep = Char

ID separator in meta words

accessor functions

_dep :: CW SW -> Maybe EP

get DEPREL main value, if it exists.

depIs :: EP -> CW SW -> Bool

check if DEP is the one provided.
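
A rough sketch of how accessors of this shape can be used to filter words by relation follows. All types here are simplified stand-ins and should be read as assumptions: EP is reduced to three sample values, Rel and CW carry only hypothetical fields (relHead, relDeprel, cwForm, cwRel), and dep, hasDep and subjects are hypothetical names modelled on the signatures of _dep and depIs.

```haskell
-- Stand-in for the relation tag type; three sample values only.
data EP = NSUBJ | OBJ | ROOT deriving (Eq, Show)

-- Phantom tag for simple words.
data SW

-- Simplified relation: head index plus main relation.
data Rel = Rel { relHead :: Int, relDeprel :: EP } deriving (Eq, Show)

-- Simplified word with a phantom parameter.
data CW a = CW { cwForm :: String, cwRel :: Maybe Rel } deriving (Eq, Show)

-- Same shape as _dep: the main relation, if the word has one.
dep :: CW SW -> Maybe EP
dep = fmap relDeprel . cwRel

-- Same shape as depIs: does the word carry the given relation?
hasDep :: EP -> CW SW -> Bool
hasDep ep w = dep w == Just ep

-- Forms of all words attached as nominal subjects.
subjects :: [CW SW] -> [String]
subjects = map cwForm . filter (hasDep NSUBJ)
```

Putting the relation first in the argument order, as depIs does, lets the partially applied predicate be passed directly to filter.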

constructor functions

mkDEP :: String -> EP

read a main DEPREL (no subtype).

mkUPOS :: String -> POS

read a UPOS tag.

mkAW :: ID -> FORM -> LEMMA -> UPOS -> XPOS -> FEATS -> Maybe Rel -> DEPS -> MISC -> CW AW

make a word from its fields. by default the word has the phantom type AW (any kind of word).

mkSW :: CW AW -> CW SW

coerce a word to a simple word.