concraft-0.11.0: Morphological disambiguation based on constrained CRFs

Safe Haskell: None
Language: Haskell98

NLP.Concraft.DAGSeg

Description

Top-level module adapted to DAGs, guessing and disambiguation.

Model

data Concraft t

Concraft data.

Constructors

Concraft 

saveModel :: (Ord t, Binary t) => FilePath -> Concraft t -> IO ()

Save model in a file. Data is compressed using the gzip format.

loadModel

Arguments

:: (Ord t, Binary t)
=> (Tagset -> t -> Tag)    -- Guesser simplification function
-> (Tagset -> t -> Tag)    -- Disamb simplification function
-> FilePath
-> IO (Concraft t)

Load model from a file.

Annotation

type Anno a b = DAG () (Map a b)

DAG annotation: assigns values of type b to labels of type a for each edge in the graph.

Best paths

findOptimalPaths :: Ord t => Anno t Double -> [[(EdgeID, Set t)]]

Find all optimal paths in the given annotation. Optimal paths are those which go through tags with the assigned probability 1. For a given chosen edge, all the tags with probability 1 are selected.
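The per-edge selection described above can be illustrated with a small, self-contained sketch. It is not the library's implementation: the helper name `bestTags` and the tolerance `eps` are assumptions made here for illustration, and an edge is modelled simply as a map from tags to probabilities.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- | Tags of a single edge whose probability is (numerically) 1.
-- `eps` is a hypothetical tolerance for floating-point noise; it is an
-- assumption of this sketch, not a documented parameter of the library.
bestTags :: Double -> Map.Map t Double -> Set.Set t
bestTags eps = Map.keysSet . Map.filter (\p -> p >= 1 - eps)
```

For example, with an edge labelled `subst` and `verb` at probability 1 and `adj` at 0.25, `bestTags 1e-9` keeps `subst` and `verb`.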

disambPath :: Ord t => [(EdgeID, Set t)] -> Anno t Double -> Anno t Bool

Mark the given path with disamb markers in the given annotation and produce a new disamb annotation.
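The two functions in this section are designed to be combined: one finds the optimal paths, the other marks a chosen path. The following is a minimal sketch based only on the signatures shown here. The helper name `markFirstBest` is hypothetical, and picking the first optimal path is an arbitrary choice made for illustration; it requires the concraft package to compile.

```haskell
import NLP.Concraft.DAGSeg (Anno, disambPath, findOptimalPaths)

-- | Mark the first optimal path, if there is one, in a probability
-- annotation. Returns Nothing when no optimal path is found.
-- (`markFirstBest` is a hypothetical helper, not part of the library.)
markFirstBest :: Ord t => Anno t Double -> Maybe (Anno t Bool)
markFirstBest probs =
  case findOptimalPaths probs of
    path : _ -> Just (disambPath path probs)
    []       -> Nothing
```

Since optimal paths are defined through tags with probability 1, the input annotation is assumed to contain such tags; otherwise the result is presumably `Nothing`.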

Marginals

guessMarginals :: (Word w, Ord t) => Guesser t Tag -> Sent w t -> Anno t Double

Determine marginal probabilities corresponding to individual tags w.r.t. the guessing model.

disambMarginals :: (Word w, Ord t) => Disamb t -> Sent w t -> Anno t Double

Determine marginal probabilities corresponding to individual tags w.r.t. the disambiguation model.

disambProbs :: (Word w, Ord t) => ProbType -> Disamb t -> Sent w t -> Anno t Double

Determine probabilities, of the kind selected by the ProbType argument, corresponding to individual tags w.r.t. the disambiguation model.

Tagging

guessSent :: (Word w, Ord t) => Int -> Guesser t Tag -> Sent w t -> Sent w t

Extend the OOV words with new, guessed interpretations.

Determine marginal probabilities corresponding to individual tags w.r.t. the guessing model and, afterwards, trim the sentence to keep only the k most probable labels for each OOV edge. Note that, for OOV words, the entire set of default tags is considered.
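The trimming step can be pictured with a self-contained sketch that keeps the k most probable labels of a single edge. This is an illustration under simplifying assumptions, not the library's code: `trimTo` is a hypothetical helper, and an edge is modelled as a plain map from labels to probabilities.

```haskell
import           Data.List (sortBy)
import qualified Data.Map.Strict as Map
import           Data.Ord (Down (..), comparing)

-- | Keep only the k most probable labels of one edge.
-- Labels are sorted by decreasing probability; ties keep the map's
-- key order because `sortBy` is stable.
trimTo :: Ord t => Int -> Map.Map t Double -> Map.Map t Double
trimTo k =
  Map.fromList . take k . sortBy (comparing (Down . snd)) . Map.toList
```

With k at least as large as the number of labels the map is returned unchanged, and with k = 0 the result is empty.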

guess :: (Word w, Ord t) => Int -> Guesser t Tag -> Sent w t -> Anno t Double

Perform guessing, trimming, and finally determine marginal probabilities corresponding to individual tags w.r.t. the guessing model.

tag :: (Word w, Ord t) => Int -> Concraft t -> Sent w t -> Anno t Double

Perform guessing, trimming, and finally determine marginal probabilities corresponding to individual tags w.r.t. the disambiguation model.

Pruning

prune :: Double -> Concraft t -> Concraft t

Prune the disambiguation model: discard model features with absolute values (in log-domain) lower than the given threshold.
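The pruning criterion can be sketched on a toy feature table. This is only an illustration of the thresholding rule stated above: the real model is not a plain map, and `pruneWeights` and the feature keys are hypothetical names introduced here.

```haskell
import qualified Data.Map.Strict as Map

-- | Discard features whose weight has an absolute value lower than
-- the given threshold; features at or above the threshold are kept.
-- Weights are assumed to be log-domain values, as in the description.
pruneWeights :: Double -> Map.Map f Double -> Map.Map f Double
pruneWeights threshold = Map.filter (\w -> abs w >= threshold)
```

For instance, with weights 0.5, -0.01 and -2.0 and a threshold of 0.1, only the feature with weight -0.01 is dropped.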