nerf-0.5.4.1: Nerf, a named entity recognition tool based on linear-chain CRFs

Safe Haskell: None
Language: Haskell2010

NLP.Nerf.Schema

Description

Observation schema blocks for Nerf.

Types

type Ox a = Ox Word Text a

The Ox monad specialized to the Word token type and Text observations.

type Schema a = Vector Word -> Int -> Ox a

A schema is a block of the Ox computation performed within the context of the sentence and the absolute sentence position.

void :: a -> Schema a

A dummy schema block.

sequenceS_ :: [Vector Word -> a -> Ox b] -> Vector Word -> a -> Ox ()

Sequence the list of schemas (or blocks) and discard individual values.

Usage

schematize :: Schema a -> [Word] -> Sent Ob

Use the schema to extract observations from the sentence.
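To make the idea concrete, here is a minimal sketch that models a schema with plain lists instead of the Ox monad: a schema returns the observations for one position, and a simplified schematize applies it at every position of the sentence. All names in it (SimpleSchema, at, orthSketch, prevOrthSketch, sequenceSketch, schematizeSketch) are hypothetical and not part of Nerf; the real functions work on Vector Word and produce a Sent Ob.

```haskell
-- Simplified stand-in for 'Schema' (an assumption of this sketch): given a
-- sentence and an absolute position, return the observations for that position.
type SimpleSchema = [String] -> Int -> [String]

-- Safe indexing: Nothing when the position lies outside the sentence.
at :: [a] -> Int -> Maybe a
at xs k
  | k < 0     = Nothing
  | otherwise = case drop k xs of
      (x : _) -> Just x
      []      -> Nothing

-- Hypothetical schema: the word form at the current position.
orthSketch :: SimpleSchema
orthSketch sent k = maybe [] (\w -> ["orth=" ++ w]) (sent `at` k)

-- Hypothetical schema: the word form at the previous position.
prevOrthSketch :: SimpleSchema
prevOrthSketch sent k = maybe [] (\w -> ["prev=" ++ w]) (sent `at` (k - 1))

-- Run several schemas at the same position and concatenate their
-- observations (a list-based analogue of sequencing schema blocks).
sequenceSketch :: [SimpleSchema] -> SimpleSchema
sequenceSketch schemas sent k = concatMap (\s -> s sent k) schemas

-- Apply the schema at every position of the sentence.
schematizeSketch :: SimpleSchema -> [String] -> [[String]]
schematizeSketch schema sent = [schema sent k | k <- [0 .. length sent - 1]]
```

Loaded into GHCi, schematizeSketch (sequenceSketch [orthSketch, prevOrthSketch]) ["John", "sleeps"] yields one observation list per word; the first word has no "prev=" observation because position -1 lies outside the sentence.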

Configuration

data Body a

Body of configuration entry.

Constructors

Body 

Fields

  • range :: [Int]

    Range argument for the schema block.

  • args :: a

    Additional arguments for the schema block.

Instances
Show a => Show (Body a)

Binary a => Binary (Body a)

type Entry a = Maybe (Body a)

An optional (Maybe) configuration entry.

entry :: [Int] -> Entry ()

Plain entry with no additional arguments.

entryWith :: a -> [Int] -> Entry a

Entry with additional arguments.
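The sketch below shows how Body, Entry, entry and entryWith could fit together. The types are re-declared locally so the snippet stands alone; the function bodies are assumed definitions that merely satisfy the documented signatures, not the library's source. entryRange is a hypothetical helper added for illustration only.

```haskell
-- Local re-declarations mirroring the documented types; in real code these
-- would be imported from NLP.Nerf.Schema instead.
data Body a = Body
  { range :: [Int]  -- range argument for the schema block
  , args  :: a      -- additional arguments for the schema block
  } deriving (Show, Eq)

type Entry a = Maybe (Body a)

-- Assumed definition: an entry carrying only a range.
entry :: [Int] -> Entry ()
entry rs = Just (Body rs ())

-- Assumed definition: an entry carrying a range and extra arguments.
entryWith :: a -> [Int] -> Entry a
entryWith v rs = Just (Body rs v)

-- Hypothetical helper: the range of an entry, or no positions at all
-- when the entry is Nothing.
entryRange :: Entry a -> [Int]
entryRange = maybe [] range
```

For instance, entryWith [1, 2, 3] [0] would describe a block that takes the list [1, 2, 3] as its argument and is evaluated at the current position only.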

data SchemaConf

Configuration of the schema. Every configuration element specifies the range of relative positions over which a particular observation type is taken into account. For example, the range [-1, 0, 2] means that, when identifying the observation set for position k in the input sentence, observations of that type are extracted at the previous position (k - 1), the current position (k), and the position after the next one (k + 2).
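The range arithmetic described above can be sketched in a few lines. Both function names are hypothetical, and the bounds filter is an assumption of this sketch; how Nerf itself treats positions outside the sentence is not stated here.

```haskell
-- Absolute positions consulted for position k, given a relative range.
absolutePositions :: [Int] -> Int -> [Int]
absolutePositions rng k = [k + r | r <- rng]

-- Assumption of this sketch: positions outside a sentence of length n
-- are simply skipped.
inBounds :: Int -> [Int] -> [Int]
inBounds n = filter (\i -> i >= 0 && i < n)
```

With the range [-1, 0, 2] and k = 3, absolutePositions gives [2, 3, 5]; in a five-word sentence, inBounds 5 then keeps only [2, 3].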

Constructors

SchemaConf 

Fields

Instances
Show SchemaConf

Binary SchemaConf

nullConf :: SchemaConf

Null configuration of the observation schema.

defaultConf

Arguments

:: [Dict]           Named Entity dictionaries
-> Maybe Dict       Dictionary of internal triggers
-> Maybe Dict       Dictionary of external triggers
-> IO SchemaConf

Default configuration of the observation schema.

fromConf :: SchemaConf -> Schema ()

Build the schema based on the configuration.

Schema blocks

type Block a = Vector Word -> [Int] -> Ox a

A block is a chunk of the Ox computation performed within the context of the sentence and the list of absolute sentence positions.

fromBlock :: Block a -> [Int] -> Schema a

Transform the block to the schema depending on the list of relative sentence positions.
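The relationship between blocks and schemas can be illustrated with a simplified model in which lists replace Vector Word and a plain list of observations replaces the Ox monad. The names SimpleBlock, SimpleSchema, fromBlockSketch and orthBlockSketch are hypothetical; the sketch only shows how relative positions become absolute ones and makes no claim about the library's actual implementation.

```haskell
-- Simplified stand-ins for 'Block' and 'Schema' (assumptions of this sketch).
type SimpleBlock  = [String] -> [Int] -> [String]
type SimpleSchema = [String] -> Int -> [String]

-- Turn a block into a schema: shift every relative position by the
-- current position k and hand the resulting absolute positions to the block.
fromBlockSketch :: SimpleBlock -> [Int] -> SimpleSchema
fromBlockSketch blk rel sent k = blk sent [k + r | r <- rel]

-- Hypothetical block: emit the word at each absolute position that
-- falls inside the sentence.
orthBlockSketch :: SimpleBlock
orthBlockSketch sent ks = [sent !! i | i <- ks, i >= 0, i < length sent]
```

Here fromBlockSketch orthBlockSketch [-1, 0] is a schema that, at position k, reports the previous and the current word.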

orthB :: Block ()

Orthographic form at the current position.

splitOrthB :: Block ()

Orthographic form split into two observations: the lowercased form and the original form (the latter only when it differs from the lowercased one).
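The split described above can be sketched on a single word as follows. splitOrthSketch is a hypothetical name operating on String, whereas the real block works inside the Ox monad on Text.

```haskell
import Data.Char (toLower)

-- Lowercased form first, followed by the original form only when the
-- two differ.
splitOrthSketch :: String -> [String]
splitOrthSketch w = low : [w | w /= low]
  where
    low = map toLower w
```

A capitalized word such as "Warsaw" produces two observations, while an already lowercased word such as "city" produces one.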

lowPrefixesB :: [Int] -> Block ()

List of lowercased prefixes of given lengths.

lowSuffixesB :: [Int] -> Block ()

List of lowercased suffixes of given lengths.
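A minimal sketch of prefix and suffix extraction on a single word is given below. The function names are hypothetical, and the sketch assumes that only positive lengths no longer than the word yield an observation; the library's treatment of other lengths is not covered here.

```haskell
import Data.Char (toLower)

-- Lowercased prefixes of the requested lengths.
lowPrefixesSketch :: [Int] -> String -> [String]
lowPrefixesSketch ns w = [take n low | n <- ns, n > 0, n <= length low]
  where
    low = map toLower w

-- Lowercased suffixes of the requested lengths.
lowSuffixesSketch :: [Int] -> String -> [String]
lowSuffixesSketch ns w = [drop (len - n) low | n <- ns, n > 0, n <= len]
  where
    low = map toLower w
    len = length low
```

For the word "Nerf", the lengths [1, 2, 3] give the prefixes "n", "ne" and "ner"; a requested suffix length of 5 is skipped because the word has only four characters.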

lemmaB :: Int -> Block ()

Lemma substitute, parametrized by a number specifying the span over which lowercased prefixes and suffixes are saved. For example, lemmaB 2 takes affixes of lengths 0, -1 and -2 into account.

shapeB :: Block ()

Shape of the word.

packedB :: Block ()

Packed shape of the word.
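One common way to define a word shape is to map every character to a class symbol, and a packed shape can then be obtained by collapsing runs of identical symbols. The sketch below follows that reading; the class letters ('l', 'u', 'd', 'x') and both function names are assumptions of this example, not taken from this page.

```haskell
import Data.Char (isDigit, isLower, isUpper)
import Data.List (group)

-- Map each character to a class symbol: lowercase, uppercase, digit, other.
shapeSketch :: String -> String
shapeSketch = map classify
  where
    classify c
      | isLower c = 'l'
      | isUpper c = 'u'
      | isDigit c = 'd'
      | otherwise = 'x'

-- Collapse each run of identical class symbols into a single symbol.
packedSketch :: String -> String
packedSketch w = [c | (c : _) <- group (shapeSketch w)]
```

Under these assumptions "Nerf2014" has the shape "ullldddd" and the packed shape "uld".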

shapePairB :: Block ()

Combined shapes of two consecutive (at k-1 and k positions) words.

packedPairB :: Block ()

Combined packed shapes of two consecutive (at k-1 and k positions) words.

dictB :: Dict -> Block ()

Plain dictionary search determined with respect to the list of relative positions.