name: chatter
version: 0.6.0.0
synopsis: A library of simple NLP algorithms.
description: chatter is a collection of simple Natural Language
Processing algorithms.
.
Chatter supports:
.
* Part of speech tagging with Averaged
Perceptrons. Based on the Python implementation
by Matthew Honnibal:
() See 'NLP.POS' for the details of part-of-speech tagging with chatter.
.
* Phrasal Chunking (also with an Averaged Perceptron) to identify arbitrary chunks based on training data.
.
* Document similarity; A cosine-based similarity measure, and TF-IDF calculations,
are available in the 'NLP.Similarity.VectorSim' module.
.
* Information Extraction patterns via () Parsec
.
Chatter comes with models for POS tagging and
Phrasal Chunking that have been trained on the
Brown corpus (POS only) and the Conll2000 corpus
(POS and Chunking)
homepage: http://github.com/creswick/chatter
Bug-Reports: http://github.com/creswick/chatter/issues
category: Natural language processing
license: BSD3
License-file: LICENSE
author: Rogan Creswick
maintainer: creswick@gmail.com
Cabal-Version: >=1.10
build-type: Simple
Extra-Source-Files: changelog.md
data-files: ./data/models/README
./data/models/brown.pos.model.gz
./data/models/conll2000.pos.model.gz
./data/models/conll2000.chunk.model.gz
./data/models/wikiner.ner.model.gz
source-repository head
type: git
location: git://github.com/creswick/chatter.git
Library
default-language: Haskell2010
hs-source-dirs: src
Other-modules: Paths_chatter
Exposed-modules: NLP.POS
NLP.POS.AvgPerceptronTagger
NLP.POS.LiteralTagger
NLP.POS.UnambiguousTagger
NLP.Chunk
NLP.Chunk.AvgPerceptronChunker
NLP.ML.AvgPerceptron
NLP.Types
NLP.Types.General
NLP.Types.IOB
NLP.Types.Tags
NLP.Types.Tree
NLP.Tokenize.Chatter
NLP.Corpora.Parsing
NLP.Corpora.Email
NLP.Corpora.Brown
NLP.Corpora.Conll
NLP.Corpora.WikiNer
NLP.Similarity.VectorSim
NLP.Extraction.Parsec
NLP.Extraction.Examples.ParsecExamples
Data.DefaultMap
Build-depends: base >= 4.6 && <= 6,
text >= 0.11.3.0,
containers >= 0.5.0.0,
random-shuffle >= 0.0.4,
MonadRandom >= 0.1.2,
cereal >= 0.4.0.1,
cereal-text >= 0.1 && < 0.2,
fullstop >= 0.1.3.1,
bytestring >= 0.10.0.0,
directory,
mbox >= 0.3,
zlib >= 0.5.4.1,
filepath >= 1.3.0.1,
deepseq,
tokenize >= 0.2.0,
parsec >= 3.1.5,
transformers,
regex-tdfa >= 1.2.0,
regex-tdfa-text,
array,
QuickCheck,
quickcheck-instances,
unordered-containers,
hashable
ghc-options: -Wall -auto-all -caf-all
Executable tagPOS
default-language: Haskell2010
Main-Is: Tagger.hs
hs-source-dirs: appsrc
Build-depends: chatter,
filepath >= 1.3.0.1,
text >= 0.11.3.0,
base >= 4.6 && <= 6,
bytestring >= 0.10.0.0,
cereal >= 0.4.0.1
ghc-options: -Wall -main-is Tagger -rtsopts
Executable trainPOS
default-language: Haskell2010
Main-Is: POSTrainer.hs
hs-source-dirs: appsrc
Build-depends: chatter,
filepath >= 1.3.0.1,
text >= 0.11.3.0,
base >= 4.6 && <= 6,
bytestring >= 0.10.0.0,
cereal >= 0.4.0.1,
containers >= 0.5.0.0
ghc-options: -Wall -main-is POSTrainer -rtsopts
Executable trainChunker
default-language: Haskell2010
Main-Is: ChunkTrainer.hs
hs-source-dirs: appsrc
Build-depends: chatter,
filepath >= 1.3.0.1,
text >= 0.11.3.0,
base >= 4.6 && <= 6,
bytestring >= 0.10.0.0,
cereal >= 0.4.0.1,
containers >= 0.5.0.0
ghc-options: -Wall -main-is ChunkTrainer -rtsopts
Executable trainNER
default-language: Haskell2010
Main-Is: NERTrainer.hs
hs-source-dirs: appsrc
Build-depends: chatter,
filepath >= 1.3.0.1,
text >= 0.11.3.0,
base >= 4.6 && <= 6,
bytestring >= 0.10.0.0,
cereal >= 0.4.0.1,
containers >= 0.5.0.0
ghc-options: -Wall -main-is NERTrainer -rtsopts
Executable eval
default-language: Haskell2010
Main-Is: Evaluate.hs
hs-source-dirs: appsrc
Build-depends: chatter,
filepath >= 1.3.0.1,
text >= 0.11.3.0,
base >= 4.6 && <= 6,
bytestring >= 0.10.0.0,
cereal >= 0.4.0.1,
containers >= 0.5.0.0
ghc-options: -Wall -main-is Evaluate -rtsopts
-- disabled because we aren't using it, and it's causing issues with `stack ghci`
-- benchmark bench
-- type: exitcode-stdio-1.0
-- default-language: Haskell2010
-- Main-Is: Bench.hs
-- hs-source-dirs: tests/src
-- Other-modules: NLP.Similarity.VectorSimBench
-- Corpora
-- Build-depends: chatter,
-- criterion >= 0.8.0.1,
-- filepath >= 1.3.0.1,
-- text >= 0.11.3.0,
-- base >= 4.6 && <= 6,
-- deepseq,
-- split >= 0.1.2.3,
-- tokenize >= 0.2.0
-- ghc-options: -Wall -main-is Bench
test-suite tests
default-language: Haskell2010
type: exitcode-stdio-1.0
Main-Is: Main.hs
hs-source-dirs: tests/src
Other-modules: AvgPerceptronTests
BackoffTaggerTests
NLP.Similarity.VectorSimTests
NLP.POSTests
NLP.POS.UnambiguousTaggerTests
NLP.Extraction.ParsecTests
NLP.TypesTests
Data.DefaultMapTests
Corpora
TestUtils
IntegrationTests
Build-depends: chatter,
base >= 4.6 && <= 6,
text,
HUnit,
parsec >= 3.1.5,
tokenize,
QuickCheck,
filepath,
cereal,
quickcheck-instances,
containers,
tasty,
tasty-quickcheck,
tasty-hunit,
tasty-ant-xml
ghc-options: -Wall -auto-all -caf-all