hext-0.1.0.2: a text classification library

Safe Haskell: Safe
Language: Haskell2010

NLP.Hext.NaiveBayes

Documentation

makeMaterial

Arguments

:: [(String, a)]

a list of text samples and their corresponding classes

-> Material a 

Creates learning material for the classifier by combining each text sample with its corresponding class into a Labeled value.

runBayes

Arguments

:: Eq a 
=> Material a

learning material made with makeMaterial

-> String

the sample string to be classified

-> a

the class assigned to the sample string

Runs a sample string through the Naive Bayes algorithm, using training material made by makeMaterial, and returns the class it is assigned.
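To make the description above concrete, here is a minimal, self-contained sketch of Naive Bayes text classification. It is not hext's implementation: SketchMaterial, tokenize, makeSketchMaterial, classScore and classifySketch are hypothetical names invented for illustration, and the add-one (Laplace) smoothing and the tokenization rules are assumptions that the library may not share.

```haskell
import           Data.Char (isAlpha, toLower)
import           Data.List (maximumBy, nub)
import           Data.Ord (comparing)
import qualified Data.Map.Strict as Map

-- Hypothetical stand-in for the library's training material:
-- each class paired with the word counts of one training sample.
type SketchMaterial a = [(a, Map.Map String Int)]

-- Assumed tokenization: lower-case, split on whitespace,
-- and drop every non-letter character from each token.
tokenize :: String -> [String]
tokenize = filter (not . null) . map (filter isAlpha) . words . map toLower

-- Build the stand-in material from (sample, class) pairs.
makeSketchMaterial :: [(String, a)] -> SketchMaterial a
makeSketchMaterial samples =
  [ (c, Map.fromListWith (+) [ (w, 1) | w <- tokenize s ]) | (s, c) <- samples ]

-- Log-score of a class: log prior plus the summed log likelihood of each
-- word, with add-one smoothing so an unseen word does not zero the score.
classScore :: Eq a => SketchMaterial a -> [String] -> a -> Double
classScore material ws c = log prior + sum (map wordLogProb ws)
  where
    inClass     = [ fl | (c', fl) <- material, c' == c ]
    prior       = fromIntegral (length inClass) / fromIntegral (length material)
    classCounts = Map.unionsWith (+) inClass
    totalWords  = sum (Map.elems classCounts)
    vocabSize   = Map.size (Map.unionsWith (+) (map snd material))
    wordLogProb w =
      log ( fromIntegral (Map.findWithDefault 0 w classCounts + 1)
          / fromIntegral (totalWords + vocabSize) )

-- Pick the class with the highest score. Assumes the material is non-empty.
classifySketch :: Eq a => SketchMaterial a -> String -> a
classifySketch material sample =
  maximumBy (comparing (classScore material (tokenize sample)))
            (nub (map fst material))
```

Under these assumptions, classifySketch (makeSketchMaterial samples) sample plays the same role as runBayes (makeMaterial samples) sample, and only requires Eq on the class type, matching the constraint in the signature above.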

data Classified a

A class that has a specific probability of occurring

Constructors

Classified 

Fields

type Material a = [Labeled a]

A list of labeled data

data Labeled a

A frequency list of words that has been assigned a class

Constructors

Labeled 

Fields

type FList = Map Text Int

A frequency list of words
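As an illustration of what such a frequency list might hold, the following sketch counts the words of a string into a Map Text Int. The helper wordFrequencies is hypothetical and not part of hext; the lower-casing and punctuation stripping are assumptions about tokenization, not documented library behaviour.

```haskell
import           Data.Char (isAlpha)
import qualified Data.Map.Strict as Map
import           Data.Map.Strict (Map)
import qualified Data.Text as T
import           Data.Text (Text)

-- Mirrors the FList synonym documented above.
type FList = Map Text Int

-- Hypothetical helper: lower-case the input, split it on whitespace,
-- strip non-letter characters from each token, and count occurrences.
wordFrequencies :: String -> FList
wordFrequencies =
    Map.fromListWith (+)
  . map (\w -> (w, 1))
  . filter (not . T.null)
  . map (T.filter isAlpha)
  . T.words
  . T.toLower
  . T.pack
```

For instance, wordFrequencies "a great movie. good movie" counts "movie" twice and every other word once.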

Example: Simple Usage

In this example, sample reviews and their corresponding classes are zipped into an association list, which is passed to makeMaterial. The resulting material is then passed to runBayes along with a new review, which is classified based on that training material.

import NLP.Hext.NaiveBayes

-- The classes a review can be assigned to.
data Class = Positive | Negative deriving (Eq, Show)

-- Training samples.
doc1 = "I loved the movie"
doc2 = "I hated the movie"
doc3 = "a great movie. good movie"
doc4 = "poor acting"
doc5 = "great acting. a good movie"
docs = [doc1, doc2, doc3, doc4, doc5]

-- The class of each sample, in the same order as docs.
correspondingClasses = [Positive, Negative, Positive, Negative, Positive]

-- Pair every sample with its class.
classifiedDocs = zip docs correspondingClasses

main :: IO ()
main = do
    let material = makeMaterial classifiedDocs
    let review = "I loved the great acting"
    -- Classify the new review against the training material.
    let result = runBayes material review

    putStrLn $ "The review '" ++ review ++ "' is " ++ show result