full-text-search-0.2.0.0: In-memory full text search engine

Safe Haskell: None

Data.SearchEngine

Basic interface

Querying

type Term = Text

Terms are short strings, usually whole words.

query :: (Ix field, Bounded field, Ix feature, Bounded feature) => SearchEngine doc key field feature -> [Term] -> [key]

Making a search engine instance

initSearchEngine :: (Ix field, Bounded field, Ix feature, Bounded feature) => SearchConfig doc key field feature -> SearchRankParameters field feature -> SearchEngine doc key field feature

data SearchEngine doc key field feature

data SearchConfig doc key field feature

Constructors

SearchConfig 

Fields

documentKey :: doc -> key
 
extractDocumentTerms :: doc -> field -> [Term]
 
transformQueryTerm :: Term -> field -> Term
 
documentFeatureValue :: doc -> feature -> Float
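
As an illustration of how the four SearchConfig fields fit together, here is a minimal sketch for a made-up Book document type. The Book, BookField and BookFeature types, the tokenize helper and the lower-casing policy are all hypothetical choices made for this example; only SearchConfig, its field names and Term come from this module.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.Ix (Ix)
import qualified Data.Text as T
import Data.SearchEngine (SearchConfig (..), Term)

-- Hypothetical document type, used only for this sketch.
data Book = Book
  { bookId        :: Int
  , bookTitle     :: T.Text
  , bookBlurb     :: T.Text
  , bookDownloads :: Int
  }

-- The field and feature types are plain enumerations, so the
-- Ix and Bounded instances required by the API can be derived.
data BookField = TitleField | BlurbField
  deriving (Eq, Ord, Bounded, Ix, Show)

data BookFeature = Downloads
  deriving (Eq, Ord, Bounded, Ix, Show)

-- Assumed tokenisation policy: lower-case, split on whitespace.
tokenize :: T.Text -> [Term]
tokenize = T.words . T.toLower

bookConfig :: SearchConfig Book Int BookField BookFeature
bookConfig = SearchConfig
  { documentKey          = bookId
  , extractDocumentTerms = extractTerms
    -- Query terms get the same case folding as document terms.
  , transformQueryTerm   = \term _field -> T.toLower term
  , documentFeatureValue = \book Downloads -> fromIntegral (bookDownloads book)
  }
  where
    extractTerms book TitleField = tokenize (bookTitle book)
    extractTerms book BlurbField = tokenize (bookBlurb book)

main :: IO ()
main = do
  let book = Book 1 "Learn You a Haskell" "A beginner's guide" 1200
  print (documentKey bookConfig book)
  print (extractDocumentTerms bookConfig book TitleField)
```

Applying the same normalisation in extractDocumentTerms and transformQueryTerm is a design choice of this sketch, made so that a query term can match the stored document terms.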
 

data FeatureFunction

Constructors

LogarithmicFunction Float
log (lambda_i + f_i)
RationalFunction Float
f_i / (lambda_i + f_i)
SigmoidFunction Float Float
1 / (lambda + exp(-(lambda' * f_i)))
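
To make the three formulas concrete, the following sketch evaluates them with a local stand-in type, so that it runs with base alone. FeatureFn and applyFeatureFn are hypothetical names, not part of this module, and the assumption that the first SigmoidFunction argument is lambda and the second is lambda' is inferred only from the order in which they appear in the formula.

```haskell
module Main where

-- Local stand-in mirroring the three documented constructors.
-- This is NOT the library's FeatureFunction type.
data FeatureFn
  = Logarithmic Float    -- lambda
  | Rational Float       -- lambda
  | Sigmoid Float Float  -- lambda, lambda' (argument order assumed)
  deriving Show

-- Map a raw feature value f to its score contribution,
-- following the formulas listed for each constructor.
applyFeatureFn :: FeatureFn -> Float -> Float
applyFeatureFn (Logarithmic lambda)     f = log (lambda + f)
applyFeatureFn (Rational lambda)        f = f / (lambda + f)
applyFeatureFn (Sigmoid lambda lambda') f = 1 / (lambda + exp (negate (lambda' * f)))

main :: IO ()
main =
  mapM_
    (\fn -> print (fn, map (applyFeatureFn fn) [0, 10, 1000]))
    [Logarithmic 1, Rational 10, Sigmoid 1 0.01]
```

Each function grows ever more slowly as the raw value increases, and the rational and sigmoid forms level off, so a very large value (a download count, say) cannot swamp the term-based part of the score.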

Helper type for non-term features

Managing documents to be searched

insertDoc :: (Ord key, Ix field, Bounded field, Ix feature, Bounded feature) => doc -> SearchEngine doc key field feature -> SearchEngine doc key field feature

insertDocs :: (Ord key, Ix field, Bounded field, Ix feature, Bounded feature) => [doc] -> SearchEngine doc key field feature -> SearchEngine doc key field feature

deleteDoc :: (Ord key, Ix field, Bounded field) => key -> SearchEngine doc key field feature -> SearchEngine doc key field feature

Explain mode for query result rankings

queryExplain :: (Ix field, Bounded field, Ix feature, Bounded feature) => SearchEngine doc key field feature -> [Term] -> [(Explanation field feature Term, key)]

data Explanation field feature term

A breakdown of the BM25F score, showing how it relates to the inputs so that you can compare the scores of different documents.

Constructors

Explanation 

Fields

overallScore :: Float

The overall score is the sum of the termScores and the nonTermScores.

termScores :: [(term, Float)]

There is a score contribution from each query term. This is the score for the term across all fields in the document (but see termFieldScores).

nonTermScores :: [(feature, Float)]

The document can have an innate bonus score independent of the terms in the query. For example, this might be a popularity score.

termFieldScores :: [(term, [(field, Float)])]

This does not contribute to the overallScore. It indicates how the termScores relate to per-field scores. Note, however, that the term score across all fields is not simply the sum of the per-field scores. The point of the BM25F scoring function is that a linear combination of per-field scores is wrong, and BM25F does a more cunning non-linear combination.

However, it is still useful to see the scores for each field for a term, to see how they compare.
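
The difference between a linear and a non-linear combination can be seen in a small standalone sketch. It is a deliberately simplified model rather than this library's implementation: it omits the inverse document frequency factor and the per-field length normalisation, and the names saturate, bm25fStyle and naiveLinear are hypothetical. In the BM25F style, weighted per-field term frequencies are added first and the saturation function is applied once. The naive alternative saturates each field separately and then adds the results.

```haskell
module Main where

-- BM25-style saturation: grows with x but never reaches 1.
saturate :: Float -> Float -> Float
saturate k1 x = x / (k1 + x)

-- BM25F style: combine the weighted term frequencies of all fields
-- first, then saturate once. Input pairs are (field weight, term frequency).
bm25fStyle :: Float -> [(Float, Float)] -> Float
bm25fStyle k1 fields = saturate k1 (sum [w * tf | (w, tf) <- fields])

-- Naive alternative: saturate each field on its own, then take a
-- weighted sum of the per-field scores.
naiveLinear :: Float -> [(Float, Float)] -> Float
naiveLinear k1 fields = sum [w * saturate k1 tf | (w, tf) <- fields]

main :: IO ()
main = do
  let k1     = 1.5
      fields = [(2, 3), (1, 3)]  -- e.g. a heavily weighted title and a body
  putStrLn ("combined then saturated: " ++ show (bm25fStyle k1 fields))
  putStrLn ("saturated then summed:   " ++ show (naiveLinear k1 fields))
```

In the naive version a term that occurs in several fields is rewarded once per field, so its score can grow without the bound that saturation is meant to impose. Combining first keeps the total below 1 in this simplified model, which is why per-field figures cannot simply be added up to give the term score.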

Instances

Functor (Explanation field feature) 
(Show field, Show feature, Show term) => Show (Explanation field feature term) 
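
The following sketch shows one way the Explanation record might be consumed. It relies only on the constructor, fields and Functor instance listed above; scoreFromParts, scoredKeys and the hand-built example value are hypothetical and exist only for illustration. The numbers in example are invented, not output from a real query.

```haskell
module Main where

import Data.Ix (Ix)
import Data.SearchEngine
  (Explanation (..), SearchEngine, Term, queryExplain)

-- Recompute a total from the term and non-term contributions
-- listed in the record.
scoreFromParts :: Explanation field feature term -> Float
scoreFromParts e =
  sum (map snd (termScores e)) + sum (map snd (nonTermScores e))

-- Pair each result key with its overall score, for debugging rankings.
scoredKeys
  :: (Ix field, Bounded field, Ix feature, Bounded feature)
  => SearchEngine doc key field feature -> [Term] -> [(key, Float)]
scoredKeys engine terms =
  [ (key, overallScore e) | (e, key) <- queryExplain engine terms ]

-- Hand-built value with invented numbers; fields, features and terms
-- are plain Strings here purely to keep the sketch short.
example :: Explanation String String String
example = Explanation
  { overallScore    = 3.5
  , termScores      = [("haskell", 2.0), ("parser", 1.0)]
  , nonTermScores   = [("downloads", 0.5)]
    -- Per-field figures need not add up to the term score.
  , termFieldScores = [("haskell", [("title", 1.5), ("blurb", 0.9)])]
  }

main :: IO ()
main = do
  print (scoreFromParts example)
  -- The Functor instance maps over the term type, here String -> Int.
  print (fmap length example)
```

Because Functor maps over the term parameter, terms can be converted for display (for example from Text to String) without touching the scores.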

setRankParams :: SearchRankParameters field feature -> SearchEngine doc key field feature -> SearchEngine doc key field feature

Internal sanity check

invariant :: (Ord key, Ix field, Bounded field) => SearchEngine doc key field feature -> Bool