Safe Haskell | None |
---|---|
Language | Haskell2010 |
- type Term = Text
- query :: (Ix field, Bounded field, Ix feature, Bounded feature) => SearchEngine doc key field feature -> [Term] -> [key]
- queryAutosuggest :: (Ix field, Bounded field, Ix feature, Bounded feature) => SearchEngine doc key field feature -> ResultsFilter key -> [Term] -> Term -> ([(Term, Float)], [(key, Float)])
- data ResultsFilter key
- = NoFilter
- | FilterPredicate (key -> Bool)
- | FilterBulkPredicate ([key] -> [Bool])
- initSearchEngine :: (Ix field, Bounded field, Ix feature, Bounded feature) => SearchConfig doc key field feature -> SearchRankParameters field feature -> SearchEngine doc key field feature
- data SearchEngine doc key field feature
- data SearchConfig doc key field feature = SearchConfig {
- documentKey :: doc -> key
- extractDocumentTerms :: doc -> field -> [Term]
- transformQueryTerm :: Term -> field -> Term
- documentFeatureValue :: doc -> feature -> Float
- data SearchRankParameters field feature = SearchRankParameters {
- paramK1 :: !Float
- paramB :: field -> Float
- paramFieldWeights :: field -> Float
- paramFeatureWeights :: feature -> Float
- paramFeatureFunctions :: feature -> FeatureFunction
- paramResultsetSoftLimit :: !Int
- paramResultsetHardLimit :: !Int
- paramAutosuggestPrefilterLimit :: !Int
- paramAutosuggestPostfilterLimit :: !Int
- data FeatureFunction
- data NoFeatures
- noFeatures :: NoFeatures -> a
- insertDoc :: (Ord key, Ix field, Bounded field, Ix feature, Bounded feature) => doc -> SearchEngine doc key field feature -> SearchEngine doc key field feature
- insertDocs :: (Ord key, Ix field, Bounded field, Ix feature, Bounded feature) => [doc] -> SearchEngine doc key field feature -> SearchEngine doc key field feature
- deleteDoc :: (Ord key, Ix field, Bounded field) => key -> SearchEngine doc key field feature -> SearchEngine doc key field feature
- queryExplain :: (Ix field, Bounded field, Ix feature, Bounded feature) => SearchEngine doc key field feature -> [Term] -> [(Explanation field feature Term, key)]
- data Explanation field feature term = Explanation {
- overallScore :: Float
- termScores :: [(term, Float)]
- nonTermScores :: [(feature, Float)]
- termFieldScores :: [(term, [(field, Float)])]
- setRankParams :: SearchRankParameters field feature -> SearchEngine doc key field feature -> SearchEngine doc key field feature
- invariant :: (Ord key, Ix field, Bounded field) => SearchEngine doc key field feature -> Bool
Basic interface
Querying
query :: (Ix field, Bounded field, Ix feature, Bounded feature) => SearchEngine doc key field feature -> [Term] -> [key] Source #
Execute a normal query. Find the documents in which one or more of the search terms appear and return them in ranked order.
The number of documents returned is limited by the paramResultsetSoftLimit
and paramResultsetHardLimit
paramaters. This also limits the cost of the
query (which is primarily the cost of scoring each document).
The given terms are all assumed to be complete (as opposed to prefixes
like with queryAutosuggest
).
Query auto-completion / auto-suggestion
queryAutosuggest :: (Ix field, Bounded field, Ix feature, Bounded feature) => SearchEngine doc key field feature -> ResultsFilter key -> [Term] -> Term -> ([(Term, Float)], [(key, Float)]) Source #
Execute an "auto-suggest" query. This is where one of the search terms is an incomplete prefix and we are looking for possible completions of that search term, and result documents to go with the possible completions.
An auto-suggest query only gives useful results when the SearchEngine
is
configured to use a non-term feature score. That is, when we can give
documents an importance score independent of what terms we are looking for.
This is because an auto-suggest query is backwards from a normal query: we
are asking for relevant terms occurring in important or popular documents
so we need some notion of important or popular. Without this we would just
be ranking based on term frequency which while it makes sense for normal
"forward" queries is pretty meaningless for auto-suggest "reverse"
queries. Indeed for single-term auto-suggest queries the ranking function
we use will assign 0 for all documents and completions if there is no
non-term feature scores.
data ResultsFilter key Source #
In some applications it is necessary to enforce some security or visibility rule about the query results (e.g. in a typical DB-based application different users can see different data items). Typically it would be too expensive to build different search indexes for the different contexts and so the strategy is to use one index containing everything and filter for visibility in the results. This means the filter condition is different for different queries (e.g. performed on behalf of different users).
Filtering the results after a query is possible but not the most efficient thing to do because we've had to score all the not-visible documents. The better thing to do is to filter as part of the query, this way we can filter before the expensive scoring.
We provide one further optimisation: bulk predicates. In some applications it can be quicker to check the security/visibility of a whole bunch of results all in one go.
NoFilter | |
FilterPredicate (key -> Bool) | |
FilterBulkPredicate ([key] -> [Bool]) |
Making a search engine instance
initSearchEngine :: (Ix field, Bounded field, Ix feature, Bounded feature) => SearchConfig doc key field feature -> SearchRankParameters field feature -> SearchEngine doc key field feature Source #
data SearchEngine doc key field feature Source #
data SearchConfig doc key field feature Source #
SearchConfig | |
|
data SearchRankParameters field feature Source #
SearchRankParameters | |
|
data FeatureFunction Source #
LogarithmicFunction Float | log (lambda_i + f_i) |
RationalFunction Float | f_i / (lambda_i + f_i) |
SigmoidFunction Float Float | 1 / (lambda + exp(-(lambda' * f_i)) |
Helper type for non-term features
data NoFeatures Source #
noFeatures :: NoFeatures -> a Source #
Managing documents to be searched
insertDoc :: (Ord key, Ix field, Bounded field, Ix feature, Bounded feature) => doc -> SearchEngine doc key field feature -> SearchEngine doc key field feature Source #
insertDocs :: (Ord key, Ix field, Bounded field, Ix feature, Bounded feature) => [doc] -> SearchEngine doc key field feature -> SearchEngine doc key field feature Source #
deleteDoc :: (Ord key, Ix field, Bounded field) => key -> SearchEngine doc key field feature -> SearchEngine doc key field feature Source #
Explain mode for query result rankings
queryExplain :: (Ix field, Bounded field, Ix feature, Bounded feature) => SearchEngine doc key field feature -> [Term] -> [(Explanation field feature Term, key)] Source #
data Explanation field feature term Source #
A breakdown of the BM25F score, to explain somewhat how it relates to the inputs, and so you can compare the scores of different documents.
Explanation | |
|
setRankParams :: SearchRankParameters field feature -> SearchEngine doc key field feature -> SearchEngine doc key field feature Source #