bloodhound-0.16.0.0: Elasticsearch client library for Haskell

Copyright(C) 2014 2018 Chris Allen
LicenseBSD-style (see the file LICENSE)
MaintainerChris Allen <cma@bitemyapp.com>
Stabilityprovisional
PortabilityOverloadedStrings
Safe HaskellNone
LanguageHaskell2010

Database.V5.Bloodhound.Client

Contents

Description

Client side functions for talking to Elasticsearch servers.

Synopsis

Bloodhound client functions

The examples in this module assume the following code has been run. The :{ and :} will only work in GHCi. You'll only need the data types and typeclass instances for the functions that make use of them.

>>> :set -XOverloadedStrings
>>> :set -XDeriveGeneric
>>> import Database.V5.Bloodhound
>>> import Network.HTTP.Client
>>> let testServer = (Server "http://localhost:9200")
>>> let runBH' = withBH defaultManagerSettings testServer
>>> let testIndex = IndexName "twitter"
>>> let testMapping = MappingName "tweet"
>>> let defaultIndexSettings = IndexSettings (ShardCount 1) (ReplicaCount 0)
>>> data TweetMapping = TweetMapping deriving (Eq, Show)
>>> _ <- runBH' $ deleteIndex testIndex
>>> _ <- runBH' $ deleteIndex (IndexName "didimakeanindex")
>>> import GHC.Generics
>>> import           Data.Time.Calendar        (Day (..))
>>> import Data.Time.Clock (UTCTime (..), secondsToDiffTime)
>>> :{
instance ToJSON TweetMapping where
         toJSON TweetMapping =
           object ["properties" .=
             object ["location" .=
               object ["type" .= ("geo_point" :: Text)]]]
data Location = Location { lat :: Double
                        , lon :: Double } deriving (Eq, Generic, Show)
data Tweet = Tweet { user     :: Text
                   , postDate :: UTCTime
                   , message  :: Text
                   , age      :: Int
                   , location :: Location } deriving (Eq, Generic, Show)
exampleTweet = Tweet { user     = "bitemyapp"
                     , postDate = UTCTime
                                  (ModifiedJulianDay 55000)
                                  (secondsToDiffTime 10)
                     , message  = "Use haskell!"
                     , age      = 10000
                     , location = Location 40.12 (-71.34) }
instance ToJSON   Tweet where
 toJSON = genericToJSON defaultOptions
instance FromJSON Tweet where
 parseJSON = genericParseJSON defaultOptions
instance ToJSON   Location where
 toJSON = genericToJSON defaultOptions
instance FromJSON Location where
 parseJSON = genericParseJSON defaultOptions
data BulkTest = BulkTest { name :: Text } deriving (Eq, Generic, Show)
instance FromJSON BulkTest where
 parseJSON = genericParseJSON defaultOptions
instance ToJSON BulkTest where
 toJSON = genericToJSON defaultOptions
:}

withBH :: ManagerSettings -> Server -> BH IO a -> IO a Source #

Convenience function that sets up a manager and BHEnv and runs the given set of bloodhound operations. Connections will be pipelined automatically in accordance with the given manager settings in IO. If you've got your own monad transformer stack, you should use runBH directly.

Indices

createIndex :: MonadBH m => IndexSettings -> IndexName -> m Reply Source #

createIndex will create an index given a Server, IndexSettings, and an IndexName.

>>> response <- runBH' $ createIndex defaultIndexSettings (IndexName "didimakeanindex")
>>> respIsTwoHunna response
True
>>> runBH' $ indexExists (IndexName "didimakeanindex")
True

createIndexWith Source #

Arguments

:: MonadBH m 
=> [UpdatableIndexSetting] 
-> Int

shard count

-> IndexName 
-> m Reply 

Create an index, providing it with any number of settings. This is more expressive than createIndex but makes is more verbose for the common case of configuring only the shard count and replica count.

flushIndex :: MonadBH m => IndexName -> m Reply Source #

flushIndex will flush an index given a Server and an IndexName.

deleteIndex :: MonadBH m => IndexName -> m Reply Source #

deleteIndex will delete an index given a Server and an IndexName.

>>> _ <- runBH' $ createIndex defaultIndexSettings (IndexName "didimakeanindex")
>>> response <- runBH' $ deleteIndex (IndexName "didimakeanindex")
>>> respIsTwoHunna response
True
>>> runBH' $ indexExists (IndexName "didimakeanindex")
False

updateIndexSettings :: MonadBH m => NonEmpty UpdatableIndexSetting -> IndexName -> m Reply Source #

updateIndexSettings will apply a non-empty list of setting updates to an index

>>> _ <- runBH' $ createIndex defaultIndexSettings (IndexName "unconfiguredindex")
>>> response <- runBH' $ updateIndexSettings (BlocksWrite False :| []) (IndexName "unconfiguredindex")
>>> respIsTwoHunna response
True

forceMergeIndex :: MonadBH m => IndexSelection -> ForceMergeIndexSettings -> m Reply Source #

forceMergeIndex

The force merge API allows to force merging of one or more indices through an API. The merge relates to the number of segments a Lucene index holds within each shard. The force merge operation allows to reduce the number of segments by merging them.

This call will block until the merge is complete. If the http connection is lost, the request will continue in the background, and any new requests will block until the previous force merge is complete.

indexExists :: MonadBH m => IndexName -> m Bool Source #

indexExists enables you to check if an index exists. Returns Bool in IO

>>> exists <- runBH' $ indexExists testIndex

openIndex :: MonadBH m => IndexName -> m Reply Source #

openIndex opens an index given a Server and an IndexName. Explained in further detail at http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-open-close.html

>>> reply <- runBH' $ openIndex testIndex

closeIndex :: MonadBH m => IndexName -> m Reply Source #

closeIndex closes an index given a Server and an IndexName. Explained in further detail at http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-open-close.html

>>> reply <- runBH' $ closeIndex testIndex

listIndices :: (MonadThrow m, MonadBH m) => m [IndexName] Source #

listIndices returns a list of all index names on a given Server

waitForYellowIndex :: MonadBH m => IndexName -> m Reply Source #

Block until the index becomes available for indexing documents. This is useful for integration tests in which indices are rapidly created and deleted.

Index Aliases

updateIndexAliases :: MonadBH m => NonEmpty IndexAliasAction -> m Reply Source #

updateIndexAliases updates the server's index alias table. Operations are atomic. Explained in further detail at https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-aliases.html

>>> let src = IndexName "a-real-index"
>>> let aliasName = IndexName "an-alias"
>>> let iAlias = IndexAlias src (IndexAliasName aliasName)
>>> let aliasCreate = IndexAliasCreate Nothing Nothing
>>> _ <- runBH' $ deleteIndex src
>>> respIsTwoHunna <$> runBH' (createIndex defaultIndexSettings src)
True
>>> runBH' $ indexExists src
True
>>> respIsTwoHunna <$> runBH' (updateIndexAliases (AddAlias iAlias aliasCreate :| []))
True
>>> runBH' $ indexExists aliasName
True

getIndexAliases :: (MonadBH m, MonadThrow m) => m (Either EsError IndexAliasesSummary) Source #

Get all aliases configured on the server.

deleteIndexAlias :: MonadBH m => IndexAliasName -> m Reply Source #

Delete a single alias, removing it from all indices it is currently associated with.

Index Templates

putTemplate :: MonadBH m => IndexTemplate -> TemplateName -> m Reply Source #

putTemplate creates a template given an IndexTemplate and a TemplateName. Explained in further detail at https://www.elastic.co/guide/en/elasticsearch/reference/1.7/indices-templates.html

>>> let idxTpl = IndexTemplate (TemplatePattern "tweet-*") (Just (IndexSettings (ShardCount 1) (ReplicaCount 1))) [toJSON TweetMapping]
>>> resp <- runBH' $ putTemplate idxTpl (TemplateName "tweet-tpl")

templateExists :: MonadBH m => TemplateName -> m Bool Source #

templateExists checks to see if a template exists.

>>> exists <- runBH' $ templateExists (TemplateName "tweet-tpl")

deleteTemplate :: MonadBH m => TemplateName -> m Reply Source #

deleteTemplate is an HTTP DELETE and deletes a template.

>>> let idxTpl = IndexTemplate (TemplatePattern "tweet-*") (Just (IndexSettings (ShardCount 1) (ReplicaCount 1))) [toJSON TweetMapping]
>>> _ <- runBH' $ putTemplate idxTpl (TemplateName "tweet-tpl")
>>> resp <- runBH' $ deleteTemplate (TemplateName "tweet-tpl")

Mapping

putMapping :: (MonadBH m, ToJSON a) => IndexName -> MappingName -> a -> m Reply Source #

putMapping is an HTTP PUT and has upsert semantics. Mappings are schemas for documents in indexes.

>>> _ <- runBH' $ createIndex defaultIndexSettings testIndex
>>> resp <- runBH' $ putMapping testIndex testMapping TweetMapping
>>> print resp
Response {responseStatus = Status {statusCode = 200, statusMessage = "OK"}, responseVersion = HTTP/1.1, responseHeaders = [("content-type","application/json; charset=UTF-8"),("content-encoding","gzip"),("transfer-encoding","chunked")], responseBody = "{\"acknowledged\":true}", responseCookieJar = CJ {expose = []}, responseClose' = ResponseClose}

Documents

indexDocument :: (ToJSON doc, MonadBH m) => IndexName -> MappingName -> IndexDocumentSettings -> doc -> DocId -> m Reply Source #

indexDocument is the primary way to save a single document in Elasticsearch. The document itself is simply something we can convert into a JSON Value. The DocId will function as the primary key for the document. You are encouraged to generate your own id's and not rely on Elasticsearch's automatic id generation. Read more about it here: https://github.com/bitemyapp/bloodhound/issues/107

>>> resp <- runBH' $ indexDocument testIndex testMapping defaultIndexDocumentSettings exampleTweet (DocId "1")
>>> print resp
Response {responseStatus = Status {statusCode = 201, statusMessage = "Created"}, responseVersion = HTTP/1.1, responseHeaders = [("Location","/twitter/tweet/1"),("content-type","application/json; charset=UTF-8"),("content-encoding","gzip"),("transfer-encoding","chunked")], responseBody = "{\"_index\":\"twitter\",\"_type\":\"tweet\",\"_id\":\"1\",\"_version\":1,\"result\":\"created\",\"_shards\":{\"total\":2,\"successful\":1,\"failed\":0},\"created\":true}", responseCookieJar = CJ {expose = []}, responseClose' = ResponseClose}

updateDocument :: (ToJSON patch, MonadBH m) => IndexName -> MappingName -> IndexDocumentSettings -> patch -> DocId -> m Reply Source #

updateDocument provides a way to perform an partial update of a an already indexed document.

getDocument :: MonadBH m => IndexName -> MappingName -> DocId -> m Reply Source #

getDocument is a straight-forward way to fetch a single document from Elasticsearch using a Server, IndexName, MappingName, and a DocId. The DocId is the primary key for your Elasticsearch document.

>>> yourDoc <- runBH' $ getDocument testIndex testMapping (DocId "1")

documentExists :: MonadBH m => IndexName -> MappingName -> Maybe DocumentParent -> DocId -> m Bool Source #

documentExists enables you to check if a document exists. Returns Bool in IO

>>> exists <- runBH' $ documentExists testIndex testMapping Nothing (DocId "1")

deleteDocument :: MonadBH m => IndexName -> MappingName -> DocId -> m Reply Source #

deleteDocument is the primary way to delete a single document.

>>> _ <- runBH' $ deleteDocument testIndex testMapping (DocId "1")

Searching

searchAll :: MonadBH m => Search -> m Reply Source #

searchAll, given a Search, will perform that search against all indexes on an Elasticsearch server. Try to avoid doing this if it can be helped.

>>> let query = TermQuery (Term "user" "bitemyapp") Nothing
>>> let search = mkSearch (Just query) Nothing
>>> reply <- runBH' $ searchAll search

searchByIndex :: MonadBH m => IndexName -> Search -> m Reply Source #

searchByIndex, given a Search and an IndexName, will perform that search against all mappings within an index on an Elasticsearch server.

>>> let query = TermQuery (Term "user" "bitemyapp") Nothing
>>> let search = mkSearch (Just query) Nothing
>>> reply <- runBH' $ searchByIndex testIndex search

searchByIndices :: MonadBH m => NonEmpty IndexName -> Search -> m Reply Source #

searchByIndices is a variant of searchByIndex that executes a Search over many indices. This is much faster than using mapM to searchByIndex over a collection since it only causes a single HTTP request to be emitted.

searchByType :: MonadBH m => IndexName -> MappingName -> Search -> m Reply Source #

searchByType, given a Search, IndexName, and MappingName, will perform that search against a specific mapping within an index on an Elasticsearch server.

>>> let query = TermQuery (Term "user" "bitemyapp") Nothing
>>> let search = mkSearch (Just query) Nothing
>>> reply <- runBH' $ searchByType testIndex testMapping search

scanSearch :: (FromJSON a, MonadBH m, MonadThrow m) => IndexName -> MappingName -> Search -> m [Hit a] Source #

scanSearch uses the scroll API of elastic, for a given IndexName and MappingName. Note that this will consume the entire search result set and will be doing O(n) list appends so this may not be suitable for large result sets. In that case, getInitialScroll and advanceScroll are good low level tools. You should be able to hook them up trivially to conduit, pipes, or your favorite streaming IO abstraction of choice. Note that ordering on the search would destroy performance and thus is ignored.

getInitialScroll :: (FromJSON a, MonadThrow m, MonadBH m) => IndexName -> MappingName -> Search -> m (Either EsError (SearchResult a)) Source #

For a given search, request a scroll for efficient streaming of search results. Note that the search is put into SearchTypeScan mode and thus results will not be sorted. Combine this with advanceScroll to efficiently stream through the full result set

getInitialSortedScroll :: (FromJSON a, MonadThrow m, MonadBH m) => IndexName -> MappingName -> Search -> m (Either EsError (SearchResult a)) Source #

For a given search, request a scroll for efficient streaming of search results. Combine this with advanceScroll to efficiently stream through the full result set. Note that this search respects sorting and may be less efficient than getInitialScroll.

advanceScroll Source #

Arguments

:: (FromJSON a, MonadBH m, MonadThrow m) 
=> ScrollId 
-> NominalDiffTime

How long should the snapshot of data be kept around? This timeout is updated every time advanceScroll is used, so don't feel the need to set it to the entire duration of your search processing. Note that durations < 1s will be rounded up. Also note that NominalDiffTime is an instance of Num so literals like 60 will be interpreted as seconds. 60s is a reasonable default.

-> m (Either EsError (SearchResult a)) 

Use the given scroll to fetch the next page of documents. If there are no further pages, 'SearchResult.searchHits.hits' will be '[]'.

refreshIndex :: MonadBH m => IndexName -> m Reply Source #

refreshIndex will force a refresh on an index. You must do this if you want to read what you wrote.

>>> _ <- runBH' $ createIndex defaultIndexSettings testIndex
>>> _ <- runBH' $ refreshIndex testIndex

mkSearch :: Maybe Query -> Maybe Filter -> Search Source #

mkSearch is a helper function for defaulting additional fields of a Search to Nothing in case you only care about your Query and Filter. Use record update syntax if you want to add things like aggregations or highlights while still using this helper function.

>>> let query = TermQuery (Term "user" "bitemyapp") Nothing
>>> mkSearch (Just query) Nothing
Search {queryBody = Just (TermQuery (Term {termField = "user", termValue = "bitemyapp"}) Nothing), filterBody = Nothing, sortBody = Nothing, aggBody = Nothing, highlight = Nothing, trackSortScores = False, from = From 0, size = Size 10, searchType = SearchTypeQueryThenFetch, fields = Nothing, source = Nothing}

mkAggregateSearch :: Maybe Query -> Aggregations -> Search Source #

mkAggregateSearch is a helper function that defaults everything in a Search except for the Query and the Aggregation.

>>> let terms = TermsAgg $ (mkTermsAggregation "user") { termCollectMode = Just BreadthFirst }
>>> terms
TermsAgg (TermsAggregation {term = Left "user", termInclude = Nothing, termExclude = Nothing, termOrder = Nothing, termMinDocCount = Nothing, termSize = Nothing, termShardSize = Nothing, termCollectMode = Just BreadthFirst, termExecutionHint = Nothing, termAggs = Nothing})
>>> let myAggregation = mkAggregateSearch Nothing $ mkAggregations "users" terms

mkHighlightSearch :: Maybe Query -> Highlights -> Search Source #

mkHighlightSearch is a helper function that defaults everything in a Search except for the Query and the Aggregation.

>>> let query = QueryMatchQuery $ mkMatchQuery (FieldName "_all") (QueryString "haskell")
>>> let testHighlight = Highlights Nothing [FieldHighlight (FieldName "message") Nothing]
>>> let search = mkHighlightSearch (Just query) testHighlight

bulk :: MonadBH m => Vector BulkOperation -> m Reply Source #

bulk uses Elasticsearch's bulk API to perform bulk operations. The BulkOperation data type encodes the index/update/delete/create operations. You pass a Vector of BulkOperations and a Server to bulk in order to send those operations up to your Elasticsearch server to be performed. I changed from [BulkOperation] to a Vector due to memory overhead.

>>> let stream = V.fromList [BulkIndex testIndex testMapping (DocId "2") (toJSON (BulkTest "blah"))]
>>> _ <- runBH' $ bulk stream
>>> _ <- runBH' $ refreshIndex testIndex

pageSearch Source #

Arguments

:: From

The result offset

-> Size

The number of results to return

-> Search

The current seach

-> Search

The paged search

pageSearch is a helper function that takes a search and assigns the from and size fields for the search. The from parameter defines the offset from the first result you want to fetch. The size parameter allows you to configure the maximum amount of hits to be returned.

>>> let query = QueryMatchQuery $ mkMatchQuery (FieldName "_all") (QueryString "haskell")
>>> let search = mkSearch (Just query) Nothing
>>> search
Search {queryBody = Just (QueryMatchQuery (MatchQuery {matchQueryField = FieldName "_all", matchQueryQueryString = QueryString "haskell", matchQueryOperator = Or, matchQueryZeroTerms = ZeroTermsNone, matchQueryCutoffFrequency = Nothing, matchQueryMatchType = Nothing, matchQueryAnalyzer = Nothing, matchQueryMaxExpansions = Nothing, matchQueryLenient = Nothing, matchQueryBoost = Nothing})), filterBody = Nothing, sortBody = Nothing, aggBody = Nothing, highlight = Nothing, trackSortScores = False, from = From 0, size = Size 10, searchType = SearchTypeQueryThenFetch, fields = Nothing, source = Nothing}
>>> pageSearch (From 10) (Size 100) search
Search {queryBody = Just (QueryMatchQuery (MatchQuery {matchQueryField = FieldName "_all", matchQueryQueryString = QueryString "haskell", matchQueryOperator = Or, matchQueryZeroTerms = ZeroTermsNone, matchQueryCutoffFrequency = Nothing, matchQueryMatchType = Nothing, matchQueryAnalyzer = Nothing, matchQueryMaxExpansions = Nothing, matchQueryLenient = Nothing, matchQueryBoost = Nothing})), filterBody = Nothing, sortBody = Nothing, aggBody = Nothing, highlight = Nothing, trackSortScores = False, from = From 10, size = Size 100, searchType = SearchTypeQueryThenFetch, fields = Nothing, source = Nothing}

mkShardCount :: Int -> Maybe ShardCount Source #

mkShardCount is a straight-forward smart constructor for ShardCount which rejects Int values below 1 and above 1000.

>>> mkShardCount 10
Just (ShardCount 10)

mkReplicaCount :: Int -> Maybe ReplicaCount Source #

mkReplicaCount is a straight-forward smart constructor for ReplicaCount which rejects Int values below 0 and above 1000.

>>> mkReplicaCount 10
Just (ReplicaCount 10)

getStatus :: MonadBH m => m (Maybe Status) Source #

getStatus fetches the Status of a Server

>>> serverStatus <- runBH' getStatus
>>> fmap tagline (serverStatus)
Just "You Know, for Search"

Snapshot/Restore

Snapshot Repos

getSnapshotRepos :: (MonadBH m, MonadThrow m) => SnapshotRepoSelection -> m (Either EsError [GenericSnapshotRepo]) Source #

getSnapshotRepos gets the definitions of a subset of the defined snapshot repos.

updateSnapshotRepo Source #

Arguments

:: (MonadBH m, SnapshotRepo repo) 
=> SnapshotRepoUpdateSettings

Use defaultSnapshotRepoUpdateSettings if unsure

-> repo 
-> m Reply 

Create or update a snapshot repo

verifySnapshotRepo :: (MonadBH m, MonadThrow m) => SnapshotRepoName -> m (Either EsError SnapshotVerification) Source #

Verify if a snapshot repo is working. NOTE: this API did not make it into Elasticsearch until 1.4. If you use an older version, you will get an error here.

Snapshots

getSnapshots :: (MonadBH m, MonadThrow m) => SnapshotRepoName -> SnapshotSelection -> m (Either EsError [SnapshotInfo]) Source #

Get info about known snapshots given a pattern and repo name.

deleteSnapshot :: MonadBH m => SnapshotRepoName -> SnapshotName -> m Reply Source #

Delete a snapshot. Cancels if it is running.

Restoring Snapshots

restoreSnapshot Source #

Arguments

:: MonadBH m 
=> SnapshotRepoName 
-> SnapshotName 
-> SnapshotRestoreSettings

Start with defaultSnapshotRestoreSettings and customize from there for reasonable defaults.

-> m Reply 

Nodes

Request Utilities

encodeBulkOperations :: Vector BulkOperation -> ByteString Source #

encodeBulkOperations is a convenience function for dumping a vector of BulkOperation into an ByteString

>>> let bulkOps = V.fromList [BulkIndex testIndex testMapping (DocId "2") (toJSON (BulkTest "blah"))]
>>> encodeBulkOperations bulkOps
"\n{\"index\":{\"_type\":\"tweet\",\"_id\":\"2\",\"_index\":\"twitter\"}}\n{\"name\":\"blah\"}\n"

encodeBulkOperation :: BulkOperation -> ByteString Source #

encodeBulkOperation is a convenience function for dumping a single BulkOperation into an ByteString

>>> let bulkOp = BulkIndex testIndex testMapping (DocId "2") (toJSON (BulkTest "blah"))
>>> encodeBulkOperation bulkOp
"{\"index\":{\"_type\":\"tweet\",\"_id\":\"2\",\"_index\":\"twitter\"}}\n{\"name\":\"blah\"}"

Authentication

basicAuthHook :: Monad m => EsUsername -> EsPassword -> Request -> m Request Source #

This is a hook that can be set via the bhRequestHook function that will authenticate all requests using an HTTP Basic Authentication header. Note that it is *strongly* recommended that this option only be used over an SSL connection.

> (mkBHEnv myServer myManager) { bhRequestHook = basicAuthHook (EsUsername "myuser") (EsPassword "mypass") }

Reply-handling tools

isVersionConflict :: Reply -> Bool Source #

Was there an optimistic concurrency control conflict when indexing a document?

parseEsResponse :: (MonadThrow m, FromJSON a) => Reply -> m (Either EsError a) Source #

Tries to parse a response body as the expected type a and failing that tries to parse it as an EsError. All well-formed, JSON responses from elasticsearch should fall into these two categories. If they don't, a EsProtocolException will be thrown. If you encounter this, please report the full body it reports along with your Elasticsearch verison.