talash: Line oriented fast enough text search
This library searches a large number of candidates against a query using a given style. Two styles are provided. The default is the orderless style, in which a candidate matches if all the words in the query occur in it, regardless of their order. A fuzzy style is also provided, in which a candidate matches if all the characters of the query occur in it in order.
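To make the two styles concrete, here is a minimal sketch of the matching rules described above, written against plain `String` with only `base`. The names `orderlessMatch`, `fuzzyMatch` and `search` are hypothetical and are not talash's API. The sketch is case-sensitive and makes no attempt at the performance of the real searcher.

```haskell
import Data.List (isInfixOf, isSubsequenceOf)

-- Hypothetical names for illustration only; this is not talash's API.

-- | Orderless style: every whitespace-separated word of the query must
-- occur somewhere in the candidate, in any order.
orderlessMatch :: String -> String -> Bool
orderlessMatch query candidate = all (`isInfixOf` candidate) (words query)

-- | Fuzzy style: the characters of the query must occur in the candidate
-- in the same order, but not necessarily next to each other.
fuzzyMatch :: String -> String -> Bool
fuzzyMatch query candidate = query `isSubsequenceOf` candidate

-- | Keep only the candidates that match the query under the given style.
search :: (String -> String -> Bool) -> String -> [String] -> [String]
search style query = filter (style query)
```

For example, `search orderlessMatch "bar foo" ["foo/bar.txt", "baz.txt"]` keeps only `"foo/bar.txt"`, and `fuzzyMatch "fbt" "foo/bar.txt"` holds because `f`, `b` and `t` appear in that order.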
There is also a TUI searcher/selector interface, built as a brick app. It is like an extremely barebones version of fzf and is mostly intended as a starting point to be configured according to your needs, or to be embedded into other applications to provide a selection interface.
There is also a piped searcher/seeker, in which the searcher runs in the background and is used by a seeker that communicates with it over named pipes.
There is also a demo executable for both the brick app and the piped version; see talash help for how it gets its candidates and for other usage information.
Some care has been taken to make the searcher performant. On my laptop, when using the TUI app to search a set of about 60K files, the search results appear almost instantly. Searching among about 340K files in /usr/share, there is some but bearable lag between the keypresses and the search results. Searching among the 1264K files that fd finds from /, there is a lag of a second or so before the results appear. The three scenarios consume about 50 MiB, 130 MiB and 500 MiB of memory respectively, which is almost entirely due to the Vector Text storing the candidates.
The nice string matching interface provided by alfred-margaret is responsible for a big part of the performance, while vector-sized is responsible for most of the memory efficiency. Performance could potentially be improved further by using all the cores, but it is good enough for my typical use cases of searching among a few thousand, or at most a few tens of thousands, of candidates. As a result, parallel matching is unlikely to be implemented.
The package is lightly maintained. Bug reports are welcome, but any action on them will be slow. Patches are welcome for: 1. bugfixes; 2. simple performance improvements; 3. adding mouse bindings to the TUI; 4. new search styles, especially a better fuzzy one that matches each word in the query fuzzily while allowing the words themselves to match in any order (I am not sure what a sensible implementation of this is).
Note: This package has metadata revisions in the cabal description newer than included in the tarball. To unpack the package including the revisions, use 'cabal get'.
|Versions||0.1.0.0, 0.1.0.1, 0.1.1.0, 0.1.1.1|
|Dependencies||alfred-margaret (>=184.108.40.206 && <1.2), base (>=4.10.1 && <5), brick (>=0.60 && <0.70), bytestring (>=0.10.8 && <0.11), colorful-monoids (>=0.2.1 && <0.3), directory (>=1.3.6 && <1.4), ghc-compact (>=0.1.0 && <0.3), intro (>=0.4.0 && <0.10), microlens (>=0.4.0 && <0.5), microlens-th (>=0.4.0 && <0.5), talash, text (>=1.2.3 && <1.3), unix (>=2.7.2 && <2.8), unordered-containers (>=0.2.9 && <0.3), vector (>=0.12.1 && <0.13), vector-algorithms (>=0.8.0.3 && <0.9), vector-sized (>=1.4.0 && <1.5), vty (>=5.33 && <5.34)|
|Revised||Revision 1 made by rahguzar at 2022-05-28T13:31:36Z|
|Uploaded||by rahguzar at 2021-06-06T13:57:20Z|