Copyright | (c) Bryan O'Sullivan 2007 JP Bernardy 2010 |
---|---|
License | BSD-style |
Maintainer | nobody |
Stability | experimental |
Portability | portable |
Safe Haskell | Safe-Inferred |
Language | Haskell2010 |
A suffix tree cluster implementation.
Since many function names (but not the type name) clash with
Prelude names, this module is usually imported qualified
, e.g.
import Data.SuffixTreeCluster (STree) import qualified Data.SuffixTreeCluster as T
This implementation constructs the suffix tree lazily, so subtrees are not created until they are traversed.
Estimates are given for performance. The value k is a constant; n is the length of a query string; and t is the number of elements (nodes and leaves) in a suffix tree.
Synopsis
- type Alphabet a = [a]
- type Edge a b = (Prefix a, STree a b)
- data Prefix a
- data STree a b = Node b [Edge a b]
- construct :: (Eq b, Monoid b, Ord a) => [([a], b)] -> STree a b
- elem :: Eq a => [a] -> STree a b -> Bool
- findEdge :: Eq a => [a] -> STree a b -> Maybe (Edge a b, Int)
- findTree :: Eq a => [a] -> STree a b -> Maybe (STree a b)
- findPath :: Eq a => [a] -> STree a b -> [Edge a b]
- mkPrefix :: [a] -> Prefix a
- prefix :: Prefix a -> [a]
- suffixes :: [a] -> [[a]]
- select :: Eq b => ([a], STree a [b]) -> Tree ([a], [b])
- commonStrings :: (Eq b, Ord a, Ord b) => [([a], [b])] -> Map [b] [[a]]
- printTree :: Show a => Tree a -> IO ()
Types
The prefix string associated with an Edge
. Use mkPrefix
to
create a value of this type, and prefix
to deconstruct one.
The suffix tree type. The implementation is exposed to ease the
development of custom traversal functions. Note that (
pairs are not stored in any order.Prefix
a,
STree
a b)
Construction
Querying
elem :: Eq a => [a] -> STree a b -> Bool Source #
O(n). Indicates whether the suffix tree contains the given sublist. Performance is linear in the length n of the sublist.
findEdge :: Eq a => [a] -> STree a b -> Maybe (Edge a b, Int) Source #
O(n). Finds the given subsequence in the suffix tree. On
failure, returns Nothing
. On success, returns the Edge
in the
suffix tree at which the subsequence ends, along with the number of
elements to drop from the prefix of the Edge
to get the "real"
remaining prefix.
Here is an example:
> find "ssip" (construct "mississippi") Just ((mkPrefix "ppi",Leaf),1)
This indicates that the edge (
matches,
and that we must strip 1 character from the string mkPrefix
"ppi",Leaf
)"ppi"
to get
the remaining prefix string "pi"
.
Performance is linear in the length n of the query list.
findTree :: Eq a => [a] -> STree a b -> Maybe (STree a b) Source #
O(n). Finds the subtree rooted at the end of the given query
sequence. On failure, returns Nothing
.
Performance is linear in the length n of the query list.
findPath :: Eq a => [a] -> STree a b -> [Edge a b] Source #
O(n). Returns the path from the Edge
in the suffix tree at
which the given subsequence ends, up to the root of the tree. If
the subsequence is not found, returns the empty list.
Performance is linear in the length of the query list.
Other useful functions
suffixes :: [a] -> [[a]] Source #
O(n). Returns all non-empty suffixes of the argument, longest first. Behaves as follows:
suffixes xs == init (tails xs)