# Type-safe BST and AVL trees Several implementations of binary search trees (BST) and balanced binary search trees (AVL). Verify at compile time if the operations over BST/AVL trees preserves the order of the keys and the balance in the heights. For both of them there's one unsafe implementation and three type-safe implementations. The last ones differ one how the structural invariants are handled at the type level. ## Summary - [Prerequisites](#prerequisites) - [Installing](#installing) - [Project Structure](#project-structure) - [Interface](#interface) - [Examples](#examples) - [Benchmark](#benchmark) ## Prerequisites ```Shell ghc (>= 8.10.2) # GHC Haskell compiler cabal (>=3.0.0.0) # Haskell package manager python3 (>=3.6.8) # Python interpreter (only needed for running the benchmarks) Some python package manager (e.g., pip) ``` No external Haskell libraries are needed. ## Installing If you only want to use the library (without benchmarks), the easiest way to install it is through [Hackage](https://hackage.haskell.org/). The following command will download the library and install it: ```Shell cabal install --lib balanced-binary-search-tree ``` If you also want to see or run the benchmarks, clone the repository: ```Shell git clone https://github.com/nico-rodriguez/balanced-binary-search-tree.git ``` The project may be used locally (inside the folder with `cabal repl`) or it may be install for system wide use: ```Shell cd balanced-binary-search-tree cabal install --lib ``` For usage examples see the [Examples](#examples) section. ## Project Structure ```Shell balanced-binary-search-tree │ README.md └───benchmark | │ │ └── ... └───src/Data/Tree │ ITree.hs │ Node.hs └───AVL │ │ FullExtern.hs │ │ Extern.hs │ │ Intern.hs │ │ Unsafe.hs │ │ Invariants.hs │ └───FullExtern │ │ │ Examples.hs │ └───Extern │ │ │ Constructor.hs │ │ │ Balance.hs, BalanceProofs.hs │ │ │ Insert.hs, InsertProofs.hs │ │ │ Lookup.hs │ │ │ Delete.hs, DeleteProofs.hs │ │ │ Examples.hs │ └───Intern │ │ │ Constructor.hs │ │ │ Balance.hs │ │ │ Insert.hs │ │ │ Lookup.hs │ │ │ Delete.hs │ │ │ Examples.hs │ └───Unsafe │ │ Examples.hs └───BST │ └── ... ``` - For details on the `benchmark` directory, see the [Benchmark](#benchmark) section. - `ITree.hs` implements the `Tree` and `ITree` data types. - `Node.hs` implements the nodes of the trees. - Both `ITree.hs` and `Node.hs` are used in both BST and AVL type-safe trees. - The structure of `Data/Tree/AVL/` and `Data/Tree/BST/` is similar. - `Data/Tree/{BST,AVL}/Invariants.hs` implements the BST/AVL invariants. - `Data/Tree/{BST,AVL}/Unsafe.hs` contains an unsafe implementation of BST/AVL trees. The code was extracted and refactored from `Data/Tree/AVL/Extern/{Balance,Insert,Lookup,Delete}.hs`, 'un-lifting' the type level computations to the value level and removing all proof functions and terms. - `Data/Tree/{BST,AVL}/FullExtern.hs` contains the implementation of the full externalist approach. It provides functionality for performing several operations in a row over trees and checking the invariants at the end. - `Data/Tree/{BST,AVL}/Extern/` contains the implementation of BST/AVL trees and its operations for the externalist approach; likewise, `Data/Tree/{BST,AVL}/Intern/` folder contains the implementation for the internalist approach. Notice that there are no `*Proofs.hs` source files inside `Data/Tree/{BST,AVL}/Intern/`. That's because the proofs and operations in the internalist approach are implemented together. - All four approaches, `Data/Tree/{BST,AVL}/{Unsafe,FullExtern,Extern,Intern}/`, have an `Examples.hs` with usage examples of the BST/AVL operations. - In order to use BST/AVL trees, only one of `Data/Tree/{BST,AVL}/{Unsafe,FullExtern,Extern,Intern}.hs` need to be imported. They all export the basic functionality for inserting, looking and deleting on the corresponding BST/AVL tree. See the [Interface](#interface) section for exploring the functions exported by each of them. ## Interface Both externalist and internalist approaches have a common interface for manipulating BST/AVL trees. The difference in their implementation is the approach taken when defining the structural invariants that represent the conditions for a tree to be BST/AVL. The interface for the unsafe and the full externalist approaches are only slightly different. ### AVL trees (some function constraints omitted) For the unsafe approach, the interface is - `emptyAVL :: AVL a`, an empty ITree. - `insertAVL :: Show a => Int -> a -> AVL a -> AVL a`, inserts a node in a ITree. If the tree already has a node with the given key, the value is updated. - `lookupAVL :: Int -> AVL a -> Maybe a`, returns the value stored in the node with the given key. Returns `Nothing` if the key is not found. - `deleteAVL :: Int -> AVL a -> AVL a`, delete a node with a given key. If the tree doesn't have a node with that key, it just returns the original tree. For the full externalist approach, the interface is: - `EmptyITree :: ITree 'EmptyTree`, an empty ITree. - `insert :: Node x a -> ITree t -> ITree (Insert x a t)`, inserts a node in a ITree. If the tree already has a node with key `x`, the value is updated. - `lookup :: (t ~ 'ForkTree l (Node n a1) r, Member x t ~ 'True) => Proxy x -> ITree t -> a`, lookup a key in a ITree. The constraint `t ~ 'ForkTree l (Node n a1) r` checks at compile time if the tree is not empty; the constraint `Member x t ~ 'True` checks at compile time that the tree `t` has a node with key `x` (ensuring that the lookup will return some value). - `delete :: Proxy x -> ITree t -> (Delete x t)`, delete a node with a given key in a ITree. If the tree doesn't have a node with key `x`, it just returns the original tree. - `AVL :: ITree t -> IsBSTT t -> IsBalancedT t -> AVL t`, constructor for type-safe AVL trees. - `mkAVL :: (IsBSTC t, IsBalancedC t) => ITree t -> AVL t`, constructor for type-safe AVL from an ITree, with automatic proof term construction. For the externalist and internalist approaches, the interface is the same and is as follows: - `emptyAVL :: AVL 'EmptyTree`, an empty AVL tree. - `insertAVL :: Proxy x -> a -> AVL t -> AVL (Insert x a t)`, inserts a value of type `a` with key `x` in an AVL of type `AVL t`. If the tree already has a node with key `x`, the value is updated. - `lookupAVL :: (t ~ 'ForkTree l (Node n a1) r, Member x t ~ 'True) => Proxy x -> AVL t -> a`, returns the value, of type `a`, that is associated to the key `x` in the AVL tree of type `AVL t`. The constraint `t ~ 'ForkTree l (Node n a1) r` checks at compile time if the tree is not empty; the constraint `Member x t ~ 'True` checks at compile time that the tree `t` has a node with key `x` (ensuring that the lookup will return some value). - `deleteAVL :: Proxy x -> AVL t -> AVL (Delete x t)`, deletes the node with key `x` in an AVL of type `AVL t`. If the tree doesn't have a node with key `x`, it just returns the original tree. ### BST trees The interfaces are analogous to those for AVL trees. Just replace "AVL" for "BST" in the functions and modules names. ## Examples For more usage examples see the `Examples.hs` file for each approach. ### Unsafe ```haskell import Data.Tree.AVL.Unsafe (deleteAVL, emptyAVL, insertAVL, lookupAVL) import Prelude (Int, Float, Bool(True,False)) -- | Test Balanced Binary Tree avle = emptyAVL avlt1 = insertAVL 20 'f' avle avlt2 = insertAVL 60 'o' avlt1 avlt3 = insertAVL 30 'l' avlt2 -- | Just 'l' l1 = lookupAVL 30 avlt3 -- | Nothing: key 10 is not in the tree avlt8 n = lookupAVL 10 avlt3 avlt4 = deleteAVL 20 avlt3 avlt5 = deleteAVL 60 avlt3 -- | There's no error. Just return the same tree avlt6 = deleteAVL 40 avlt3 ``` ### Full Extern A full externalist approach means grouping the operations over `ITree`, only constructing the BST/AVL and verifying the invariants at the end: ```haskell {-# LANGUAGE DataKinds #-} {-# LANGUAGE TypeFamilies #-} import Data.Proxy (Proxy (Proxy)) import Data.Tree.AVL.FullExtern (AVL (AVL), mkAVL, ITree (EmptyITree), delete, insert, lookup) import Data.Tree.ITree (ITree(EmptyITree,ForkITree)) import Data.Tree.Node (mkNode) import Prelude (Bool (False, True), Float, Int, ($)) -- | Proxies for the node keys p0 = Proxy :: Proxy 0 p1 = Proxy :: Proxy 1 p2 = Proxy :: Proxy 2 p3 = Proxy :: Proxy 3 p4 = Proxy :: Proxy 4 p5 = Proxy :: Proxy 5 p6 = Proxy :: Proxy 6 p7 = Proxy :: Proxy 7 -- | Insert several values in a row and check the BST and AVL invariants at the end avl = mkAVL t where t = insert (mkNode p4 'f') $ insert (mkNode p2 (4::Int)) $ insert (mkNode p6 "lala") $ insert (mkNode p3 True) $ insert (mkNode p5 ([1,2,3]::[Int])) $ insert (mkNode p0 (1.8::Float)) $ insert (mkNode p7 [False]) EmptyITree -- | For performing a lookup, it's necessary to take the ITree 't' out of the AVL constructor l1' = case avl of AVL t _ _ -> lookup (Proxy::Proxy 3) t -- | Compile time error: key 1 is not in the tree avl and left subtree at node with key 4 has height +2 greater than the right subtree -- avlError = mkAVL $ -- ForkITree (ForkITree -- (ForkITree -- EmptyITree (mkNode p0 'a') EmptyITree) -- (mkNode p7 4) -- EmptyITree) -- (mkNode p4 'f') -- EmptyITree -- | Delete several values in a row and check the BST and AVL invariants at the end avlt2 = case avl of AVL t _ _ -> mkAVL t' where t' = delete p7 $ delete p4 $ delete p1 $ delete p0 $ delete p2 $ delete p6 $ delete p5 $ delete p3 t ``` ### Extern and Intern In the externalist and internalist approaches, the invariants are checked after every operation performed over the tree. Some examples for the externalist approach: ```haskell {-# LANGUAGE DataKinds #-} import Data.Proxy (Proxy (Proxy)) import Data.Tree.AVL.Extern (deleteAVL, emptyAVL, insertAVL, lookupAVL) import Prelude (Bool (False, True), Float, Int, String) -- | Proxies for the node keys p2 = Proxy :: Proxy 2 p4 = Proxy :: Proxy 4 p7 = Proxy :: Proxy 7 -- Insert value 'f' with key 4 avlt1 = insertAVL p4 'f' emptyAVL -- Insert value [1,2] with key 2 avlt2 = insertAVL p2 [1,2] avlt1 -- list = [1,2] list = lookupAVL p2 avlt2 -- | Error: key 7 is not in the tree avlt2 -- err = lookupAVL p7 avlt2 -- Delete node with key 4 avlt3 = deleteAVL p4 avlt2 -- Delete node with key 2. Returns the empty AVL avlt4 = deleteAVL p2 avlt3 ``` For using the internalist approach, the code example for the externalist approach works with the only difference in the import list: ```haskell import Data.Tree.AVL.Intern (emptyAVL,insertAVL,lookupAVL,deleteAVL) -- Instead of import Data.Tree.AVL.Extern (emptyAVL,insertAVL,lookupAVL,deleteAVL) ``` ### Examples: BST trees The previous examples used AVL trees. For using AVL trees just replace the imported modules along with the functions: ```Shell mkAVL -> mkBST emptyAVL -> emptyBST insertAVL -> insertBST lookupAVL -> lookupBST deleteAVL -> deleteBST ``` Notice that operations for BST may only be applied to BST trees, and operations for AVL trees may only be applied for AVL trees. For instance, this is not possible if `bst2` is not an AVL tree: ```haskell deleteAVL (Proxy::Proxy 5) bst2 ``` It gives an error at compile time. ## Benchmark ### Structure ```Shell balanced-binary-search-tree │ README.md └───benchmark └───AVL │ └───FullExtern | | | Benchmark.hs │ │ │ | | └───Insert │ │ │ | | └───Lookup │ │ │ | | └───Delete │ └───Extern │ │ └── ... │ └───Intern │ │ └── ... │ └───Unsafe │ └── ... └───BST │ └── ... ``` There are benchmarks for both BST and AVL trees for each approach (clone the repository for viewing or using the benchmarks). For instance, in the folder `benchmark/AVL/FullExtern/` there are three folders and one source file: `Insert/`, `Lookup/`, `Delete/` and `Benchmark.hs`. Inside each folder there are different source files for benchmarking each operation under several tree sizes; they're split in different files in order to be able to measure not only the running times but also the compile times. The source files `Benchmark.hs` performs all of the running time benchmarks defined inside the folders `Insert/`, `Lookup/` and `Delete/`. ### Running the benchmark Given the compilation effort for running the benchmarks, the Hackage package comes without them. In order to run the benchmarks, you should clone the repository and switch to `benchmark` branch. For running all the benchmarks, use `benchmark/avg-bench.py` script. There's a `requirements.txt` file with all the python dependencies needed to run the script. If you're using `pip`, just run `pip install -r requirements.txt`.