HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Improving General Algebraic DataType type-safe code for AA trees

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
datatypegeneraltypetreesforsafecodealgebraicimproving

Problem

As an exercise on GADTs I wrote a type-safe implementation of AA trees. I'm quite happy that the AANode data type correctly grasps the properties of the tree. Just from the type it can be concluded that the trees has the balancing property. But there I things I'd like to improve, but I didn't find a nice solution for it:

  • I need separate data type for the result of insert'. I'm afraid nothing can be done about that, as the algorithm really needs to finely distinguish the possible outcomes.



  • For inserting a new value into subtrees with a black root, I need a separate function insertBlack. The reason is that in this case, the tree level cannot rise, and it is necessary to capture this property. I'd be nicer to have it all in insert', but I'm afraid this would require dependent types (as the result type would depend on the color of the input).



  • I need a separate data type for the result of insertBlack. A solution would be to write the type of the result as an existential, like ... -> (exists c . AANode c (Succ n) a), but AFAIK GHC doesn't support that.



  • Even though the AA trees are quite simple, I feel that I have a lot of pattern matching for all the different case. At the end, I have 8 possibilities. Maybe they really need to be there, as for each node, we have to distinguish if we insert into the left or the right subtree, we need to distinguish its color, and also we need to match on the recursive result to correctly update the tree.



  • Perhaps, there is a better way how to capture the required properties of the trees? To construct the type of AANode differently?



Any other comments on how to make the code shorter, more readable or more beautiful are welcomed.

The code is below:

`{-# LANGUAGE GADTs #-}
{-# OPTIONS_GHC -fwarn-incomplete-patterns #-}

-- Based on https://en.wikipedia.org/wiki/AA_tree
-- If it happens that anyone ever wants to use this code for something,
-- it's licensed under The BSD 3-Clause License.

import qualified Data.F

Solution

As suggested, I'm doing a review myself after a few years.

First, let's use GHC's extensions to derive the phantom types from data definitions.
This also has the advantage of having proper kinds for each of these types.

{-# LANGUAGE GADTs, DataKinds, KindSignatures, TypeFamilies #-}
{-# OPTIONS_GHC -fwarn-incomplete-patterns #-}

-- |
-- Module: AATree
-- Copyright: Petr Pudlak
-- License: BSD-3
--
-- Based on https://en.wikipedia.org/wiki/AA_tree
module AATree where

import qualified Data.Foldable as F
import Data.Monoid

-- * Internals of the tree with type-level colors and depths.

-- | Types representing natural numbers:
data Nat = Zero | Succ Nat

-- | Phantom types for possible colors of nodes.
data Color = Red | Black


The definition of the main data type is basically unchanged.
One important change here is the color of Leaf. Originally it was black, but actually it needs to be red in order to have consistent color change in the insert operation, see SomeOrNext below.

-- | Every child node has a level -1 its parent node, except
-- the right node of a red node. Nodes have 3 type parameters:
-- Color (phantom), level (phantom), values.
data AANode (c :: Color) (n :: Nat) a where
    Leaf :: AANode 'Red 'Zero a
    -- | Red nodes can have the right child with the same level, but then
    -- the child must be black. The left child must have -1 of its level.
    BlackNode :: !(AANode c1 n a) -> a -> !(AANode c2 n a) -> AANode 'Black ('Succ n) a
    -- | Both children of a black node must have -1 of its level.
    -- They can have arbitrary colors.
    RedNode :: !(AANode c1 n a) -> a -> !(AANode Black ('Succ n) a) -> AANode 'Red ('Succ n) a

-- ** Two operations to fix improper tree shapes (see the Wikipedia article).

-- | Left node grows too large.
skew :: AANode 'Black ('Succ n) a -> a -> AANode c n a -> AANode 'Red ('Succ n) a
skew (BlackNode a x b) y r = RedNode a x (BlackNode b y r)

-- | Right node grows too large.
split :: AANode c n a -> a -> AANode 'Red (Succ n) a -> AANode 'Black (Succ (Succ n)) a
split a t (RedNode b r x) = BlackNode (BlackNode a t b) r x

-- ** Insert operation


Instead of two types, Outcome and AnyColor, which were holding outputs of the different insert operations, let's have just one, parametrized by Color.

-- | Holds the outcome of an insert operation on a node of color 'c'.
-- Inserting into a black node retains its depth and possibly chcanges it
-- to red ('Same'). Inserting into a red one either keeps it the same
-- ('Same') or changes it to a black node of increased depth ('Next').
data SameOrNext (c :: Color) (n :: Nat) a where
    Same :: AANode c_any n a -> SameOrNext c n a
    Next :: AANode 'Black ('Succ n) a -> SameOrNext 'Red n a


This allows us to have just a single insert function with a simple recursive definition.
There are still 8 cases to cover, but it seems this can't be simplified, at least not without a considerable sacrifice of code readability. In particular, we need to branch whether we move to the left or right, distinguish inserting into a red/black node, and act depending on the outcome of a recursive insert.

insert' :: (Ord a) => a -> AANode c n a -> SameOrNext c n a
insert' x Leaf = Next (BlackNode Leaf x Leaf)
insert' x (BlackNode l y r)
    | x  Same (BlackNode l' y r)
                    Next l' -> Same (skew l' y r)
    | otherwise = case insert' x r of
                    Same r' -> Same (BlackNode l y r')
                    Next r' -> Same (RedNode l y r')
insert' x (RedNode l y r)
    | x  Same (RedNode l' y r)
                    Next l' -> Next (BlackNode l' y r)
    | otherwise = case insert' x r of
                    Same r'@(BlackNode _ _ _) -> Same (RedNode l y r')
                    Same r'@(RedNode _ _ _) -> Next (split l y r')

-- * Public interface.

-- | The actual tree type hiding colors/levels.
data AATree a where
    AATree :: !(AANode c n a) -> AATree a


Minor change in the Foldable definition: Instead of using foldr, it's more natural to define it using foldMap, especially for trees. This can also have significant performance benefits in certain cases, like when using the Last monoid (untested).

instance F.Foldable (AANode c n) where
    foldMap f Leaf = mempty
    foldMap f (RedNode l y r) = F.foldMap f l <> f y <> F.foldMap f r
    foldMap f (BlackNode l y r) = F.foldMap f l <> f y <> F.foldMap f r

instance F.Foldable AATree where
    foldMap f (AATree t) = F.foldMap f t

insert :: (Ord a) => a -> AATree a -> AATree a
insert x (AATree t) = case insert' x t of
                                Same t -> AATree t
                                Next t -> AATree t

empty :: AATree a
empty = AATree Leaf

-- * Very simple test.

main :: IO ()
main = do
    let fromList :: Ord a => [a] -> AATree a
        fromList = foldr insert empty

    print . F.toList . fromList $ "ABCD Kocka prede."


I also changed comments to conform to proper the Haddock documenation

Code Snippets

{-# LANGUAGE GADTs, DataKinds, KindSignatures, TypeFamilies #-}
{-# OPTIONS_GHC -fwarn-incomplete-patterns #-}

-- |
-- Module: AATree
-- Copyright: Petr Pudlak
-- License: BSD-3
--
-- Based on https://en.wikipedia.org/wiki/AA_tree
module AATree where

import qualified Data.Foldable as F
import Data.Monoid

-- * Internals of the tree with type-level colors and depths.

-- | Types representing natural numbers:
data Nat = Zero | Succ Nat

-- | Phantom types for possible colors of nodes.
data Color = Red | Black
-- | Every child node has a level -1 its parent node, except
-- the right node of a red node. Nodes have 3 type parameters:
-- Color (phantom), level (phantom), values.
data AANode (c :: Color) (n :: Nat) a where
    Leaf :: AANode 'Red 'Zero a
    -- | Red nodes can have the right child with the same level, but then
    -- the child must be black. The left child must have -1 of its level.
    BlackNode :: !(AANode c1 n a) -> a -> !(AANode c2 n a) -> AANode 'Black ('Succ n) a
    -- | Both children of a black node must have -1 of its level.
    -- They can have arbitrary colors.
    RedNode :: !(AANode c1 n a) -> a -> !(AANode Black ('Succ n) a) -> AANode 'Red ('Succ n) a

-- ** Two operations to fix improper tree shapes (see the Wikipedia article).

-- | Left node grows too large.
skew :: AANode 'Black ('Succ n) a -> a -> AANode c n a -> AANode 'Red ('Succ n) a
skew (BlackNode a x b) y r = RedNode a x (BlackNode b y r)

-- | Right node grows too large.
split :: AANode c n a -> a -> AANode 'Red (Succ n) a -> AANode 'Black (Succ (Succ n)) a
split a t (RedNode b r x) = BlackNode (BlackNode a t b) r x

-- ** Insert operation
-- | Holds the outcome of an insert operation on a node of color 'c'.
-- Inserting into a black node retains its depth and possibly chcanges it
-- to red ('Same'). Inserting into a red one either keeps it the same
-- ('Same') or changes it to a black node of increased depth ('Next').
data SameOrNext (c :: Color) (n :: Nat) a where
    Same :: AANode c_any n a -> SameOrNext c n a
    Next :: AANode 'Black ('Succ n) a -> SameOrNext 'Red n a
insert' :: (Ord a) => a -> AANode c n a -> SameOrNext c n a
insert' x Leaf = Next (BlackNode Leaf x Leaf)
insert' x (BlackNode l y r)
    | x < y     = case insert' x l of
                    Same l' -> Same (BlackNode l' y r)
                    Next l' -> Same (skew l' y r)
    | otherwise = case insert' x r of
                    Same r' -> Same (BlackNode l y r')
                    Next r' -> Same (RedNode l y r')
insert' x (RedNode l y r)
    | x < y     = case insert' x l of
                    Same l' -> Same (RedNode l' y r)
                    Next l' -> Next (BlackNode l' y r)
    | otherwise = case insert' x r of
                    Same r'@(BlackNode _ _ _) -> Same (RedNode l y r')
                    Same r'@(RedNode _ _ _) -> Next (split l y r')

-- * Public interface.

-- | The actual tree type hiding colors/levels.
data AATree a where
    AATree :: !(AANode c n a) -> AATree a
instance F.Foldable (AANode c n) where
    foldMap f Leaf = mempty
    foldMap f (RedNode l y r) = F.foldMap f l <> f y <> F.foldMap f r
    foldMap f (BlackNode l y r) = F.foldMap f l <> f y <> F.foldMap f r

instance F.Foldable AATree where
    foldMap f (AATree t) = F.foldMap f t

insert :: (Ord a) => a -> AATree a -> AATree a
insert x (AATree t) = case insert' x t of
                                Same t -> AATree t
                                Next t -> AATree t

empty :: AATree a
empty = AATree Leaf


-- * Very simple test.

main :: IO ()
main = do
    let fromList :: Ord a => [a] -> AATree a
        fromList = foldr insert empty

    print . F.toList . fromList $ "ABCD Kocka prede."

Context

StackExchange Code Review Q#15017, answer score: 4

Revisions (0)

No revisions yet.