patternMinor
Implementation of Set data type in Haskell
Viewed 0 times
typehaskellimplementationdataset
Problem
I'm an advanced beginner Haskell programmer. I implemented a set data type in Haskell and I wanted to find out if I made correct choices when writing it. Would there be a way to improve some of the algorithms like intersection etc.
data SET a = E | Add(a, SET a) deriving (Eq, Ord, Read, Show)
--test data
setx = listToSet[1,2,3,4]
setx2 = listToSet[3,4,2,1]
sety = listToSet[3,4,5,6]
setys = listToSet[3,4]
listToSet :: [a] -> SET a
listToSet [] = E
listToSet (s:x) = Add(s, listToSet x)
inS :: (Eq a) => a -> SET a -> Bool
inS a E = False
inS a (Add(value, set)) = if (value == a)
then True
else (inS a set)
nullSET :: (Eq a) => SET a -> Bool
nullSET s = s == E
headSET :: SET a -> a
headSET (Add(v, s)) = v
tailSET :: SET a -> SET a
tailSET (Add(v, s)) = s
addToSet :: (Eq a) => a -> SET a -> SET a
addToSet a (Add(value, set)) = if (inS a (Add(value, set)) == False)
then Add(a, Add(value, set))
else (Add(value, set))
setFilter :: (Eq a) => (a -> Bool) -> SET a -> SET a
setFilter f s
| nullSET s = E
| f h == False = setFilter f t
| otherwise = Add(h, setFilter f t)
where
h = headSET s
t = tailSET s
intersection :: (Eq a) => SET a -> SET a -> SET a
intersection s1 s2 = setFilter (\x -> inS x s1) s2
union :: (Eq a) => SET a -> SET a -> SET a
union set E = set
union set (Add(value2, set2)) = if (inS value2 set == False)
then Add(value2, (union set set2))
else union set set2
subSet :: (Eq a) => SET a -> SET a -> Bool
subSet set1 set2
| set1 == E = True
| inS h set2 = subSet t set2
| otherwise = False
where
h = headSET set1
t = tailSET set1
setEq :: (Eq a) => SET a -> SET a -> Bool
setEq s1 s2 = (subSet s1 s2) == (subSet s2 s1)Solution
module Set whereSelf-contained reusable chunks of code should be organized into modules.
data Set a = Empty | Set a (Set a)
deriving (Read, Show)All-caps types are not a thing, just
Set is fine. Single letter constructors are uncommon too, so instead of E use a meaningful name like Empty. Constructors can take multiple arguments, so don't use tuples where you don't have to. I'll also point out that this data type you've defined is isomorphic to [a]. Besides the magic square brackets, Haskell lists are a data type like any other, i.e. data [a] = [] | (:) a [a]. Notice the similarity? Is this a situation where you need a new datatype or might it make sense to use a newtype?-- Test data
setx, setx2, sety, setys :: Set Int
setx = listToSet [1,2,3,4]All top-level bindings should be given a type signature, get into the habit now and you'll save yourself the frustration of a lot of cryptic misplaced error messages later. These names are all over the place too, they should be more consistent, e.g.
exampleSetA, exampleSetB, exampleSetC. Don't crowd together functions and their arguments, leave some space and let it breathe.fromList :: [a] -> Set a
fromList [] = Empty
fromList (x:xs) = Set x (fromList xs)fromList is a common name for converting lists to arbitrary datatypes. When it conflicts with other names from modules you have imported, you disambiguate with qualified imports, not a menagerie of globally unique names. (x:xs) is a common naming pattern for lists, that is, some variable x, y, e, or whatever, and then the plural form (xs, ys, es... it's a pun, get it?). Again, space out function application, tuples are a datatype in Haskell, not a method of passing arguments.member :: (Eq a) => a -> Set a -> Bool
member _ Empty = False
member a (Set x xs) | a == x = True
| otherwise = member a xsA common function name for testing set membership is
member. When you don't use an argument name in the body of a function, replace it with _. If you compile with -Wall (as you should), GHC will warn you about unused bindings. That is, names that you defined in a function body but didn't use. Often this can help you to notice errors before you ever get to running your code! Use guards (e.g., | a == x) instead of if statements where you can. Guards are like a multiway if, when you have multiple cases it becomes much clearer than chaining ifs, get into the habit now, there's no reason not to.null :: Set a -> Bool
null Empty = True
null _ = FalseUse pattern matching where you can, it's more explicit than equality testing, often faster, and helps you avoid unnecessary constraints. Notice how my version lacks the
(Eq a) constraint?-- Mourn headSet and tailSet hereThese functions are meaningless on
Sets. Being unordered collections, Sets do not have meaningful heads or tails, so you should not leak implementation details by exposing them.insert :: (Eq a) => a -> Set a -> Set a
filter :: (a -> Bool) -> Set a -> Set a
intersection :: (Eq a) => Set a -> Set a -> Set a
union :: (Eq a) => Set a -> Set a -> Set a
isSubset :: (Eq a) => Set a -> Set a -> BoolAround now you're starting to feel the pain of your choice of
Set representation. Implementations are slow and unwieldy. For instance, your version of intersection is an \$O(n^2)\$ operation. You can improve running times by utilizing any of a number of standard algorithms, try using balanced binary trees to represent your sets instead of linked lists.instance (Eq a) => Eq (Set a) whereInstead of making up another name, write your own instances for standard functions like equality. If the derived instance doesn't do what you want, don't use it, don't keep it around (because you'll be unable to keep it from being exported to users of your library), and don't let it clutter up good namespace.
Code Snippets
module Set wheredata Set a = Empty | Set a (Set a)
deriving (Read, Show)-- Test data
setx, setx2, sety, setys :: Set Int
setx = listToSet [1,2,3,4]fromList :: [a] -> Set a
fromList [] = Empty
fromList (x:xs) = Set x (fromList xs)member :: (Eq a) => a -> Set a -> Bool
member _ Empty = False
member a (Set x xs) | a == x = True
| otherwise = member a xsContext
StackExchange Code Review Q#97896, answer score: 3
Revisions (0)
No revisions yet.