Is there a better way to walk a directory tree?
Problem
I just started learning Haskell, and I wrote a function that walks a directory tree recursively and passes the content of each directory to a callback function:
- The content of the directory is a tuple containing a list of all the sub-directories and a list of file names.
- If there is an error, the error is passed to the callback function instead of the directory content (it uses `Either`).
So here is the type of the callback function:

    callback :: FilePath -> Either IOError ([FilePath], [FilePath]) -> IO ()

And here is the type of my `walk` function:

    walk :: FilePath -> (FilePath -> Either IOError ([FilePath], [FilePath]) -> IO ()) -> IO ()

And here is my complete code:

    module Test where

    import System.FilePath ((</>))
    import Control.Monad (filterM, forM_, return)
    import System.IO.Error (tryIOError, IOError)
    import System.Directory (getDirectoryContents, doesFileExist, doesDirectoryExist)

    -- Example callback: prints what was found in a directory, or the error.
    myVisitor :: FilePath -> Either IOError ([FilePath], [FilePath]) -> IO ()
    myVisitor path result = do
        case result of
            Left error -> do
                putStrLn $ "I've tried to look in " ++ path ++ "."
                putStrLn $ "\tThere was an error: "
                putStrLn $ "\t\t" ++ (show error)
            Right (dirs, files) -> do
                putStrLn $ "I've looked in " ++ path ++ "."
                putStrLn $ "\tThere was " ++ (show $ length dirs) ++ " directorie(s) and " ++ (show $ length files) ++ " file(s):"
                forM_ (dirs ++ files) (\x -> putStrLn $ "\t\t- " ++ x)
                putStrLn ""

    -- Calls the visitor on this directory, then recurses into each sub-directory.
    walk :: FilePath -> (FilePath -> Either IOError ([FilePath], [FilePath]) -> IO ()) -> IO ()
    walk path visitor = do
        result <- tryIOError listdir
        case result of
            Left error -> do
                visitor path result
            Right (dirs, files) -> do
                visitor path result
                forM_
                    (map (\x -> path </> x) dirs)
                    (\x -> walk x visitor)
      where
        listdir = do
            entries <- getDirectoryContents path >>= filterHidden
            subdirs <- filterM isDir entries
            files <- filterM isFile entries
            return (subdirs, files)
          where
            isFile entry = doesFileExist (path </> entry)
            isDir entry = doesDirectoryExist (path </> entry)
            -- Skips names starting with '.', which also drops "." and "..".
            filterHidden paths = return $ filter (\path -> head path /= '.') paths

Solution
You might be interested in the concept of iteratees/pipes, which can be used to solve this problem. It lets you separate producing the tree from consuming it, without direct callback functions: you create a producer that enumerates the directory tree, optionally some filters that modify the data, and a separate consumer that works on the data, and then you compose and run the whole pipeline. See also "Streaming recursive descent of a directory in Haskell".
There are also specialized Haskell packages for that such as directory-tree, which you can use or study.
Edit: As an example I reworked your code using conduit. This library (and others based on the same principle) has several advantages, namely:
- It separates the producer of the data from its consumer.
- In a source you just call `yield` when you want to send a piece of data to the pipe.
- In a sink you call `await` whenever you want to receive a piece of data (here we used the more specialized `awaitForever`).
- You can have conduits that sit in the middle and consume and produce values at the same time. They can do whatever processing on the stream, mixing calls to `yield` and `await` as they wish.
- This allows you to create complex computations where the behavior of your components depends on data sent/received earlier. We use this in our source (traversing a directory tree), and I also added it to the sink (visitor): it keeps track of how many directories it has been passed so far.
- Both source and sink (and intermediate conduits, if any) can have finalizers.
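To make the roles in the list above concrete, here is a minimal sketch of a pipeline with two intermediate conduits. It assumes a recent conduit (1.3 or later), where the building blocks are called `ConduitT`, `.|` and `runConduitPure` rather than the older `Source`, `Sink` and `$$`. The names `onlyVisible` and `numbered` are made up for this illustration, and the input is a fixed list rather than a real directory.

```haskell
import Control.Monad (when)
import Data.Conduit (ConduitT, await, awaitForever, runConduitPure, yield, (.|))
import qualified Data.Conduit.List as CL

-- Stateless filter in the middle of the pipe: passes on every name
-- that does not start with a dot. (Hypothetical helper.)
onlyVisible :: Monad m => ConduitT FilePath FilePath m ()
onlyVisible = awaitForever $ \name ->
    when (take 1 name /= ".") (yield name)

-- Stateful conduit: its behavior depends on how many values it has
-- already seen, because it pairs each value with a running counter.
-- (Hypothetical helper.)
numbered :: Monad m => ConduitT a (Int, a) m ()
numbered = loop 1
  where
    loop n = do
        next <- await
        case next of
            Nothing -> return ()
            Just x  -> yield (n, x) >> loop (n + 1)

main :: IO ()
main = print $ runConduitPure $
    CL.sourceList [".git", "src", "test"]  -- producer
        .| onlyVisible                     -- filter
        .| numbered                        -- stateful transformer
        .| CL.consume                      -- consumer: collect into a list
-- prints [(1,"src"),(2,"test")]
```

Each stage can be replaced independently, for example swapping the list source for one that walks a real directory, without touching the others.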
Some suggestions:
- Create your own data types instead of using combinations of `Either` and `(,)`. It makes your code shorter and easier to understand.
- Sometimes it's worth declaring new functions instead of using complex `case ... of` expressions. It can make code easier to read.
- hlint can suggest how to (syntactically) improve a piece of code.
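As a small illustration of the first two suggestions, the sketch below replaces `Either IOError ([FilePath], [FilePath])` with a dedicated type and uses pattern-matching function clauses instead of a nested `case ... of`. The helper names `fromEither` and `describe` are invented for this example, and only the Prelude is needed.

```haskell
-- A dedicated type says what each alternative means.
data DirContent
    = DirList [FilePath] [FilePath]  -- sub-directories, then files
    | DirError IOError

-- Converts the original Either/tuple representation (hypothetical helper).
fromEither :: Either IOError ([FilePath], [FilePath]) -> DirContent
fromEither = either DirError (uncurry DirList)

-- One clause per constructor instead of a case expression (hypothetical helper).
describe :: DirContent -> String
describe (DirError e) = "error: " ++ show e
describe (DirList dirs files) =
    show (length dirs) ++ " directories, " ++ show (length files) ++ " files"

main :: IO ()
main = do
    putStrLn (describe (fromEither (Right (["src"], ["a.hs", "b.hs"]))))
    -- prints "1 directories, 2 files"
    putStrLn (describe (fromEither (Left (userError "boom"))))
```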

    import System.FilePath ((</>))
    import Control.Monad (filterM, forM_, return)
    import System.IO.Error (tryIOError, IOError)
    import System.Directory (getDirectoryContents, doesFileExist, doesDirectoryExist)

    import Control.Monad.Trans.Class (lift)
    import Data.Conduit

    data DirContent = DirList [FilePath] [FilePath]
                    | DirError IOError
    data DirData = DirData FilePath DirContent

    -- Produces directory data
    walk :: FilePath -> Source IO DirData
    walk path = do
        result <- lift $ tryIOError listdir
        case result of
            Right dl@(DirList subdirs files)
                -> do
                    yield (DirData path dl)
                    forM_ subdirs (walk . (path </>))
            Left error
                -> yield (DirData path (DirError error))
      where
        listdir = do
            entries <- getDirectoryContents path >>= filterHidden
            subdirs <- filterM isDir entries
            files <- filterM isFile entries
            return $ DirList subdirs files
          where
            isFile entry = doesFileExist (path </> entry)
            isDir entry = doesDirectoryExist (path </> entry)
            filterHidden paths = return $ filter (\path -> head path /= '.') paths

    -- Consume directories
    myVisitor :: Sink DirData IO ()
    myVisitor = addCleanup (\_ -> putStrLn "Finished.") $ loop 1
      where
        loop n = do
            lift $ putStrLn $ ">> " ++ show n ++ ". directory visited:"
            r <- await
            case r of
                Nothing -> return ()
                Just r  -> lift (process r) >> loop (n + 1)
        process (DirData path (DirError error)) = do
            putStrLn $ "I've tried to look in " ++ path ++ "."
            putStrLn $ "\tThere was an error: "
            putStrLn $ "\t\t" ++ show error
        process (DirData path (DirList dirs files)) = do
            putStrLn $ "I've looked in " ++ path ++ "."
            putStrLn $ "\tThere was " ++ show (length dirs) ++ " directorie(s) and " ++ show (length files) ++ " file(s):"
            forM_ (dirs ++ files) (putStrLn . ("\t\t- " ++))

    main :: IO ()
    main = do
        walk "/tmp" $$ myVisitor

Context
StackExchange Code Review Q#23231, answer score: 4