HiveBrain v1.2.0
Get Started
← Back to all entries
patterncsharpMinor

Searching files in directory with various filters

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
directorysearchingwithfilesfiltersvarious

Problem

Requirements

Note: the requirements are invented by me for practicing functional programming.

Functional Requirements

Given a directory, all files below the directory (and its sub directories) should be filtered by one of the available filter conditions and then printed to the console.

The filter condition can be

a) a filter function based on a FileInfo object

  • If the file passes the filter, the full path should be printed to console



b) a Regex that tries to match the content of the file.

  • Binary files should be always filtered out



  • If the file passes the filter, the full path should be printed to the console



  • For each matching line, the line number and the full line should be printed to the console



Technical Requirements

  • the solution should be as functional as possible



  • the solution should work without mutable state



  • For performance reasons, the whole program should work lazy (we don't want to create a large list of all files and it's state and finally working on that data structure)



Solution

Usage

[]
let main argv = 

    let path = @"C:\Temp\CommandLineFSharp"

    // regex file content fiter
    let regex = Regex("Regex", RegexOptions.Compiled ||| RegexOptions.IgnoreCase)
    let title = "All files that match the regex 'Regex'"
    let allFilesContainingError = FileContentRegexFilter(title, regex)
    findIn path allFilesContainingError

    // FileInfo filter
    let title = "All files whose name start with 'A'"
    let allFilesStartingWithA = FileInfoFilter(title, fun fi -> fi.Name.StartsWith("a", StringComparison.InvariantCultureIgnoreCase))
    findIn path allFilesStartingWithA

    Console.ReadLine() |> ignore
    0


Output

```
/*--------------------------------------------------------------------------------
All files that match the regex 'Regex'
--------------------------------------------------------------------------------
C:\Temp\CommandLineFSharp\FileSystem.fs
[line: 18] | FileContentRegexFilter of St

Solution

I meant to answer this a while back, but I completely forgot. At any rate, let's talk about a couple things.

First, try not to use a pattern match if there's no pattern. In this case you have:

match reader with
        | r when r.EndOfStream -> true
        | r when r.Read() |> char |> isControlChar -> false
        | _ -> processReader reader


When

if reader.EndOfStream = true then true
elif reader.Read() |> char |> isControlChar = true then false
else processReader reader


Will do just fine, and in fact may be easier to read for some (YMMV).

This next block looks like it violates SRP a lot:

let findIn directoryPath filter =
    match filter with
        | FileInfoFilter(title, _) | FileContentRegexFilter(title, _) ->
            printLine ()
            printfn @" %s" title
            printLine ()                

    let rec findInternal fileSystemItem =
        match fileSystemItem with
            | File(fi) -> processFilter filter fi
            | Directory(di, subs) -> subs |> Seq.iter findInternal 

    findInternal (directoryPath |> (DirectoryInfo >> createDirectory))

    printLine ()
    printfn ""


Why does a findIn have to print everything it found? Why can't it just return a sequence of the results?

Let's talk about the specific concerns you have.


I've tried to be as descriptive as possible. However, F# provides lots of ways of doing the same (e.g. (func val) or (val |> func)). Any suggestion for further improvements / are there any fragments that are hard to understand?

Generally, it doesn't matter which style you use, though using the pipe-right (or pipe-next) operator has a better readability. Consider the following snippet:

let value2 =
    values
    |> Array.toList
    |> List.map (fun v -> Math.Sin(float v))
    |> List.choose (fun v -> if v >= 0.0 then Some v else None)
    |> List.rev
    |> List.takeWhile (fun v -> v > 0.05)
    |> List.fold (fun v acc -> acc + v) 0.0


Versus:

let value2 =
    List.fold (fun v acc -> acc + v) 0.0
        (List.takeWhile (fun v -> v > 0.05)
        (List.rev
        (List.choose (fun v -> if v >= 0.0 then Some v else None)
        (List.map (fun v -> Math.Sin(float v))
        (Array.toList values)))))


I shouldn't have to tell you which one I would prefer. Just look at all those parenthesis.

F# also contains a double-pipe-right and triple-pipe-right operator, which will pipe a tuple to the function. (I'm not sure if you were already aware of that or not.) So you can write something like:

let append s1 s2 = s1 + "." + s2
let result = ("abc", "def") ||> append


And regarding readability, keep your indentation consistent:

let private getMatchingLines (r:Regex) (file:FileInfo) = 
        if (file |> isTextFile) then
            file 
            |> enumerateLines 
            |> Seq.mapi (fun idx line -> (line, idx+1))
            |> Seq.filter (fst >> r.IsMatch)    
        else
            Seq.empty


Let's bring that back a few spaces:

let private getMatchingLines (r:Regex) (file:FileInfo) = 
    if (file |> isTextFile) then
        file 
        |> enumerateLines 
        |> Seq.mapi (fun idx line -> (line, idx+1))
        |> Seq.filter (fst >> r.IsMatch)    
    else
        Seq.empty



Is it possible to simplify some of the code fragments?

You didn't define 'simplify', but let's consider fewer LoC. You don't really have many places (without a major structural rewrite) that you can reduce LoC. Your code is currently good and it's also concise. Everything (almost) follows SRP well, and it's easy to reason about for the most part.

You could make things a bit easier to reason about by considering places where you use multiple compositions in opposite directions, such as matchingLines |> (not Seq.isEmpty |> not. Now that doesn't read to a human as simply, I can see why you wrote it the way you did, but if we follow the mantra of imagining the next maintainer is a psychopath with a sword who knows where you live, and has a very low breaking point, this probably isn't the best way to write it.


That is the point I am most interested in. Actually, the code works fine for the functional requirements mentioned above. But as far as new (even small) requirements come up, it feels that large parts of the program must be rewritten.


...


Is there a way (or something like best practices similar to SOLID for OOP) for designing / organizing functional code to become more extendable?

You've discovered one of the major downfalls of FP: it's built to just work, and it's generally not as extensible as one would like. One of the major disadvantages of immutability and designing for it is that to add something that, in a language such as C# would be so simple, you usually end up making major changes.

However, this doesn't have to be the case. You could reduce the impact with memoization, function composition and a basic object / type / class to contain all your wo

Code Snippets

match reader with
        | r when r.EndOfStream -> true
        | r when r.Read() |> char |> isControlChar -> false
        | _ -> processReader reader
if reader.EndOfStream = true then true
elif reader.Read() |> char |> isControlChar = true then false
else processReader reader
let findIn directoryPath filter =
    match filter with
        | FileInfoFilter(title, _) | FileContentRegexFilter(title, _) ->
            printLine ()
            printfn @" %s" title
            printLine ()                

    let rec findInternal fileSystemItem =
        match fileSystemItem with
            | File(fi) -> processFilter filter fi
            | Directory(di, subs) -> subs |> Seq.iter findInternal 

    findInternal (directoryPath |> (DirectoryInfo >> createDirectory))

    printLine ()
    printfn ""
let value2 =
    values
    |> Array.toList
    |> List.map (fun v -> Math.Sin(float v))
    |> List.choose (fun v -> if v >= 0.0 then Some v else None)
    |> List.rev
    |> List.takeWhile (fun v -> v > 0.05)
    |> List.fold (fun v acc -> acc + v) 0.0
let value2 =
    List.fold (fun v acc -> acc + v) 0.0
        (List.takeWhile (fun v -> v > 0.05)
        (List.rev
        (List.choose (fun v -> if v >= 0.0 then Some v else None)
        (List.map (fun v -> Math.Sin(float v))
        (Array.toList values)))))

Context

StackExchange Code Review Q#161332, answer score: 2

Revisions (0)

No revisions yet.