snippetMinor
F# program to generate a Map of substrings frequencies
Viewed 0 times
frequenciesmapsubstringsprogramgenerate
Problem
I wrote the following F# program that takes an input txt file and create a Map that contains all possible substrings of size N and their frequency in the file.
In other words if I call the program with a text file with the following content:
James John Max Gary Jess Gilles Mary
With a
Here is the program:
```
module NameGenerator
open System
open System.Collections.Generic
// Open a file and read all lines into IEnumerable
let readInputFile (filePath:string) =
System.IO.File.ReadAllText(filePath)
// Parses a string and count the total number of occurrences of substrings of size length
let rec countOccurrences (input:string) (occurrenceTable:Map) length =
let adjLen = length - 1
match input |> Seq.toList with
| head :: tail when tail.Length >= adjLen ->
let other = Seq.take adjLen tail |> Seq.toList
let occurrence = (head :: other |> Array.ofList |> String)
// add current occurrence to the occurrence table
let updatedMap = match occurrenceTable.ContainsKey (occurrence) with
| true -> occurrenceTable.Add(occurrence, occurrenceTable.[occurrence] + 1.0)
| false -> occurrenceTable.Add(occurrence, 1.0)
// call the function recursively with the rest of the string
countOccurrences (tail |> Array.ofList |> String) updatedMap length
| _ -> occurrenceTable
// Given a map that contains the a collection of string and their respective counts, return a
// frequency map that is obtained by count / total count
let buildFrenquencyTable (occurrenceTable:Map) =
// fold occurence dict, count total
let total = Map.fold (fun acc key value -> acc + value) 0.0 occurrenceTable
// map over previous map and replace values with count / total count
Map.map (fun key value -> Math.Round(value / total, 6)) occurrenceTable
// Given an input file cr
In other words if I call the program with a text file with the following content:
James John Max Gary Jess Gilles Mary
With a
length of 2, the probabilityTable contains a value with a key of "ar" with a value of 0.054054.Here is the program:
```
module NameGenerator
open System
open System.Collections.Generic
// Open a file and read all lines into IEnumerable
let readInputFile (filePath:string) =
System.IO.File.ReadAllText(filePath)
// Parses a string and count the total number of occurrences of substrings of size length
let rec countOccurrences (input:string) (occurrenceTable:Map) length =
let adjLen = length - 1
match input |> Seq.toList with
| head :: tail when tail.Length >= adjLen ->
let other = Seq.take adjLen tail |> Seq.toList
let occurrence = (head :: other |> Array.ofList |> String)
// add current occurrence to the occurrence table
let updatedMap = match occurrenceTable.ContainsKey (occurrence) with
| true -> occurrenceTable.Add(occurrence, occurrenceTable.[occurrence] + 1.0)
| false -> occurrenceTable.Add(occurrence, 1.0)
// call the function recursively with the rest of the string
countOccurrences (tail |> Array.ofList |> String) updatedMap length
| _ -> occurrenceTable
// Given a map that contains the a collection of string and their respective counts, return a
// frequency map that is obtained by count / total count
let buildFrenquencyTable (occurrenceTable:Map) =
// fold occurence dict, count total
let total = Map.fold (fun acc key value -> acc + value) 0.0 occurrenceTable
// map over previous map and replace values with count / total count
Map.map (fun key value -> Math.Round(value / total, 6)) occurrenceTable
// Given an input file cr
Solution
Fortunately you're already following the functional idioms entirely. So that's a very good thing.
The only things I would comment on are a few minor details:
Those parenthesis are superfluous.
In this case, you have a lot of whitespace to the left of the
It's generally easier to read.
I absolutely love to see
In regards to this method:
You nailed tail-call recursion. I would be willing to bet that if you looked at the IL for this code you would see a beautiful
Overall, beautiful work. I think you did a phenomenal job with this code. :)
The only things I would comment on are a few minor details:
let occurrence = (head :: other |> Array.ofList |> String)Those parenthesis are superfluous.
let updatedMap = match occurrenceTable.ContainsKey (occurrence) with
| true -> occurrenceTable.Add(occurrence, occurrenceTable.[occurrence] + 1.0)
| false -> occurrenceTable.Add(occurrence, 1.0)In this case, you have a lot of whitespace to the left of the
match conditions, usually in this case I just break the match to a new line and then work from there.let updatedMap =
match occurrenceTable.ContainsKey (occurrence) with
| true -> occurrenceTable.Add(occurrence, occurrenceTable.[occurrence] + 1.0)
| false -> occurrenceTable.Add(occurrence, 1.0)It's generally easier to read.
let total = Map.fold (fun acc key value -> acc + value) 0.0 occurrenceTableI absolutely love to see
fold used! I think it's far overlooked and definitely has many use cases.In regards to this method:
let rec countOccurrences (input:string) (occurrenceTable:Map) length =You nailed tail-call recursion. I would be willing to bet that if you looked at the IL for this code you would see a beautiful
tail. call in there. Excellent work!Overall, beautiful work. I think you did a phenomenal job with this code. :)
Code Snippets
let occurrence = (head :: other |> Array.ofList |> String)let updatedMap = match occurrenceTable.ContainsKey (occurrence) with
| true -> occurrenceTable.Add(occurrence, occurrenceTable.[occurrence] + 1.0)
| false -> occurrenceTable.Add(occurrence, 1.0)let updatedMap =
match occurrenceTable.ContainsKey (occurrence) with
| true -> occurrenceTable.Add(occurrence, occurrenceTable.[occurrence] + 1.0)
| false -> occurrenceTable.Add(occurrence, 1.0)let total = Map.fold (fun acc key value -> acc + value) 0.0 occurrenceTablelet rec countOccurrences (input:string) (occurrenceTable:Map<string, float>) length =Context
StackExchange Code Review Q#151286, answer score: 2
Revisions (0)
No revisions yet.