Concurrency interview

Submitted by: @import:stackexchange-codereview
Tags: concurrency, interview, stackoverflow

Problem

A little while back I had an interview where they posed a problem, in summary:

  • Watch a certain directory and process incoming JSON files.
  • These JSON files have various "Type" fields.
  • Every second, report how many files of each "Type" were processed, and the average processing time (from when a file was added to disk to when it was added to the sums) across all files.

The solution was supposed to optimize for a minimal "processing time", which is why I chose a concurrent approach.

After I submitted the solution, I didn't hear back for about a month, then got rejected with no feedback at all. Since I've never done anything like this before, I expect it has problems, and I would greatly appreciate any feedback.

Here's how I represent an Event:

```
type Event struct {
    Type      string
    CreatedAt time.Time
}
```


Here's my main loop:

```
func main() {
    config := Config{}
    config.readFrom("./config.json")

    fmt.Println("Monitoring...")
    fmt.Println("Press Enter to Exit")
    fmt.Println("--------------------\n")

    foundEvents := make(chan Event)
    normalizedEvents := make(chan Event)
    ticker := time.NewTicker(1 * time.Second)

    go discoverNewEvents(config, foundEvents)
    go normalizeEvents(foundEvents, normalizedEvents)
    go outputOnTick(normalizedEvents, ticker)

    bufio.NewReader(os.Stdin).ReadString('\n')
}
```


Here's discoverNewEvents, which watches the directory and deserializes the JSON:

```
func discoverNewEvents(appConfig Config, out chan Event) {
    for _ = range time.Tick(1 * time.Microsecond) {
        filesInfo, _ := ioutil.ReadDir(appConfig.InputDirectory)

        for _, fileInfo := range filesInfo {
            inputPath := path.Join(appConfig.InputDirectory, fileInfo.Name())
            processedPath := path.Join(appConfig.ProcessedDirectory, fileInfo.Name())
            rawJson, _ := ioutil.ReadFile(inputPath)

            event := Event{}
            _ = json.Unmarshal(rawJson, &event)

            event.CreatedAt = fileInfo.ModTim
            // (the listing is cut off here)
```

Solution

The main issue I see is that it appears you are busy waiting on the directory, i.e. you are reading its contents at regular intervals to see if there are new files.

Note: I have not written a single Go program, so correct me if I'm wrong, but are you trying to poll the directory every microsecond?

The right approach is to use file system notifications from the operating system. For instance, check out this library:

https://github.com/go-fsnotify/fsnotify

With file system notifications, your program can remain idle until something interesting happens, at which point you receive an event on a channel.
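To illustrate the shape of that event-driven loop, here is a minimal sketch that uses only the standard library. The `fileCreated` type and the `consume` function are hypothetical stand-ins: in a real program the channel would be fed by an OS-level watcher such as the library linked above, whose actual API is not shown here.

```go
package main

import (
	"fmt"
	"sync"
)

// fileCreated is a hypothetical notification type. A real watcher library
// would deliver its own event type on its own channel.
type fileCreated struct {
	Path string
}

// consume blocks on the notifications channel and calls handle for each
// path. There is no timer and no directory scan: the goroutine sleeps
// until a notification arrives, and returns when the channel is closed.
func consume(notifications <-chan fileCreated, handle func(path string)) {
	for n := range notifications {
		handle(n.Path)
	}
}

func main() {
	notifications := make(chan fileCreated)

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		consume(notifications, func(path string) {
			fmt.Println("new file:", path)
		})
	}()

	// Stand-in for the watcher: two files "appear" in the directory.
	notifications <- fileCreated{Path: "input/a.json"}
	notifications <- fileCreated{Path: "input/b.json"}
	close(notifications)
	wg.Wait()
}
```

Compared with the polling loop in the question, the consumer does no work between notifications, so it neither burns CPU nor adds up to a full polling interval of latency before a new file is noticed.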

This is an interesting architecture problem and I might have more to say about the rest of your program later.

A few other minor things...

The variable filesInfo should be named fileInfos - i.e. the 's' should be on the end. filesInfo conveys the idea of a singular piece of information about a bunch of files, whereas fileInfos is more clearly a plurality of file information.

Also, "info" is a very generic term. Everything is information. It appears that fileInfo is really a "directory entry". You'll see the term "dirent" commonly used in the POSIX world to describe a directory entry, e.g. in this man page for the readdir() library function:

http://man7.org/linux/man-pages/man3/readdir.3.html

Context

StackExchange Code Review Q#97229, answer score: 2
