HTTP site scraper

Submitted by: @import:stackexchange-codereview··

Problem

A job application of mine has been declined because the test project I submitted was not coded in a clean and straightforward way.

Fine, but that's all the feedback I got. Since I like to continuously improve my coding skills, would anyone here be willing to review the project on GitHub? It's not a complicated project, and a review would be really helpful to me.

The README contains the assignment specifics.

GitHub

Crux of the requirements:

Functionality


The main should define and run 3 requests SIMULTANEOUSLY, each request is defined below:



- 10thLetterRequest:
  - Grab a website’s content from the web
  - Hold the web page content as a String and make it accessible from the Main
  - Process the web page content: Find the 10th letter in the web page text and report it back to the Main program via a callback.

- Every10thLetterRequest:
  - Grab a website’s content from the web
  - Hold the web page content as a String and make it accessible from the Main
  - Process the web page content: Find every 10th letter (i.e. 10th, 20th, 30th etc.) in the web page text and report it back to the Main program via a callback. This callback should bring an appropriate data structure.

- WordCounterRequest:
  - Grab a website’s content from the web
  - Hold the web page content as a String and make it accessible from the Main
  - Process the web page content: Split the text into words by using whitespace characters (i.e. space, linefeed etc.) and write a simple algorithm to count every word in the document and report it back to the Main program via a callback. You can disregard html/javascript etc. and treat every word equally. The callback should bring an appropriate data structure of words and counts. So the main program should be able to ask how many times a certain word appears in the website.


Core code:

```
//
// dataRequest.m
// assignment
//
// Created by Mathijs on 2013-12-19.
// Copyright (c) 2013 Mathijs Vreeman. All
```

Solution

I have to agree with the evaluation.

10thLetterRequest and Every10thLetterRequest are pretty much exactly the same, except that 10thLetterRequest only needs the first of those letters.

Yet you decided to use different regexes for the two, and no code is reused between them.
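To make the reuse point concrete, here is a minimal sketch of one shared scan that serves both requests. The names `EveryNthLetter` and `NthLetter` are hypothetical and not taken from your project, and I am assuming that "letter" means a member of `letterCharacterSet`, so digits, whitespace and punctuation are skipped rather than counted.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper, not from the reviewed project.
// Collects every `step`-th letter of `text` as one-character strings.
// Assumption: a "letter" is any member of +[NSCharacterSet letterCharacterSet];
// other characters are skipped and do not advance the count.
// The scan works on UTF-16 units, so letters outside the BMP are ignored.
static NSArray *EveryNthLetter(NSString *text, NSUInteger step) {
    NSMutableArray *result = [NSMutableArray array];
    if (step == 0) {
        return result; // nothing sensible to return for a zero step
    }
    NSCharacterSet *letters = [NSCharacterSet letterCharacterSet];
    NSUInteger seen = 0;
    for (NSUInteger i = 0; i < text.length; i++) {
        unichar c = [text characterAtIndex:i];
        if (![letters characterIsMember:c]) {
            continue;
        }
        seen++;
        if (seen % step == 0) {
            [result addObject:[NSString stringWithCharacters:&c length:1]];
        }
    }
    return result;
}

// The single-letter request reuses the same scan and keeps only the first hit.
// Returns nil when `text` contains fewer than `n` letters.
static NSString *NthLetter(NSString *text, NSUInteger n) {
    return [EveryNthLetter(text, n) firstObject];
}
```

`NthLetter` scans the whole text even though it only needs the first match. For a single web page that is probably acceptable, and an early exit could be added later without touching the callers.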

The last assignment asks you to return all words with their counts; instead, you search for a single word. You basically misunderstood the question completely.
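As an illustration of what the word-count requirement seems to ask for, here is a small sketch that tallies every whitespace-separated token into a dictionary. `WordCounts` is a hypothetical name, and treating words case-sensitively is my assumption; the assignment text does not say either way.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper, not from the reviewed project.
// Splits `text` on whitespace and newlines and counts every token.
// Assumption: words are compared case-sensitively and markup is not stripped,
// in line with "disregard html/javascript and treat every word equally".
static NSDictionary *WordCounts(NSString *text) {
    NSCharacterSet *separators = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    NSMutableDictionary *counts = [NSMutableDictionary dictionary];
    for (NSString *word in [text componentsSeparatedByCharactersInSet:separators]) {
        if (word.length == 0) {
            continue; // consecutive separators produce empty strings
        }
        // A missing key gives nil, and messaging nil returns 0.
        NSUInteger current = [[counts objectForKey:word] unsignedIntegerValue];
        [counts setObject:[NSNumber numberWithUnsignedInteger:current + 1] forKey:word];
    }
    return counts;
}
```

With this shape the main program can ask for any word after the callback fires, for example `[[counts objectForKey:@"the"] unsignedIntegerValue]`, and a word that never occurs simply reads as 0. `NSCountedSet` with `countForObject:` would be a reasonable alternative.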

Other minor things:

- The name getTenthLetterAndWhenComplete lies: you return a string, whereas I would expect this to return just the 10th letter.

- Returning a comma-separated string of the tenth letters is not an appropriate data structure for getEveryTenthLetterAndWhenComplete; even worse, you build up an entire array and then throw it away.
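A rough sketch of callbacks whose types match what their names promise could look like the following. All names here (`LetterHandler`, `LettersHandler`, `ReportTenthLetters`) are made up for illustration, and the code assumes ARC and a compiler with blocks support.

```objc
#import <Foundation/Foundation.h>

// Hypothetical callback types, not from the reviewed project.
typedef void (^LetterHandler)(NSString *letter);   // exactly one letter, or nil if there is none
typedef void (^LettersHandler)(NSArray *letters);  // array of one-letter NSStrings

// Hypothetical reporter: takes the already collected every-10th letters and
// hands them to the callbacks without flattening them into a joined string.
static void ReportTenthLetters(NSArray *everyTenth,
                               LetterHandler onTenthLetter,
                               LettersHandler onEveryTenthLetter) {
    if (onTenthLetter) {
        // The single-letter callback receives only the first collected letter.
        onTenthLetter([everyTenth firstObject]);
    }
    if (onEveryTenthLetter) {
        // Pass an immutable snapshot of the array itself to the caller.
        onEveryTenthLetter([everyTenth copy]);
    }
}
```

The caller can then index, count or join the letters as it sees fit, instead of having to split a comma-separated string back apart.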

Context

StackExchange Code Review Q#38907, answer score: 6
