
Collect and calculate average times from log, then display top 10 longest durations

Submitted by: @import:stackexchange-codereview

Problem

Here's a novel-length summary of the issue:

I'm trying to write a VB.net program to help me collect remote site statistics from system-generated logs, but I'm a little like a carpenter who only knows how to use a hammer, and my project has turned into a bit of a monstrosity; as embarrassing as my code is, I would really love to get some professional opinions on how I can make it more streamlined and efficient, and generally less embarrassing.

Here's the basic rundown of relevant program functionality:

  1. The user can select up to five plaintext log files, each of which can be relatively long (the longest I have available for testing is 26k lines).

  2. The program reads through every line of each file in turn, using IO.File.ReadLines, looking for relevant entries (in this case, every time a terminal goes UP or DOWN), and records the information in an "entry" object, which is stored in a list of entry objects. (At this point, I do a lot with the entries, but I'm going to focus on just one activity for this question.)

  3. To find individual site outages, the program reads through the list of entries until it finds the first "DOWN" entry. It records the site ID, the site's group ID, and the outage start time. At this point, things start to get grossly inefficient.

  4. After it has collected the information listed in step 3, it records the current entry list index as a bookmark, then looks through all the following indices until it finds the next entry with that site ID. If that entry has an "UP" status, it records that entry's time as the outage end time and calculates the total duration of the outage, then goes back to the bookmark to look for the next outage start time. If it has a "DOWN" status, it scraps the current outage and goes back to the bookmark to look for the next outage start time. All of this information (and that recorded in step 3) is recorded in an "outage" object and stored in a list of outages. This step takes an extremely l

Solution

It seems to me that the main thing you need to do is sort out your logic.

The way I'd expect things to work would be that for any given site, a DOWN entry indicates that site has gone down, and it's counted as remaining down until you see an UP entry for the same site. If you see a number of DOWN messages for one site without an intervening UP message, it just means the site went down, and remained down.
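A minimal Python sketch of this reading, assuming each log entry has already been parsed into a (site ID, status, time) tuple. The function name, the tuple layout, and the sample log are hypothetical and chosen only for illustration; the same bookkeeping should carry over to a `Dictionary(Of String, Date)` in VB.NET.

```python
from datetime import datetime

def outages_first_down(entries):
    """Pair each UP with the earliest unmatched DOWN for the same site.

    `entries` is assumed to be an iterable of (site_id, status, time)
    tuples in log order; this layout is an assumption, not the asker's.
    """
    down_since = {}  # site_id -> time of the first DOWN not yet closed by an UP
    outages = []     # (site_id, start, end) tuples
    for site_id, status, when in entries:
        if status == "DOWN":
            # A repeated DOWN does not move the start: the site just stayed down.
            down_since.setdefault(site_id, when)
        elif status == "UP" and site_id in down_since:
            outages.append((site_id, down_since.pop(site_id), when))
    return outages

# Hypothetical sample: site A logs DOWN twice before coming back UP.
log = [
    ("A", "DOWN", datetime(2024, 1, 1, 10, 0)),
    ("B", "DOWN", datetime(2024, 1, 1, 10, 5)),
    ("A", "DOWN", datetime(2024, 1, 1, 10, 10)),
    ("A", "UP", datetime(2024, 1, 1, 10, 30)),
    ("B", "UP", datetime(2024, 1, 1, 10, 35)),
]
result_first = outages_first_down(log)
```

With this sample, site A's outage runs from 10:00 to 10:30, because the second DOWN at 10:10 is ignored.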

That does not seem to fit your description though. Based on how you've described the logic, it sounds like when/if you see two DOWN messages for a given site, you (effectively) re-start the search from the latter of the two. That would mean that given N consecutive DOWN entries for one site, you count the site as being down only from the last of these messages until the following UP message (for the same site). If that's the case, you can speed up the search quite a bit by simply keeping track of the most recently seen DOWN entry for a given site, and when you see an UP message for that site, you then enter it (along with the preceding DOWN message) into the outages collection.
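The "most recent DOWN wins" variant can be sketched the same way in Python. As before, the (site ID, status, time) tuple layout, the function name, and the sample log are assumptions made for illustration, not the asker's actual types.

```python
from datetime import datetime, timedelta

def outages_last_down(entries):
    """Pair each UP with the most recent DOWN seen for the same site.

    `entries` is assumed to be an iterable of (site_id, status, time)
    tuples in log order (a hypothetical layout).
    """
    last_down = {}  # site_id -> time of the most recently seen DOWN
    outages = []    # (site_id, start, end, duration) tuples
    for site_id, status, when in entries:
        if status == "DOWN":
            # A later DOWN replaces the earlier one, scrapping that outage.
            last_down[site_id] = when
        elif status == "UP" and site_id in last_down:
            start = last_down.pop(site_id)
            outages.append((site_id, start, when, when - start))
    return outages

# Hypothetical sample: site A logs DOWN twice before coming back UP.
log = [
    ("A", "DOWN", datetime(2024, 1, 1, 10, 0)),
    ("B", "DOWN", datetime(2024, 1, 1, 10, 5)),
    ("A", "DOWN", datetime(2024, 1, 1, 10, 10)),
    ("A", "UP", datetime(2024, 1, 1, 10, 30)),
    ("B", "UP", datetime(2024, 1, 1, 10, 35)),
]
result_last = outages_last_down(log)
```

Here site A's outage is counted from 10:10 rather than 10:00. Outages are appended in the order their UP entries appear, so sort the result by start time if that order matters.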

Either way, it sounds like your complexity is currently (roughly) O(N²), but can be reduced to O(N), which is likely to give quite a large improvement. I don't think it makes a lot of sense to spend much time looking at the code itself until it's clear what logic it really implements, and (especially) that the logic it implements represents a good way to solve your problem.
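To make the growth rate concrete, the following Python sketch counts how many entries a bookmark-and-scan-forward search inspects on an artificial worst case. Both functions and the synthetic input are hypothetical; this is a toy model of the approach described in the question, not the asker's code.

```python
def nested_scan_comparisons(entries):
    """Count the entries inspected by a bookmark-and-scan-forward search.

    `entries` is assumed to be a list of (site_id, status) pairs.
    """
    comparisons = 0
    for i, (site_id, status) in enumerate(entries):
        if status != "DOWN":
            continue
        # Scan forward from the bookmark for the next entry about this site.
        for j in range(i + 1, len(entries)):
            comparisons += 1
            if entries[j][0] == site_id:
                break
    return comparisons

def worst_case(k):
    """Synthetic log: k sites all go DOWN, then all come UP in the same order."""
    return [(s, "DOWN") for s in range(k)] + [(s, "UP") for s in range(k)]

small = nested_scan_comparisons(worst_case(100))  # 200 entries
large = nested_scan_comparisons(worst_case(200))  # 400 entries
```

On this input, doubling the log from 200 to 400 entries quadruples the count (10,000 to 40,000 inspections), whereas a single pass that keeps per-site state inspects each entry once.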

Context

StackExchange Code Review Q#43350, answer score: 7
