patternMinor
Analysis tool for a log file
Viewed 0 times
fileloganalysisfortool
Problem
I've written a bit of code designed to go through a log file and determine successes and failures for certain criteria for each server in an environment. The code works just fine, except that on environments of larger size, it's just too slow. Currently it's running on a 2000 server environment with 18 criteria and 5000 lines of logs being checked and takes about 14 minutes on my machine. The problem is that it's intended to be used on an environment of about 18000 servers with about 40 criteria and 175,000 lines of logs.
I'm hoping someone can help me come up with something I'm not seeing right now that would help with the speed, because waiting an hour and hoping Excel doesn't crash doesn't really speed up this process.
Here's what I'm trying to do:
This is the code I'm running (there are comments in there to help understanding):
```
Dim lastCol As Double, lastRow As Double, lastSensorRow As Double
Sub B_L3ToolRun()
SpeedUp
Dim numSuc As Long, numMaj As Long, numWar As Long, numCri As Long, numErr As Long
lastRow = Sheets("Results").Range("A" & Rows.Count).End(xlUp).Row //End of IPs
lastSensorRow = Sheets("Data").Range("A" & Rows.Count).End(xlUp).Row //End of Log File
lastCol = 3 //Minimum columns used
For I = 4 To 150 //Discovers number of criteria used
If Cells(1, I).Value <> "" Then
lastCol = lastCol + 1
End If
Next I
Sheets("Results").Activate
For J = 3 To lastCol //This is a check for each criteria being checked
For I = 2 To lastRow //This is a check for each IP in the list (2000)
numSuc = WorksheetFunction.CountIfs(Sheets("Data").Range("B:B"), "" & Sheets("Results").Cells(1, J).Value & "", _ //This checks for number of given criteria
Sheets("Data").Range("E:E"), "=normal", _ //This refines it to only successful ones
I'm hoping someone can help me come up with something I'm not seeing right now that would help with the speed, because waiting an hour and hoping Excel doesn't crash doesn't really speed up this process.
Here's what I'm trying to do:
This is the code I'm running (there are comments in there to help understanding):
```
Dim lastCol As Double, lastRow As Double, lastSensorRow As Double
Sub B_L3ToolRun()
SpeedUp
Dim numSuc As Long, numMaj As Long, numWar As Long, numCri As Long, numErr As Long
lastRow = Sheets("Results").Range("A" & Rows.Count).End(xlUp).Row //End of IPs
lastSensorRow = Sheets("Data").Range("A" & Rows.Count).End(xlUp).Row //End of Log File
lastCol = 3 //Minimum columns used
For I = 4 To 150 //Discovers number of criteria used
If Cells(1, I).Value <> "" Then
lastCol = lastCol + 1
End If
Next I
Sheets("Results").Activate
For J = 3 To lastCol //This is a check for each criteria being checked
For I = 2 To lastRow //This is a check for each IP in the list (2000)
numSuc = WorksheetFunction.CountIfs(Sheets("Data").Range("B:B"), "" & Sheets("Results").Cells(1, J).Value & "", _ //This checks for number of given criteria
Sheets("Data").Range("E:E"), "=normal", _ //This refines it to only successful ones
Solution
Things I like:
-
You've used meaningful variable names (I think
-
Your code is nicely indented, good use of
-
Your code is nicely commented. This combined with the above 2 points makes it very easy to understand and follow.
And now for the review:
Tips and Tricks
Codenames
All sheets have a "Name" which is what the user sees and can edit.
Assuming your "Data" sheet has codename "ws_Data", these 2 statements are equivalent:
Always reference codenames and not only can you give your sheet variables meaningful names, there's no danger of someone changing the name and breaking your macro.
Final Row/Column
Personally, I recommend
No strings, no loops, just neat, clean and simple.
Proper Variable Scoping
Variables should be declared as close as is practicable to where they're used. Crucially, Variables should be Procedure-Level where possible, then Module-Level where possible, then and only then Global level.
Why on earth are these Module-Level variables? There's only one procedure. Develop good habits now and put them inside their procedure where they belong. As a side note, if you only ever use
And now for the performance optimisation
You have 2 options:
Create a new sheet, fill in a countifs() that references the other sheet and auto-fill a grid as appropriate. Excel is highly optimised for large grids of worksheet functions.
Accessing a worksheet is slow. And you are accessing it every time you specify a cell reference. An array, on the other hand, is just data laid out in sequential memory blocks, so accessing a location in an array is orders of magnitude faster than accessing a location in a worksheet.
Start your program doing this:
For each worksheet:
You're going to be referencing the same 3 columns over and over again, so store them in their own arrays:
Repeat for the other 2 columns, and you now have 3 lists of column data stored in arrays.
Now just replace your
And this should run at least an order of magnitude faster.
Optimisation Summary
If you're working with data, then strip away all of the excess (containers, sheets, formatting, highlighting etc.) and put the raw data in an Array.
If you can (easily) re-create a worksheet function in VBA, then do so.
If you're going to be referencing the same things over and over, store them in a variable and reference the variable. The same goes for the results of calculations (A.K.A. memoisation).
Get really familiar with using Arrays and moving data from sheets to arrays and back again. It'll save you a lot of trouble down the road.
-
You've used meaningful variable names (I think
numMajorErr would be an even better convention for your counters but it's a minor quibble)-
Your code is nicely indented, good use of
_ to split your functions onto multiple lines for readability.-
Your code is nicely commented. This combined with the above 2 points makes it very easy to understand and follow.
And now for the review:
Tips and Tricks
Codenames
All sheets have a "Name" which is what the user sees and can edit.
Sheets("Data") is referencing a sheet name. However, names can change, especially when users are involved. Codenames, on the other hand, can only be set/changed in the VBE:Assuming your "Data" sheet has codename "ws_Data", these 2 statements are equivalent:
Sheets("Data").Range(), ws_Data.Range.Always reference codenames and not only can you give your sheet variables meaningful names, there's no danger of someone changing the name and breaking your macro.
Final Row/Column
Personally, I recommend
finalRow = Cells(Rows.Count, 1).End(xlUp).Row
finalColumn = Cells(1, Columns.Count).End(xlToLeft).ColumnNo strings, no loops, just neat, clean and simple.
Proper Variable Scoping
Variables should be declared as close as is practicable to where they're used. Crucially, Variables should be Procedure-Level where possible, then Module-Level where possible, then and only then Global level.
Dim lastCol As Double, lastRow As Double, lastSensorRow As DoubleWhy on earth are these Module-Level variables? There's only one procedure. Develop good habits now and put them inside their procedure where they belong. As a side note, if you only ever use
Dim for Procedure-Level, Private for Module-Level and Public for Global-Level variables, it will make reviewing your code (especially on larger projects) even easier.And now for the performance optimisation
You have 2 options:
- Just put your worksheet functions in an actual worksheet.
Create a new sheet, fill in a countifs() that references the other sheet and auto-fill a grid as appropriate. Excel is highly optimised for large grids of worksheet functions.
- Which I will focus on here: Store all the sheet data in arrays and work from those.
Accessing a worksheet is slow. And you are accessing it every time you specify a cell reference. An array, on the other hand, is just data laid out in sequential memory blocks, so accessing a location in an array is orders of magnitude faster than accessing a location in a worksheet.
Start your program doing this:
Dim wsResultsTable As Variant
wsResultsTable = Array()
Dim wsDataTable As Variant
wsDataTable = Array()For each worksheet:
'/ Determine Range of worksheet Data
'/ = ws.Range()
'/ =
'/ Now Array(1,1) = Cells(1,1).value etc.You're going to be referencing the same 3 columns over and over again, so store them in their own arrays:
Dim bColumnData As Variant
bColumnData = Array()
Dim colIndex As Long
colIndex = 2 ' "B" Column
Dim LB1 As Long, UB1 As Long ' Determine L/U bounds of the array
LB1 = LBound(wsDataTable, 1)
UB1 = UBound(wsDataTable, 1)
ReDim bColumnData(LB1 To UB1) ' Create 1-D array of the same size
For i = LB1 To UB1
bColumnData(i) = wsDataTable(i, colIndex)
Next iRepeat for the other 2 columns, and you now have 3 lists of column data stored in arrays.
Now just replace your
countIfs with a VBA loop:for a
for b
for i = LB1 to UB1
if then = + 1
if then = + 1
if then = + 1
if then = + 1
next i
'/ Print results to sheet etc.
next b
next aAnd this should run at least an order of magnitude faster.
Optimisation Summary
If you're working with data, then strip away all of the excess (containers, sheets, formatting, highlighting etc.) and put the raw data in an Array.
If you can (easily) re-create a worksheet function in VBA, then do so.
If you're going to be referencing the same things over and over, store them in a variable and reference the variable. The same goes for the results of calculations (A.K.A. memoisation).
Get really familiar with using Arrays and moving data from sheets to arrays and back again. It'll save you a lot of trouble down the road.
Code Snippets
finalRow = Cells(Rows.Count, 1).End(xlUp).Row
finalColumn = Cells(1, Columns.Count).End(xlToLeft).ColumnDim lastCol As Double, lastRow As Double, lastSensorRow As DoubleDim wsResultsTable As Variant
wsResultsTable = Array()
Dim wsDataTable As Variant
wsDataTable = Array()'/ Determine Range of worksheet Data
'/ <Range> = ws.Range()
'/ <Array> = <Range>
'/ Now Array(1,1) = Cells(1,1).value etc.Dim bColumnData As Variant
bColumnData = Array()
Dim colIndex As Long
colIndex = 2 ' "B" Column
Dim LB1 As Long, UB1 As Long ' Determine L/U bounds of the array
LB1 = LBound(wsDataTable, 1)
UB1 = UBound(wsDataTable, 1)
ReDim bColumnData(LB1 To UB1) ' Create 1-D array of the same size
For i = LB1 To UB1
bColumnData(i) = wsDataTable(i, colIndex)
Next iContext
StackExchange Code Review Q#111736, answer score: 7
Revisions (0)
No revisions yet.