HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Extracting information from spreadsheet into 12 tables (one per month)

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
tablesperintospreadsheetoneextractingmonthfrominformation

Problem

I have a file with more than 100k rows, but the structure is simple:

Date      _|_Name-Position-Color_|_Summ
17.11.2015 |"Name1               | 8813,52
           | Position1           |
          _|_Color1"            _|_
19.08.2015 |"Name2               | 3587,86
           | Position3           |
          _|_Color5"            _|_
12.01.2015 |"Name3               | 14,63
           | Position16          |
          _|_Color7"            _|_
07.12.2015 |"Name4               | 7129,97
           | Position11          |
           | Color3"             |


Result should be twelve identically formed tables from "Jan" to "Dec" sliced from "Name-Position-Color" column sheet1 placed into sheet3 as "Name-slice" -column and "Position-slice" -head row. The "color" part is no longer needed. Tables should be filled with multiplied "Name-slice" by "Position-slice" including a period in which they're positioned in the first list. I hope this is informative enough to understand.

So, I've managed to write the macro below, but it works really slowly even when I have only 228 rows in the list. It had worked fast just before I've added a calculation part. I think object programming could save some time, but I haven't learned it yet. I would really appreciate if someone could tell me the way to improve my code, so it'll work faster. Any advice would be really helpful too.

```
Sub tablesByMonths()

'def column in sheet1
colNum1 = 2
'def column in sheet3
colNum3 = 2 '2 is minimal for correct macro work
'def last row in sheet1
lastRow1 = Worksheets("Sheet1").Cells(Rows.Count, colNum1).End(xlUp).Row
'def first row in sheet1
firstRow1 = Worksheets("Sheet1").Cells(Rows.Count, colNum1).End(xlUp).End(xlUp).Row + 1
'def last row in sheet3
step = 2

Application.ScreenUpdating = False 'turns off dynamic screen update
Application.Calculation = xlCalculationManual 'turns off automatic formulas

'clears all used range in a sheet3
Worksheets("Sheet3").UsedRan

Solution

First some code style points:

  • Declare your


variables! Not only
does it help prevent a slew of bugs from cropping up, it makes it a
lot easier to read and understand what your code is doing. Looking
at only the code that was posted, I have no idea if they are local
or global scope or what data type they are. These are important
readability clues, which leads to the second point:

  • Comments should primarily explain the why, not the what, and


only the how if it isn't obvious. Who and when are strictly optional, but I generally remember if I've written a piece of code and Excel has file modified dates. For example - if I can't figure
out that Application.ScreenUpdating = False "turns off dynamic
screen update", then I probably don't have any business mucking
around in there anyway. Which leads to the next point:

  • Give your variables meaningful names. When going through a Sub


this long, I shouldn't have to scroll all the way to the top in
order to determine that colNum3 is the default(?) column in
Sheet3. Your code should strive to be self documenting. There's a reason I can figure out that Application.ScreenUpdating = False means "stop the screen from refreshing". If the Office team had decided that the property should be named .HazBlitting and take the enum arguments xlRly and xlOmg, then it wouldn't be so obvious. By the same token, if the loop For per = 1 To 12 were For monthNum = 1 To 12, then it would be obvious enough that the comment 'this counts months from Jan to Dec would be unnecessary.

  • If you have a variable that you never assign a different value to,


make it a constant. This way it's both clear that the value isn't
going to change in your code, and just as important, prevents you
from accidentally assigning another value to it (see point 1). In
your case, colNum1, colNum3, and Step should all be declared
as constants.

  • Taking point 4 one step further, if you perform the same calculation


with 2 constants repeatedly, the result of that calculation should
also be a constant. For example, you compute the value colNum3 + 1 at 13
different places in your Sub and do it in a deeply nested loop. In fact, there are only 11 times that you don't add one to it.
While the code with happily compute that 2 + 1 = 3 millions of times
without complaint, keep in mind that everyone already knows that
if colNum3 is always 2, then colNum3 + 1 will always be 3.

-
Expanding on point five even more... if you have to perform the same
calculation repeatedly, it should be in a variable. Consider the
calculation lastRow3_1 + Step. lastRow3_1 only changes once
every time through your outer loop and Step is a constant. But in
the 4 times that you use lastRow3_1 you never use it without
adding Step to it. If you're going to use the result of a
calculation more than once, store it in a variable and use that.

-
Speaking of the Step variable, you should avoid using VBA keywords
as variable names. Is this a typo? For Foo = 1 To Step? Sure
looks like one until I realize that Step is being used as a
variable. It also avoids code lines such as For Polka = Pause To Step
Step Step
, which makes my brain bleed out my ear a little bit.
Luckily the parser doesn't let you do If Like Like Like Then or If Then And If Then, but I digress...

  • Be consistent with your indentation and go in one level each Loop, If, Sub/Function, Select, Case, (and the others that don't immediately spring to mind because I do it so habitually now). I almost missed the fact that


For per = 1 To 12 doesn't continue until the very bottom of the
Sub. Indenting makes it obvious. That said, good job putting the
loop counters after the Next at the end. That makes a huge
difference (but doesn't mean you shouldn't indent too).

  • _ is VBA's line continuation operator. Use it if, for example,


you have a line of code that is 346 characters long before you
indent it. It's a lot harder to understand what code is doing if
you can't see it. By the time I finished scrolling to the right, it
was the only thing I could see anywhere in the code editor.

  • Speaking of scrolling, side-scrolling isn't the only scrolling that


makes it more difficult to determine at a glance what code is doing. Vertical scrolling does the same thing. The second thing I did after popping this code into the VBE (see point 6 for the first
thing I did) was to delete all of the empty lines and comment only
lines (see point 2) so I could see more of it at the same time.
Close, but not quite, even after closing the Immediate Window that
makes me feel naked if it isn't docked at the bottom of the VBE
window. The best way to address this is by splitting your Sub into
smaller pieces that each address a more tightly focused aspect of
the problem. Extract functions until you have a set of small,
manageable pieces of code.

Performance and Excel specific:

Every time you type a . between two tokens (except in a quoted liter

Context

StackExchange Code Review Q#124344, answer score: 5

Revisions (0)

No revisions yet.