patternMinor
Extracting information from spreadsheet into 12 tables (one per month)
Viewed 0 times
tablesperintospreadsheetoneextractingmonthfrominformation
Problem
I have a file with more than 100k rows, but the structure is simple:
Result should be twelve identically formed tables from "Jan" to "Dec" sliced from "Name-Position-Color" column sheet1 placed into sheet3 as "Name-slice" -column and "Position-slice" -head row. The "color" part is no longer needed. Tables should be filled with multiplied "Name-slice" by "Position-slice" including a period in which they're positioned in the first list. I hope this is informative enough to understand.
So, I've managed to write the macro below, but it works really slowly even when I have only 228 rows in the list. It had worked fast just before I've added a calculation part. I think object programming could save some time, but I haven't learned it yet. I would really appreciate if someone could tell me the way to improve my code, so it'll work faster. Any advice would be really helpful too.
```
Sub tablesByMonths()
'def column in sheet1
colNum1 = 2
'def column in sheet3
colNum3 = 2 '2 is minimal for correct macro work
'def last row in sheet1
lastRow1 = Worksheets("Sheet1").Cells(Rows.Count, colNum1).End(xlUp).Row
'def first row in sheet1
firstRow1 = Worksheets("Sheet1").Cells(Rows.Count, colNum1).End(xlUp).End(xlUp).Row + 1
'def last row in sheet3
step = 2
Application.ScreenUpdating = False 'turns off dynamic screen update
Application.Calculation = xlCalculationManual 'turns off automatic formulas
'clears all used range in a sheet3
Worksheets("Sheet3").UsedRan
Date _|_Name-Position-Color_|_Summ
17.11.2015 |"Name1 | 8813,52
| Position1 |
_|_Color1" _|_
19.08.2015 |"Name2 | 3587,86
| Position3 |
_|_Color5" _|_
12.01.2015 |"Name3 | 14,63
| Position16 |
_|_Color7" _|_
07.12.2015 |"Name4 | 7129,97
| Position11 |
| Color3" |Result should be twelve identically formed tables from "Jan" to "Dec" sliced from "Name-Position-Color" column sheet1 placed into sheet3 as "Name-slice" -column and "Position-slice" -head row. The "color" part is no longer needed. Tables should be filled with multiplied "Name-slice" by "Position-slice" including a period in which they're positioned in the first list. I hope this is informative enough to understand.
So, I've managed to write the macro below, but it works really slowly even when I have only 228 rows in the list. It had worked fast just before I've added a calculation part. I think object programming could save some time, but I haven't learned it yet. I would really appreciate if someone could tell me the way to improve my code, so it'll work faster. Any advice would be really helpful too.
```
Sub tablesByMonths()
'def column in sheet1
colNum1 = 2
'def column in sheet3
colNum3 = 2 '2 is minimal for correct macro work
'def last row in sheet1
lastRow1 = Worksheets("Sheet1").Cells(Rows.Count, colNum1).End(xlUp).Row
'def first row in sheet1
firstRow1 = Worksheets("Sheet1").Cells(Rows.Count, colNum1).End(xlUp).End(xlUp).Row + 1
'def last row in sheet3
step = 2
Application.ScreenUpdating = False 'turns off dynamic screen update
Application.Calculation = xlCalculationManual 'turns off automatic formulas
'clears all used range in a sheet3
Worksheets("Sheet3").UsedRan
Solution
First some code style points:
variables! Not only
does it help prevent a slew of bugs from cropping up, it makes it a
lot easier to read and understand what your code is doing. Looking
at only the code that was posted, I have no idea if they are local
or global scope or what data type they are. These are important
readability clues, which leads to the second point:
only the how if it isn't obvious. Who and when are strictly optional, but I generally remember if I've written a piece of code and Excel has file modified dates. For example - if I can't figure
out that
screen update", then I probably don't have any business mucking
around in there anyway. Which leads to the next point:
this long, I shouldn't have to scroll all the way to the top in
order to determine that
Sheet3. Your code should strive to be self documenting. There's a reason I can figure out that
make it a constant. This way it's both clear that the value isn't
going to change in your code, and just as important, prevents you
from accidentally assigning another value to it (see point 1). In
your case,
as constants.
with 2 constants repeatedly, the result of that calculation should
also be a constant. For example, you compute the value
different places in your
While the code with happily compute that 2 + 1 = 3 millions of times
without complaint, keep in mind that everyone already knows that
if
-
Expanding on point five even more... if you have to perform the same
calculation repeatedly, it should be in a variable. Consider the
calculation
every time through your outer loop and
the 4 times that you use
adding
calculation more than once, store it in a variable and use that.
-
Speaking of the
as variable names. Is this a typo?
looks like one until I realize that
variable. It also avoids code lines such as
Luckily the parser doesn't let you do
loop counters after the
difference (but doesn't mean you shouldn't indent too).
you have a line of code that is 346 characters long before you
indent it. It's a lot harder to understand what code is doing if
you can't see it. By the time I finished scrolling to the right, it
was the only thing I could see anywhere in the code editor.
makes it more difficult to determine at a glance what code is doing. Vertical scrolling does the same thing. The second thing I did after popping this code into the VBE (see point 6 for the first
thing I did) was to delete all of the empty lines and comment only
lines (see point 2) so I could see more of it at the same time.
Close, but not quite, even after closing the Immediate Window that
makes me feel naked if it isn't docked at the bottom of the VBE
window. The best way to address this is by splitting your
smaller pieces that each address a more tightly focused aspect of
the problem. Extract functions until you have a set of small,
manageable pieces of code.
Performance and Excel specific:
Every time you type a
- Declare your
variables! Not only
does it help prevent a slew of bugs from cropping up, it makes it a
lot easier to read and understand what your code is doing. Looking
at only the code that was posted, I have no idea if they are local
or global scope or what data type they are. These are important
readability clues, which leads to the second point:
- Comments should primarily explain the why, not the what, and
only the how if it isn't obvious. Who and when are strictly optional, but I generally remember if I've written a piece of code and Excel has file modified dates. For example - if I can't figure
out that
Application.ScreenUpdating = False "turns off dynamicscreen update", then I probably don't have any business mucking
around in there anyway. Which leads to the next point:
- Give your variables meaningful names. When going through a
Sub
this long, I shouldn't have to scroll all the way to the top in
order to determine that
colNum3 is the default(?) column inSheet3. Your code should strive to be self documenting. There's a reason I can figure out that
Application.ScreenUpdating = False means "stop the screen from refreshing". If the Office team had decided that the property should be named .HazBlitting and take the enum arguments xlRly and xlOmg, then it wouldn't be so obvious. By the same token, if the loop For per = 1 To 12 were For monthNum = 1 To 12, then it would be obvious enough that the comment 'this counts months from Jan to Dec would be unnecessary. - If you have a variable that you never assign a different value to,
make it a constant. This way it's both clear that the value isn't
going to change in your code, and just as important, prevents you
from accidentally assigning another value to it (see point 1). In
your case,
colNum1, colNum3, and Step should all be declaredas constants.
- Taking point 4 one step further, if you perform the same calculation
with 2 constants repeatedly, the result of that calculation should
also be a constant. For example, you compute the value
colNum3 + 1 at 13different places in your
Sub and do it in a deeply nested loop. In fact, there are only 11 times that you don't add one to it.While the code with happily compute that 2 + 1 = 3 millions of times
without complaint, keep in mind that everyone already knows that
if
colNum3 is always 2, then colNum3 + 1 will always be 3.-
Expanding on point five even more... if you have to perform the same
calculation repeatedly, it should be in a variable. Consider the
calculation
lastRow3_1 + Step. lastRow3_1 only changes onceevery time through your outer loop and
Step is a constant. But inthe 4 times that you use
lastRow3_1 you never use it withoutadding
Step to it. If you're going to use the result of acalculation more than once, store it in a variable and use that.
-
Speaking of the
Step variable, you should avoid using VBA keywordsas variable names. Is this a typo?
For Foo = 1 To Step? Surelooks like one until I realize that
Step is being used as avariable. It also avoids code lines such as
For Polka = Pause To Step
Step Step, which makes my brain bleed out my ear a little bit.Luckily the parser doesn't let you do
If Like Like Like Then or If Then And If Then, but I digress...- Be consistent with your indentation and go in one level each
Loop,If,Sub/Function,Select,Case, (and the others that don't immediately spring to mind because I do it so habitually now). I almost missed the fact that
For per = 1 To 12 doesn't continue until the very bottom of theSub. Indenting makes it obvious. That said, good job putting theloop counters after the
Next at the end. That makes a hugedifference (but doesn't mean you shouldn't indent too).
_is VBA's line continuation operator. Use it if, for example,
you have a line of code that is 346 characters long before you
indent it. It's a lot harder to understand what code is doing if
you can't see it. By the time I finished scrolling to the right, it
was the only thing I could see anywhere in the code editor.
- Speaking of scrolling, side-scrolling isn't the only scrolling that
makes it more difficult to determine at a glance what code is doing. Vertical scrolling does the same thing. The second thing I did after popping this code into the VBE (see point 6 for the first
thing I did) was to delete all of the empty lines and comment only
lines (see point 2) so I could see more of it at the same time.
Close, but not quite, even after closing the Immediate Window that
makes me feel naked if it isn't docked at the bottom of the VBE
window. The best way to address this is by splitting your
Sub intosmaller pieces that each address a more tightly focused aspect of
the problem. Extract functions until you have a set of small,
manageable pieces of code.
Performance and Excel specific:
Every time you type a
. between two tokens (except in a quoted literContext
StackExchange Code Review Q#124344, answer score: 5
Revisions (0)
No revisions yet.