HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Counting and Deleting Duplicate Data

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
deletingcountingduplicateanddata

Problem

Can anyone help me optimize the current VBA code below? It is taking 20 seconds currently to collect,store, check and validate 1000 data's.

This program takes specific values from multiple sheets and does addition as the values goes beyond one column or row so loop for addition. And uses 4 values to check for duplicate data and then if found it add's the total and prints in one row and deletes the duplicate. And finally Checking the value and alignment as well.

```
Option Explicit
Public s, c, r2, r, i, j, k, i3, i1, i2, r1, j1, j2, i4, r3, sum As Long
Public wh As String
Public ws, wd As Worksheet
Public Declare Function GetTickCount Lib "kernel32.dll" () As Long
Sub MainStart()
With Application
.ScreenUpdating = False
.EnableEvents = False
End With
Dim wsTest As Worksheet
Const strSheetName As String = "Report"

Set wsTest = Nothing
On Error Resume Next
Set wsTest = ActiveWorkbook.Worksheets(strSheetName)
On Error GoTo 0
If wsTest Is Nothing Then
Worksheets.Add.Name = strSheetName
End If
Application.ScreenUpdating = False
i1 = 1
j1 = 2
j2 = 2
i2 = 2
For Each ws In ThisWorkbook.Worksheets
If ws.Name <> "Report" And ws.Name <> "Pending Tasks" Then
With ws
wh = ws.Name
r = Worksheets(wh).Range("c5").SpecialCells(xlCellTypeLastCell).Row ' getting No of row in a sheet
r = r - 1 ' To remove the total row which may not be required
c = Worksheets(wh).Range("A13").SpecialCells(xlCellTypeLastCell).Column ' To get the no coloumn
c = c - 1
Cells(1, 1).Value = "Release"
Cells(1, 2).Value = "Project ID"
Cells(1, 3).Value = "Sub Category"
Cells(1, 4).Value = "ROM"
Cells(1, 5).Value = "Hours Spent"
Cells(1, 6).

Solution

As I alluded to in my comment, you have bigger concerns than performance right now. I don't know who said it first, but I've repeated it here many times.



  • Make it work.



  • Make it right.



  • Make it fast.




In that order.

Let's see what we can do to make this code right. I'll try to be gentle, but prepare yourself. This might be rough to hear at times.

You used Option Explicit. That's good. Unfortunately, that's more or less where the good ends.

Public s, c, r2, r, i, j, k, i3, i1, i2, r1, j1, j2, i4, r3, sum As Long


-
This doesn't do what you think it does. The only variable declared as a Long here is the very last one (sum). The rest are declared as Variants. Properly declaring them as Long types will remove some overhead.

Public s As Long, c As Long, r2 As Long ' etc.


but don't do that either. Make all of these Private and declare them on their own lines.

Private s As Long
Private c As Long


-
How is anyone, including yourself 6 months from now supposed to know what these variables represent? Naming is one of the hardest things in computer science, but it's also one of the most important. Variables should be descriptive enough to understand what data they hold at a glance and never should you number them. Never. If you think you need to number a variable, you likely need to restrict another variables scope, extract a function/sub, or add a loop.

r and c stand for row and column so what not just say so? You'll save yourself a ton of grief later. (I've not dug far enough into the code at this time to recommend better names for the others.)

On Error Resume Next
    Set wsTest = ActiveWorkbook.Worksheets(strSheetName)
On Error GoTo 0


This is not error handling. This is swallowing errors and telling your code "It's okay, just run with it. It's cool. Don't worry that we have the wrong worksheet..."

A proper error handler looks like this.

Public Sub Foo()
On Error GoTo ErrHandler

    ' do something that might fail

    Exit Sub

CleanExit:

    Application.EnableEvents = True
    Application.ScreenUpdating = True

ErrHandler:

    ' do something to actually handle the error and Resume Next OR

    ' if you can't recover then

    Resume CleanExit


This also doesn't do what you think it does and is a big bug waiting to peek up its ugly head.

With ws
    Cells(1, 1).Value = "Release"


An unqualified call to Cells will implicitly work on the ActiveSheet, whatever it may be. What I believe you meant to do is this.

With ws
    .Cells(1,1).Value = "Release"


Which is a really good reason to not abuse the With statement like you have here. Just explicitly call this on ws and be done with it.

ws.Cells(1,1).Value = "Release"
ws.Cells(1,2).Value = "Project ID"
' ...


Yes it's more typing. Get over it. It will save you headaches down the road. I promise.


wh = ws.Name

Okay, so wh is the source worksheet's name. Change wh accordingly.


r = Worksheets(wh).Range("c5").SpecialCells(xlCellTypeLastCell).Row

You already have a reference to that worksheet stored in ws (which will henceforth be referred to as sourceSheet by the way), there's no reason to get the worksheet from the collection via its name.

row = ws.Range("C5").SpecialCells(xlCellTypeLastCell).Row
row = row - 1                                              ' To remove the total row which may not be required


You can also save yourself an entire line of code by simply subtracting one from that value right now instead of waiting. Keep the comment. It's a good one, but don't try to right align your comments like this. The second it gets copy/pasted anywhere you'll lose the alignment and waste your time trying to line it back up.

row = ws.Range("C5").SpecialCells(xlCellTypeLastCell).Row - 1 ' subtract one to remove the total row


Okay, so here's an actual performance improvement for you. Copy/Paste is slow. Only use it if you're copying a big range of data all at once. Here you're copying one cell at a time to a new location.

For i = 5 To r
            Worksheets(wh).Cells(i, 1).Copy Destination:=Sheets("Report").Range("A" & j1)
            Worksheets(wh).Cells(i, 2).Copy Destination:=Sheets("Report").Range("b" & j1)
            Worksheets(wh).Cells(i, 6).Copy Destination:=Sheets("Report").Range("c" & j1)
            Worksheets(wh).Cells(i, 10).Copy Destination:=Sheets("Report").Range("d" & j1)
            Worksheets(wh).Cells(i, 4).Copy Destination:=Sheets("Report").Range("g" & j1)
            Call cal
            j1 = j1 + 1
        Next i


So, just set the destination's value instead.

With Worksheets("Report")
        .Range("A" & j1).Value = ws.Cells(i, 1).Value
        .Range("B" & j1).Value = ws.Cells(i, 2).Value
        .Range("C" & j1).Value = ws.Cells(i, 10).Value
        ' etc
    End With


While you're at it, limit the scope of the j1 and j variables. You have them declared at t

Code Snippets

Public s, c, r2, r, i, j, k, i3, i1, i2, r1, j1, j2, i4, r3, sum As Long
Public s As Long, c As Long, r2 As Long ' etc.
Private s As Long
Private c As Long
On Error Resume Next
    Set wsTest = ActiveWorkbook.Worksheets(strSheetName)
On Error GoTo 0
Public Sub Foo()
On Error GoTo ErrHandler

    ' do something that might fail

    Exit Sub

CleanExit:

    Application.EnableEvents = True
    Application.ScreenUpdating = True

ErrHandler:

    ' do something to actually handle the error and Resume Next OR

    ' if you can't recover then

    Resume CleanExit

Context

StackExchange Code Review Q#84182, answer score: 4

Revisions (0)

No revisions yet.