patternMinor
Most efficient multi-find/replace solution in Excel VBA
Viewed 0 times
multiexcelefficientreplacefindsolutionvbamost
Problem
I'm looking to improve the runtime and efficiency of my VBA code that performs multiple
Note: Some values/symbols did not show up correctly in this first piece of code
(blank values displayed in 'What:=""' and the ">" and "" and "
``
SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, _
ReplaceFormat:=False
'Opening Single Quote
Selection.Replace What:="", Replacement:="'", LookAt:=xlPart, _
SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, _
ReplaceFormat:=False
'Double Open Quotes
Selection.Replace What:="", Replacement:="""", LookAt:=xlPart, _
SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, _
ReplaceFormat:=False
'Double Closing Quotes
Selection.Replace What:="", Replacement:="""", LookAt:=xlPart, _
SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, _
ReplaceFormat:=False
'Dash
Selection.Replace What:="", Replacement:="-", LookAt:=xlPart, _
SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, _
ReplaceFormat:=False
'Registered Trademark (R)
Selection.Replace What:="®", Replacement:="(R)", LookAt:=xlPart, _
SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, _
ReplaceFormat:=False
'Trademark (TM)
Selection.Replace What:="", Replacement:="(TM)", LookAt:=xlPart, _
SearchOrder:=xlByRow
Replace operations on an Excel Spreadsheet. I started out with the following (please forgive me for this horrendous mess):Note: Some values/symbols did not show up correctly in this first piece of code
(blank values displayed in 'What:=""' and the ">" and "" and "
``
Private Sub symbolCheck()
Range("A2").Select
'Selects only to the end of all the data in the file to reduce processing time
Range(Selection, Selection.SpecialCells(xlLastCell)).Select
'Proceed with removing invalid symbols
'Apostrophe/Closing Single Quote
Selection.Replace What:="", Replacement:="'", LookAt:=xlPart, _
SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, _
ReplaceFormat:=False
'Apostrophe
Selection.Replace What:="", Replacement:="'", LookAt:=xlPart, _SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, _
ReplaceFormat:=False
'Opening Single Quote
Selection.Replace What:="", Replacement:="'", LookAt:=xlPart, _
SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, _
ReplaceFormat:=False
'Double Open Quotes
Selection.Replace What:="", Replacement:="""", LookAt:=xlPart, _
SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, _
ReplaceFormat:=False
'Double Closing Quotes
Selection.Replace What:="", Replacement:="""", LookAt:=xlPart, _
SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, _
ReplaceFormat:=False
'Dash
Selection.Replace What:="", Replacement:="-", LookAt:=xlPart, _
SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, _
ReplaceFormat:=False
'Registered Trademark (R)
Selection.Replace What:="®", Replacement:="(R)", LookAt:=xlPart, _
SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, _
ReplaceFormat:=False
'Trademark (TM)
Selection.Replace What:="", Replacement:="(TM)", LookAt:=xlPart, _
SearchOrder:=xlByRow
Solution
I don't know about the performance (looks like pretty intensive work anyway), but code-wise, it can definitely improve!
You're repeating the
Start with renaming
When you have a new symbol to replace, you just
The benefit here isn't one of performance, it's one of maintainability and readability: instead of reading like a macro-recorder script, your code starts reading more naturally, like actual code meant to be read and maintained, not just executed.
Notice I'm checking if the
So you are scanning something like 2,250,000 cells 20 times; it completing in 26-28 seconds means just a little more than 1 second per iteration is spent searching/replacing across 2.25 million cells: I wouldn't call that inefficient, but skipping no-op iterations that replace a value by the same value (looking at "" in particular), and avoiding calls to
Also, I wouldn't rely on
You're repeating the
.Replace call as many times as you have things to replace. Extract it into its own method, and separate the concerns of "knowing what to look for" and "replacing stuff".Start with renaming
r1 to a meaningful name, adding Option Explicit at the top of your module, and referencing the Microsoft Scripting Runtime library. Then do this:Dim searchRange As Range
Set searchRange = ActiveSheet.UsedRange
Dim replacements As Dictionary
Set replacements = New Dictionary
With replacements
.Add "’", "'"
.Add "`", "'"
.Add "‘", "'"
.Add "“", """"
.Add "”", """"
.Add "–", "-"
.Add "®", "(R)"
.Add "™", "(TM)"
.Add "°", " degrees"
.Add "×", "x"
.Add "¿", vbNullString
.Add "•", vbNullString
.Add "…", "..."
.Add "€", vbNullString
.Add "|", ","
.Add "", ">"
.Add "½", " 1/2"
.Add "¾", " 3/4"
.Add "¼", " 1/4"
End WithWhen you have a new symbol to replace, you just
.Add it and its replacement to that Dictionary, and now you have a single data structure to iterate, contrary to the "dual array" suggestion you've come across - whenever you have two arrays with indices lined up and a requirement for them to "keep in sync", you're using the wrong data structure.The benefit here isn't one of performance, it's one of maintainability and readability: instead of reading like a macro-recorder script, your code starts reading more naturally, like actual code meant to be read and maintained, not just executed.
Dim key As String
Dim value As String
For Each key In replacements.Keys
value = replacements(key)
If Asc(key) <> Asc(value) Then
searchRange.Replace What:=key, Replacement:=value, LookAt:=xlPart
Else
Debug.Print "Extraneous key '" & key & "' could be removed from dictionary."
End If
NextNotice I'm checking if the
key matches the value, because I noticed you have a number of superfluous replacements there, doing work for nothing.So you are scanning something like 2,250,000 cells 20 times; it completing in 26-28 seconds means just a little more than 1 second per iteration is spent searching/replacing across 2.25 million cells: I wouldn't call that inefficient, but skipping no-op iterations that replace a value by the same value (looking at "" in particular), and avoiding calls to
Chr(), could possibly trim another 1-3 seconds from it.Also, I wouldn't rely on
UsedRange, as it tends to not always match the range that you're interested in: if that's the case, you could skip a few thousand iterations and shave off another couple of seconds by working with the actually used range. See this SO post.Code Snippets
Dim searchRange As Range
Set searchRange = ActiveSheet.UsedRange
Dim replacements As Dictionary
Set replacements = New Dictionary
With replacements
.Add "’", "'"
.Add "`", "'"
.Add "‘", "'"
.Add "“", """"
.Add "”", """"
.Add "–", "-"
.Add "®", "(R)"
.Add "™", "(TM)"
.Add "°", " degrees"
.Add "×", "x"
.Add "¿", vbNullString
.Add "•", vbNullString
.Add "…", "..."
.Add "€", vbNullString
.Add "|", ","
.Add "<", "<"
.Add ">", ">"
.Add "½", " 1/2"
.Add "¾", " 3/4"
.Add "¼", " 1/4"
End WithDim key As String
Dim value As String
For Each key In replacements.Keys
value = replacements(key)
If Asc(key) <> Asc(value) Then
searchRange.Replace What:=key, Replacement:=value, LookAt:=xlPart
Else
Debug.Print "Extraneous key '" & key & "' could be removed from dictionary."
End If
NextContext
StackExchange Code Review Q#120202, answer score: 6
Revisions (0)
No revisions yet.