patternMinor
Web scraping VBA and VB Script
Viewed 0 times
scriptscrapingwebvbaand
Problem
I am working on a project on VBA where the objective is to have a "program" that fetches rates from a website called X-Rates, and outputs to excel the monthly averages of a chosen country.
Initially I was doing simple XMLHTTP requests and output to Excel the results.
But now I have tried to develop a "pseudo" multithread excel web scraper, based on Daniel Ferry's article on excelhero.com
Since I don't know much about VBScript, I think there's enough room for improvement, so I ask please review my code!
Apologies for long code!
Global Variables:
Format the sheet on workbook:
Add headers:
Clear & Check for contents in resu
Initially I was doing simple XMLHTTP requests and output to Excel the results.
But now I have tried to develop a "pseudo" multithread excel web scraper, based on Daniel Ferry's article on excelhero.com
Since I don't know much about VBScript, I think there's enough room for improvement, so I ask please review my code!
Apologies for long code!
Global Variables:
Option Explicit
Public Const baseURLpart1 As String = "http://www.x-rates.com/average/?from="
Public Const baseURLpart2 As String = "&to="
Public Const baseURLpart3 As String = "&amount=1&year="
Public Const ARS As String = "ARS"
Public Const AUD As String = "AUD"
Public Const BRL As String = "BRL"
Public Const CNY As String = "CNY"
Public Const EUR As String = "EUR"
Public Const GBP As String = "GBP"
Public Const JPY As String = "JPY"
Public Const MXN As String = "MXN"
Public Const USD As String = "USD"Format the sheet on workbook:
Private Sub FormatResultSheet()
'We will center the cells to give a better readability of results and format as text to keep all zeros . Ex: 1.000000
Dim TargetRange As Range
Set TargetRange = ResultSheet.Range("A:F")
TargetRange.HorizontalAlignment = xlCenter
TargetRange.NumberFormat = "@"
End SubAdd headers:
Private Sub AddHeader()
'Header cells exist to represent what values are extracted in what columns and are "styled" to stand out for better readability
Dim arr(1 To 6) As String
arr(1) = "Year"
arr(2) = "OffSetCurr"
arr(3) = "Month"
arr(4) = "toEuro"
arr(5) = "toDollars"
arr(6) = "toPounds"
ResultSheet.Range("A1:F1") = arr()
With ResultSheet
.Range("A1", "F1").Style = "Input"
.Range("A1", "F1").Font.Bold = True
End With
End SubClear & Check for contents in resu
Solution
I think the URL "parts" could be better defined:
You're only ever using them in one single place. Consider reducing the scope to the
Then instead of concatenating parts, you replace markers with their value:
I took the liberty to rename the hard-to-read
Notice we're both using a Hungarian Notation here - except you're using it to identify the type of variables (
I see you're using a lot of
Actually, this looks like a job for a
You're assuming that the
The
I see you're jumping around:
Why? First, when you have two conditions that end up with the same identical result, you should be combining them; and then, when there's only one single instruction in an
But that doesn't fix the jumping around - you don't need any
It's not clear why you have this huge chunk of copy+pasta'd code near the bottom of the procedure either; VBA isn't VBScript, it doesn't have to look like a script - meaning, you can (should!) split the functionality into smaller procedures that do as little as possible! DRY / Don't Repeat Yourself!
Just noticed this loop:
```
For Each sourceCurrency In Array(EUR, USD, GBP)
If sourceCurrency = USD Then
rowNumber = 2
columnNumber = columnNumber + 1
End If
If sourceCurrency = GBP Then
rowNumber = 2
columnNumber = columnNumber + 1
End If
Call CreateVBAgentCode((rowNumber), (columnNumber), (ARS), (sourceCurrency), (sYear), (agentNumber))
agentNumber = agentNumber + 1
rowNumber = rowNumber + 12
Call CreateVBAgentCode((rowNumber), (columnNumber), (AUD), (sourceCurrency), (sYear), (agentNumber))
agentNumber = agentNumber + 1
rowNumber = rowNumber + 12
Call CreateVBAgentCode((rowNumber), (columnNumber), (BRL), (sourceCurrency), (sYear), (agentNumber))
agentNumber = agentNumber + 1
rowNumber = rowNumber + 12
Call CreateVBAgentCode((rowNumber), (columnNumber), (CNY), (sourceCurrency), (sYear), (agentNumber))
agentNumber = agentNumber + 1
rowNumber = rowNumber + 12
Call CreateVBAgentCode((rowNumber), (columnNumber), (EUR), (sourceCurrency), (sYear), (agentNumber))
agentNumber = agentNumber + 1
rowNumber = rowNumber + 12
Call CreateVBAgentCode((rowNumber), (columnNumber), (GBP), (sourceCurrency), (sYear), (agentNumber))
agentNumber = agentNumber + 1
rowNumber
Public Const baseURLpart1 As String = "http://www.x-rates.com/average/?from="
Public Const baseURLpart2 As String = "&to="
Public Const baseURLpart3 As String = "&amount=1&year="You're only ever using them in one single place. Consider reducing the scope to the
CreateVBAgentCode procedure, and making the URL a single, templated string, perhaps similar to this:Const urlTemplate As String = "http://www.x-rates.com/average/?from=%FROM%&to=%TO%&amount=%AMOUNT%&year=%YEAR%"Then instead of concatenating parts, you replace markers with their value:
Dim url As String
url = Replace(urlTemplate, "%FROM%", queryFromCode)
url = Replace(url, "%TO%", queryToCode)
url = Replace(url, "%AMOUNT%", 1)
url = Replace(url, "%YEAR%", queryYear)I took the liberty to rename the hard-to-read
sFromCrcy, sToCrcy and sYear variables with more meaningful and pronounceable names. Best avoid disemvoweling identifiers too: I think Crcy stands for "Currency" (haven't looked at the call site yet), but then knowing it's a currency code I'd just go with Code and call it a day.Notice we're both using a Hungarian Notation here - except you're using it to identify the type of variables (
s => String, right?), which is useless at best, and irritating at worst. I've prefixed all these parameters with query, to indicate that they're being used as part of a query string - and that provides much more value than "hey look, that thing you have a 95% chance of misspelling every time you refer to it, is a string!".I see you're using a lot of
vbCrLf, which is Windows line endings - I would prefer vbNewLine instead, which is OS-sensitive and will work just as well on a Mac.Actually, this looks like a job for a
StringBuilder - consider using this implementation (make sure you read the reviews, too!), so instead of constantly concatenating s, your code could look like this (note, the StringBuilder has no AppendLine method, but you could easily add one):Dim script As String
With New StringBuilder
.AppendLine "Dim oXML, oXL, rowNumber, columnNumber, sContent, i, mtchCnt, subMtchCnt, sResults, sURL"
.AppendLine "Dim vResults(9)"
.AppendLine
.AppendLine
.AppendLine
.AppendLine "' Setup variables"
.AppendLine "rowNumber = " & rowNumber
.AppendLine "columnNumber = " & columnNumber
'...
script = .ToString
End WithYou're assuming that the
\Users folder is under the C: drive. That's usually not a bad assumption to make, but you could be using this instead:path = Environ$("TEMP") & "\SwarmAgent_" & agentNumber & ".vbs"The
LOCALAPPDATA environment variable returns the full path of the \Users\{username}\AppData\Local folder, and the TEMP environment variable returns that \Temp folder: no need to hard-code any part of it.I see you're jumping around:
If sToCrcy = USD Then
GoTo WriteFiles
End If
If sToCrcy = GBP Then
GoTo WriteFiles
End IfWhy? First, when you have two conditions that end up with the same identical result, you should be combining them; and then, when there's only one single instruction in an
If block, you can inline it:If sToCrcy = USD Or sToCrcy = GBP Then GoTo WriteFilesBut that doesn't fix the jumping around - you don't need any
GoTo jumps.It's not clear why you have this huge chunk of copy+pasta'd code near the bottom of the procedure either; VBA isn't VBScript, it doesn't have to look like a script - meaning, you can (should!) split the functionality into smaller procedures that do as little as possible! DRY / Don't Repeat Yourself!
Just noticed this loop:
```
For Each sourceCurrency In Array(EUR, USD, GBP)
If sourceCurrency = USD Then
rowNumber = 2
columnNumber = columnNumber + 1
End If
If sourceCurrency = GBP Then
rowNumber = 2
columnNumber = columnNumber + 1
End If
Call CreateVBAgentCode((rowNumber), (columnNumber), (ARS), (sourceCurrency), (sYear), (agentNumber))
agentNumber = agentNumber + 1
rowNumber = rowNumber + 12
Call CreateVBAgentCode((rowNumber), (columnNumber), (AUD), (sourceCurrency), (sYear), (agentNumber))
agentNumber = agentNumber + 1
rowNumber = rowNumber + 12
Call CreateVBAgentCode((rowNumber), (columnNumber), (BRL), (sourceCurrency), (sYear), (agentNumber))
agentNumber = agentNumber + 1
rowNumber = rowNumber + 12
Call CreateVBAgentCode((rowNumber), (columnNumber), (CNY), (sourceCurrency), (sYear), (agentNumber))
agentNumber = agentNumber + 1
rowNumber = rowNumber + 12
Call CreateVBAgentCode((rowNumber), (columnNumber), (EUR), (sourceCurrency), (sYear), (agentNumber))
agentNumber = agentNumber + 1
rowNumber = rowNumber + 12
Call CreateVBAgentCode((rowNumber), (columnNumber), (GBP), (sourceCurrency), (sYear), (agentNumber))
agentNumber = agentNumber + 1
rowNumber
Code Snippets
Public Const baseURLpart1 As String = "http://www.x-rates.com/average/?from="
Public Const baseURLpart2 As String = "&to="
Public Const baseURLpart3 As String = "&amount=1&year="Const urlTemplate As String = "http://www.x-rates.com/average/?from=%FROM%&to=%TO%&amount=%AMOUNT%&year=%YEAR%"Dim url As String
url = Replace(urlTemplate, "%FROM%", queryFromCode)
url = Replace(url, "%TO%", queryToCode)
url = Replace(url, "%AMOUNT%", 1)
url = Replace(url, "%YEAR%", queryYear)Dim script As String
With New StringBuilder
.AppendLine "Dim oXML, oXL, rowNumber, columnNumber, sContent, i, mtchCnt, subMtchCnt, sResults, sURL"
.AppendLine "Dim vResults(9)"
.AppendLine
.AppendLine
.AppendLine
.AppendLine "' Setup variables"
.AppendLine "rowNumber = " & rowNumber
.AppendLine "columnNumber = " & columnNumber
'...
script = .ToString
End Withpath = Environ$("TEMP") & "\SwarmAgent_" & agentNumber & ".vbs"Context
StackExchange Code Review Q#135513, answer score: 5
Revisions (0)
No revisions yet.