HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Excel Retrieving Data from website through Internet Explorer

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
websiteexcelexplorerinternetthroughretrievingfromdata

Problem

I took a look at Speed up processing between VBA and IE, but I didn't see anything that addressed the speeding up of the retrieval of information via IE.

Since MS got rid of the stock retrieval database (MSN Money), there wasn't an easy way to get current stock prices. From what I gather, there are some add-ins available through the app store, but like usual, I'm stuck on 2007 doing this.

This was my attempt at a UDF for retrieving the stock price from http://www.msn.com/en-us/money/ which requires finding elements on the page and getting the value from within. It's.. not very elegant, but it works. What can I do to speed up the IE processes? I know, I know - screenupdating, etc

Option Explicit

Public Function GetTickerPrice(ByVal ticker As String) As Double

    Dim IE As Object
    Dim pageData As Object

    Set IE = CreateObject("InternetExplorer.Application")
    Dim URL As String
    URL = "http://www.msn.com/en-us/money/stockdetails/fi-126.1." & ticker & ".NAS?symbol=" & ticker & "=PRFIMQ"

    IE.Navigate URL

     Do Until IE.ReadyState >= 4
            DoEvents
     Loop

     Application.Wait Now + TimeSerial(0, 0, 5)

     Set pageData = IE.document

     GetTickerPrice = pageData.getElementsByClassName("precurrentvalue")(0).innertext

     Set IE = Nothing
     Set pageData = Nothing
End Function

Solution

Worksheet functions were not designed for this. User-defined worksheet functions were not made a feature of Excel to go and fetch data online - in my humble opinion, this is a terrible abuse of UDF's.

A function like this should be part of the definition of a class module that defines an object responsible for this task, and wrapped with an interface, say ITickerPriceProvider:

Public Function GetTickerPrice(ByVal ticker As String) As Double
End Function


Then your code moves into a class module with Implements ITickerPriceProvider, and the signature becomes this:

Option Explicit
Implements ITickerPriceProvider

Private Function ITickerPriceProvider_GetTickerPrice(ByVal ticker As String) As Double
    'your code
End Function


What gives? Now you can get fancy and write a macro that looks like this:

Public Sub UpdateTickerPrices()

    Dim provider As ITickerPriceProvider
    Set provider = New WebTickerPriceProvider

    Dim updater As TickerSheetUpdater
    Set updater = New TickerSheetUpdater

    updater.UpdateTickerPrices provider, TickerPricesSheet

End Sub


And the TickerSheetUpdater.UpdateTickerPrices method is where you go and implement the code that locates ticker strings (your worksheet must have a column with those, right?), fetches the prices for each one, and updates the worksheet.

This is a job for a macro, not a UDF.

As a bonus, the TickerSheetUpdater class can be unit-tested because the dependencies are under control - and now if somebody cuts the network cable, you can handle errors gracefully, instead of having 200 UDF calls blowing up one after another.

As for the implementation itself, I don't see any glaring issues. I would have named IE something like browser, URL would have been url, and there would have been an On Error GoTo CleanFail statement at the top, and a CleanExit label to ensure resources get cleaned up whether or not an error occurs:

Private Function ITickerPriceProvider_GetTickerPrice(ByVal ticker As String) As Double
    On Error GoTo CleanFail

    'implementation here

CleanExit:
    Set browser = Nothing
    Set pageData = Nothing
    Exit Function

CleanFail:
    'handle errors
    Resume CleanExit

End Function


Turning ScreenUpdating off isn't going to help much, at least not with a UDF approach - the function isn't updating any cell values, it is a cell value.

However I'd try to see if there wouldn't be a way to remove this:

Application.Wait Now + TimeSerial(0, 0, 5)


...or add a comment that explains why it's needed, and why it needs to be a whole 5 seconds - especially given you've just waited for a specific ReadyState on your IE object:

Do Until IE.ReadyState >= 4
        DoEvents
 Loop


What's 4 anyway? Make an Enum for the possible values, and use it instead of hard-coding a value like this. Why do you need to wait 5 seconds after the browser gets into that ReadyState? Could you keep loop-waiting for a later state value? An enum would help understanding what's going on and why here.

Code Snippets

Public Function GetTickerPrice(ByVal ticker As String) As Double
End Function
Option Explicit
Implements ITickerPriceProvider

Private Function ITickerPriceProvider_GetTickerPrice(ByVal ticker As String) As Double
    'your code
End Function
Public Sub UpdateTickerPrices()

    Dim provider As ITickerPriceProvider
    Set provider = New WebTickerPriceProvider

    Dim updater As TickerSheetUpdater
    Set updater = New TickerSheetUpdater

    updater.UpdateTickerPrices provider, TickerPricesSheet

End Sub
Private Function ITickerPriceProvider_GetTickerPrice(ByVal ticker As String) As Double
    On Error GoTo CleanFail

    'implementation here

CleanExit:
    Set browser = Nothing
    Set pageData = Nothing
    Exit Function

CleanFail:
    'handle errors
    Resume CleanExit

End Function
Application.Wait Now + TimeSerial(0, 0, 5)

Context

StackExchange Code Review Q#121685, answer score: 5

Revisions (0)

No revisions yet.