HiveBrain v1.2.0
Get Started
← Back to all entries
patternpythonflaskMinor

Catching API changes for a service that doesn't version its API

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
versioncatchingitsservicechangesforthatdoesnapi

Problem

I'm using the old Google Data API to access several spreadsheets and put their contents in a database using Flask and SQLAlchemy. Each run deletes the contents of the database tables and repopulates them with results of the API call.

I'm trying to detect API changes and abort the operation to prevent leaving the tables empty or filling them with garbage. I do this first by checking the status code of the API request. If it's anything other than 200 I abort the operation. To catch when the URL is valid but the JSON result has a new format, I catch KeyErrors when parsing the JSON.

Each time one of these conditions occurs I rollback the database session and exit the function with 1.

I'm worried that my detection coverage is spotty and I'm left feeling there should be a better way...

```
def get_google_content(key, db_table):
if DEBUG:
print 'Fetching content from Google Drive'
spreadsheet_url = ('https://spreadsheets.google.com/feeds/worksheets/'
'{}/public/basic?alt=json'.format(key))
spreadsheet_req = requests.get(spreadsheet_url)
if spreadsheet_req.status_code != 200:
print 'Failed to retrieve spreadsheet from {}'.format(spreadsheet_url)
print 'Google Spreadsheet returned {}'.format(
spreadsheet_req.status_code)
db.session.rollback()
return 1

sheet_ids = list(OrderedDict.fromkeys(re.findall(r'/public/basic/(\w*)',
spreadsheet_req.text, flags=0)))

# The only trivial way to catch removed spreadsheet data is to delete the
# table and fill it all out again
db.session.query(db_table).delete()

for sheet_id in sheet_ids:
worksheet_url = ('https://spreadsheets.google.com/feeds/list/{}/{}/'
'public/values?alt=json'.format(key, sheet_id))
worksheet_req = requests.get(worksheet_url)
if worksheet_req.status_code != 200:
print 'Failed to retrieve spreadsheet from {}'.for

Solution

Some comments:

  • Instead of using print statements, the logging module could be used



  • The return value of get_google_content should be a boolean value (True on success, False on failure)



Regarding ways to make the refresh_content method a little bit more pythonic, I would probably use a for loop and the all built-in function to make sure that all the calls to get_google_content were successful.

@app.route('/admin/refresh_content')
def refresh_content():
    spreadsheet_keys = ['key_1', 'key_2', 'key_3']
    tables = [Table1, Table2, Table3]
    sheet_statuses = [
      get_google_content(spreadsheet_key, table)
      for spreadsheet_key, table
      in zip(spreadsheet_keys, tables)
    ]

    if all(sheet_statuses):
        return 'SuccessUpdated spreadsheets'
    else:
        return 'FailureUnable to update all spreadsheets'


There are other strategies, but that depends on different things like whether you want to stop on the first failure or not.

I hope this helps.

Code Snippets

@app.route('/admin/refresh_content')
def refresh_content():
    spreadsheet_keys = ['key_1', 'key_2', 'key_3']
    tables = [Table1, Table2, Table3]
    sheet_statuses = [
      get_google_content(spreadsheet_key, table)
      for spreadsheet_key, table
      in zip(spreadsheet_keys, tables)
    ]

    if all(sheet_statuses):
        return '<h1>Success</h1><br>Updated spreadsheets'
    else:
        return '<h1>Failure</h1><br>Unable to update all spreadsheets'

Context

StackExchange Code Review Q#58143, answer score: 2

Revisions (0)

No revisions yet.