patternpythonflaskMinor
Catching API changes for a service that doesn't version its API
Viewed 0 times
versioncatchingitsservicechangesforthatdoesnapi
Problem
I'm using the old Google Data API to access several spreadsheets and put their contents in a database using Flask and SQLAlchemy. Each run deletes the contents of the database tables and repopulates them with results of the API call.
I'm trying to detect API changes and abort the operation to prevent leaving the tables empty or filling them with garbage. I do this first by checking the status code of the API request. If it's anything other than 200 I abort the operation. To catch when the URL is valid but the JSON result has a new format, I catch
Each time one of these conditions occurs I rollback the database session and exit the function with
I'm worried that my detection coverage is spotty and I'm left feeling there should be a better way...
```
def get_google_content(key, db_table):
if DEBUG:
print 'Fetching content from Google Drive'
spreadsheet_url = ('https://spreadsheets.google.com/feeds/worksheets/'
'{}/public/basic?alt=json'.format(key))
spreadsheet_req = requests.get(spreadsheet_url)
if spreadsheet_req.status_code != 200:
print 'Failed to retrieve spreadsheet from {}'.format(spreadsheet_url)
print 'Google Spreadsheet returned {}'.format(
spreadsheet_req.status_code)
db.session.rollback()
return 1
sheet_ids = list(OrderedDict.fromkeys(re.findall(r'/public/basic/(\w*)',
spreadsheet_req.text, flags=0)))
# The only trivial way to catch removed spreadsheet data is to delete the
# table and fill it all out again
db.session.query(db_table).delete()
for sheet_id in sheet_ids:
worksheet_url = ('https://spreadsheets.google.com/feeds/list/{}/{}/'
'public/values?alt=json'.format(key, sheet_id))
worksheet_req = requests.get(worksheet_url)
if worksheet_req.status_code != 200:
print 'Failed to retrieve spreadsheet from {}'.for
I'm trying to detect API changes and abort the operation to prevent leaving the tables empty or filling them with garbage. I do this first by checking the status code of the API request. If it's anything other than 200 I abort the operation. To catch when the URL is valid but the JSON result has a new format, I catch
KeyErrors when parsing the JSON.Each time one of these conditions occurs I rollback the database session and exit the function with
1.I'm worried that my detection coverage is spotty and I'm left feeling there should be a better way...
```
def get_google_content(key, db_table):
if DEBUG:
print 'Fetching content from Google Drive'
spreadsheet_url = ('https://spreadsheets.google.com/feeds/worksheets/'
'{}/public/basic?alt=json'.format(key))
spreadsheet_req = requests.get(spreadsheet_url)
if spreadsheet_req.status_code != 200:
print 'Failed to retrieve spreadsheet from {}'.format(spreadsheet_url)
print 'Google Spreadsheet returned {}'.format(
spreadsheet_req.status_code)
db.session.rollback()
return 1
sheet_ids = list(OrderedDict.fromkeys(re.findall(r'/public/basic/(\w*)',
spreadsheet_req.text, flags=0)))
# The only trivial way to catch removed spreadsheet data is to delete the
# table and fill it all out again
db.session.query(db_table).delete()
for sheet_id in sheet_ids:
worksheet_url = ('https://spreadsheets.google.com/feeds/list/{}/{}/'
'public/values?alt=json'.format(key, sheet_id))
worksheet_req = requests.get(worksheet_url)
if worksheet_req.status_code != 200:
print 'Failed to retrieve spreadsheet from {}'.for
Solution
Some comments:
Regarding ways to make the
There are other strategies, but that depends on different things like whether you want to stop on the first failure or not.
I hope this helps.
- Instead of using
printstatements, theloggingmodule could be used
- The return value of
get_google_contentshould be a boolean value (Trueon success,Falseon failure)
Regarding ways to make the
refresh_content method a little bit more pythonic, I would probably use a for loop and the all built-in function to make sure that all the calls to get_google_content were successful.@app.route('/admin/refresh_content')
def refresh_content():
spreadsheet_keys = ['key_1', 'key_2', 'key_3']
tables = [Table1, Table2, Table3]
sheet_statuses = [
get_google_content(spreadsheet_key, table)
for spreadsheet_key, table
in zip(spreadsheet_keys, tables)
]
if all(sheet_statuses):
return 'SuccessUpdated spreadsheets'
else:
return 'FailureUnable to update all spreadsheets'There are other strategies, but that depends on different things like whether you want to stop on the first failure or not.
I hope this helps.
Code Snippets
@app.route('/admin/refresh_content')
def refresh_content():
spreadsheet_keys = ['key_1', 'key_2', 'key_3']
tables = [Table1, Table2, Table3]
sheet_statuses = [
get_google_content(spreadsheet_key, table)
for spreadsheet_key, table
in zip(spreadsheet_keys, tables)
]
if all(sheet_statuses):
return '<h1>Success</h1><br>Updated spreadsheets'
else:
return '<h1>Failure</h1><br>Unable to update all spreadsheets'Context
StackExchange Code Review Q#58143, answer score: 2
Revisions (0)
No revisions yet.