Rewrite Amazon s3 key
Problem
I've created a function that rewrites the key or "path" of an object in s3.
By default, Amazon Web Services Firehose writes to s3 in the format of YYYY/MM/DD/HH/foo.json. We have an AWS Lambda function listening for putObjects on l1/source/event_type/fh/, and when a new file is added to s3, the Lambda is invoked and the key or 'path' to that file is rewritten to a flat structure of l1/source/event_type/daily/dt=YYYY-MM-DD/foo.json (yes, I purposefully left off the fh and HH paths).

input key:
l1/source/event_type/fh/YYYY/MM/DD/HH/foo.json

output key:
l1/source/event_type/daily/dt=YYYY-MM-DD/foo.json

    import re

    def create_date_parition_from_key(key):
        '''creates new date parition prefix
        '''
        try:
            key_split = re.split(r'(/\d{4})', key)
            start_path = (key_split[0].split('/')[0], key_split[0].split('/')[1])
            remove_fh_path = '/'.join(start_path)
            default = key_split[2].split('/')
            year = key_split[1][1:]
            s3_prefix = remove_fh_path + '/'  # /l1/foo/bar/baz/
            date_partion = ('daily/dt=' +
                            '-'.join([year, default[1], default[2]]) +
                            '/')  # dt=YYYY-MM-DD/
            file_name = default[-1]  # foo.json
            new_key = s3_prefix + date_partion + file_name
            print('New partition key created: {}.'.format(new_key))
            return new_key
        except Exception as ex:
            print(ex)
            print('Error paritioning key {}.'.format(key))
            raise ex

I'm newer to python and am just looking for ways to improve my code as it seems fragile.
EDIT: The input key can vary in its number of paths:
l1/source/event_type/fh/YYYY/MM/DD/HH/foo.json
l1/app/source/event_type/fh/YYYY/MM/DD/HH/foo.json
l1/event_type/fh/YYYY/MM/DD/HH/foo.json
Solution
Simplicity
Tuple unpacking and str.format can simplify the function considerably:

    def create_date_parition_from_key(key):
        a, b, c, _, year, month, day, _, name = key.split('/')
        return "{}/{}/{}/daily/dt={}-{}-{}/{}".format(
            a, b, c, year, month, day, name)

If the input key can vary in its number of paths:

    def create_date_parition_from_key(key):
        *_, year, month, day, _, name = key.split('/')
        return "{}/daily/dt={}-{}-{}/{}".format(
            key[:key.index("/fh/")], year, month, day, name)

Exception handling

    try:
        # Things
    except Exception as ex:
        print(ex)
        print('Error paritioning key {}.'.format(key))
        raise ex

This is against re-use. I cannot re-use this function because it inevitably prints to the console on error; even if I handle the exception it throws with try - except, I have no way of suppressing that output.

This kind of error notification should be done in the main function, which has exactly the job of communicating the errors and successes of the other functions to the end user.

Finally, Exception is too vague and too much code is inside the try block. Please reduce the code inside try as much as possible and specify a precise Exception kind.

I just removed all exception handling in my version because, in my opinion, it was only making the code worse and there was no need for it.
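To illustrate the two points above (a precise exception kind, and reporting left to the caller), here is a minimal sketch. It reuses the variable-depth version of create_date_parition_from_key from this answer; rewrite_keys is a hypothetical caller invented for this example, and the dates in the usage note are made up. It relies on the fact that both failure modes of that function, too few path segments for the unpacking and a missing "/fh/" for str.index, raise ValueError.

```python
def create_date_parition_from_key(key):
    """Rewrite <prefix>/fh/YYYY/MM/DD/HH/<name> to <prefix>/daily/dt=YYYY-MM-DD/<name>."""
    # No printing and no try/except here: a malformed key simply raises
    # ValueError (from the unpacking or from str.index) to the caller.
    *_, year, month, day, _, name = key.split('/')
    return "{}/daily/dt={}-{}-{}/{}".format(
        key[:key.index("/fh/")], year, month, day, name)


def rewrite_keys(keys):
    """Hypothetical caller: the only place that decides how failures are handled."""
    rewritten, failed = [], []
    for key in keys:
        try:
            # Only the call that can fail sits inside the try block.
            new_key = create_date_parition_from_key(key)
        except ValueError:
            failed.append(key)
        else:
            rewritten.append(new_key)
    return rewritten, failed
```

A caller such as a Lambda handler could then log or print the failed keys however it sees fit, for example:

```python
rewritten, failed = rewrite_keys([
    "l1/app/source/event_type/fh/2016/09/23/14/foo.json",
    "l1/not-a-firehose-key/foo.json",
])
print("rewritten:", rewritten)
print("could not partition:", failed)
```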
Code Snippets
    def create_date_parition_from_key(key):
        a, b, c, _, year, month, day, _, name = key.split('/')
        return "{}/{}/{}/daily/dt={}-{}-{}/{}".format(
            a, b, c, year, month, day, name)

    def create_date_parition_from_key(key):
        *_, year, month, day, _, name = key.split('/')
        return "{}/daily/dt={}-{}-{}/{}".format(
            key[:key.index("/fh/")], year, month, day, name)

    try:
        # Things
    except Exception as ex:
        print(ex)
        print('Error paritioning key {}.'.format(key))
        raise ex

Context
StackExchange Code Review Q#142263, answer score: 4