Convert international datestring to ISO-format
Problem
The function below takes a date string in the form d-mmm-yyyy and converts it to ISO date format (yyyy-mm-dd). Delimiters may be a hyphen, a space or /, and Dutch or English month abbreviations may be used.

Now, I know there is dateutil, but it returns "unknown string format" if you try to parse something with a non-English month in it. I haven't digested all its documentation, but I think dateutil is mainly intended for date calculation. I'm not doing that, I'm just cleaning up user input. So I wrote my own.

    import re
    # ...

    def ISOdate(date):
        '''
        converts the following date string format to ISO (yyyy-mm-dd):
            28-okt-1924   (dutch month abbreviations)
            28 oct 1924   (english..)
            9/nov/2012    (single digit)
        '''
        shortmonths = [
            'jan', 'feb', 'mrt', 'apr', 'mei', 'jun',
            'jul', 'aug', 'sep', 'okt', 'nov', 'dec',
            'jan', 'feb', 'mar', 'apr', 'may', 'jun',
            'jul', 'aug', 'sep', 'oct', 'nov', 'dec'
        ]
        # Month abbreviations differ only for March, May and October.
        pat = r'(\d{1,2})\s?[-\/]?\s?(\w{3})\s?[-\/]?\s?(\d{4})'
        q = re.match(pat, date)
        if q:
            year = q.group(3)
            day = int(q.group(1))
            # The list holds Dutch then English names, so index % 12 + 1
            # gives the month number for either language.
            month = shortmonths.index(q.group(2).lower()) % 12 + 1
            return u'{}-{:02d}-{:02d}'.format(year, month, day)
        else:
            # just return input, date fields may be empty
            return date

The regex match that parses day, month and year may not be pretty, but it works and it's easy to expand for other patterns. Likewise for the month number lookup with
`index`, which is more concise than a chain of `if`/`elif` tests to match month strings to numbers.

Instead of the code between `if q:` and `else:`, I also had this, which uses `datetime` (and needs `import datetime`):

    year = int(q.group(3))
    day = int(q.group(1))
    month = shortmonths.index(q.group(2).lower()) % 12 + 1
    d = datetime.datetime(year, month, day)
    return u'{:%Y-%m-%d}'.format(d)

This works too, but
Solution
Neat stuff.

Suggestions:

- Change `shortmonths` to a dictionary. This pairs numerical months with their alphabetical abbreviations, so there is no need to repeat 'jan', for example, as you do now.
- Pythonic: unpack day, month and year in a one-liner.
- Use `datetime`'s `strftime` to format dates. It makes life easier in case you want to change the format down the road.

    import re
    import datetime

    def ISOdate(date):
        # Keys are zero-padded month numbers; values hold the English
        # abbreviation and, where it differs, the Dutch one.
        month_d = {'01': 'jan',
                   '02': 'feb',
                   '03': ['mar', 'mrt'],
                   '04': 'apr',
                   '05': ['may', 'mei'],
                   '06': 'jun',
                   '07': 'jul',
                   '08': 'aug',
                   '09': 'sep',
                   '10': ['oct', 'okt'],
                   '11': 'nov',
                   '12': 'dec'
                   }
        pat = r'(\d{1,2})\s?[-\/]?\s?(\w{3})\s?[-\/]?\s?(\d{4})'
        q = re.match(pat, date)
        if q:
            day, month, year = [q.group(idx+1) for idx in range(3)]
            if month.isalpha():  # change from letters to numbers
                # lower() keeps the lookup case-insensitive, as in the question
                month = [k for k, v in month_d.items() if month.lower() in v][0]
            out_date = datetime.date(int(year), int(month), int(day))
            return out_date.strftime('%Y-%m-%d')
        else:
            # no match: return the input unchanged
            return date
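
As a possible variation on the dictionary suggestion (a sketch, not part of the code above), the lookup can be turned around so that each abbreviation maps straight to its month number. That avoids mixing strings and lists as values and removes the reverse search over `items()`. The names `MONTHS`, `DATE_PAT` and `iso_date` are made up for this example, and returning the input unchanged for an unknown abbreviation or an impossible date is an assumption about the desired behaviour, not something the original code does.

```python
import datetime
import re

# Hypothetical flat lookup: abbreviation -> month number (Dutch and English).
MONTHS = {
    'jan': 1, 'feb': 2, 'mrt': 3, 'mar': 3, 'apr': 4, 'mei': 5, 'may': 5,
    'jun': 6, 'jul': 7, 'aug': 8, 'sep': 9, 'okt': 10, 'oct': 10,
    'nov': 11, 'dec': 12,
}

# Same pattern as in the question, compiled once at import time.
DATE_PAT = re.compile(r'(\d{1,2})\s?[-\/]?\s?(\w{3})\s?[-\/]?\s?(\d{4})')


def iso_date(text):
    """Return text as yyyy-mm-dd if it parses, otherwise return it unchanged."""
    match = DATE_PAT.match(text)
    if not match:
        return text
    day, month, year = match.groups()
    month_number = MONTHS.get(month.lower())
    if month_number is None:
        # Assumption: an unknown abbreviation is left alone rather than raising.
        return text
    try:
        parsed = datetime.date(int(year), month_number, int(day))
    except ValueError:
        # Assumption: impossible dates (e.g. 31 feb) are also left alone.
        return text
    return parsed.isoformat()


print(iso_date('28-okt-1924'))
print(iso_date('9/nov/2012'))
```

Because `datetime.date` validates the day and month, this version also catches input such as `31-feb-2012`, which the string-formatting approach would pass through as `2012-02-31`.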
Context
StackExchange Code Review Q#57841, answer score: 2