Parsing option symbols
option, parsing, symbols
Problem
I think I have this working, but I'm not sure it's completely accurate.
I'm trying to parse Option Symbols. An Option Symbol is made up of 4 parts:
- Root Symbol (up to 6 characters)
- Expiration Date (yymmdd)
- Option Type (1 character)
- Strike price (8 digits)
After parsing the following examples, the results should be as follows:
- C020216P00035000
  - Root Symbol = 'C'
  - Expiration Date = datetime.date(2002, 2, 16)
  - Option Type = P
  - Strike Price = int(00035000) x .001 = 35.00
- P020216C00040000
  - Root Symbol = 'P'
  - Expiration Date = datetime.date(2002, 2, 16)
  - Option Type = C
  - Strike Price = int(00040000) x .001 = 40.00
- SBC020216C00030000
  - Root Symbol = 'SBC'
  - Expiration Date = datetime.date(2002, 2, 16)
  - Option Type = C
  - Strike Price = int(00030000) x .001 = 30.00
I'm using the following code:
import re
import datetime as dt
opra_symbol = re.compile(r'(^[^0-9]+)').search(OPRA).group()
opra_expiry = dt.datetime.strptime(re.compile(r'\d{2}\d{2}\d{2}').search(OPRA).group(), '%y%m%d').date()
opra_cp = re.compile(r'([CP])').search(re.compile(r'([CP]\d+$)').search(OPRA).group()).group()
opra_price = int(re.compile(r'(\d+)$').search(OPRA).group()) * .001
Is this the best way of getting my results? I'm mostly concerned with the nested regex expression for the Option Type.
Solution
It looks like you can simplify it to a single regex:
matcher = re.compile(r'^(.+)([0-9]{6})([PC])([0-9]+)$')
groups = matcher.search(option).groups()
Then symbol, expiry, type and price are in groups[0], groups[1], groups[2] and groups[3] respectively. Calling .groups() matters here: on the match object itself, index 0 is the whole match rather than the first group. Note also that search returns None when the symbol does not match, so check for that before calling .groups(). The expiry is guaranteed to be in yymmdd format, hence the {6} qualifier. You may want to add a {8} length qualifier to the price.
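To show how the single-regex approach could be combined with the date and strike conversions from the question, here is a minimal sketch. The names OPTION_RE, OptionSymbol and parse_option_symbol are hypothetical, not from the original post. It assumes the root is 1 to 6 characters and the strike is exactly 8 digits, as the question describes, and it divides by 1000 instead of multiplying by .001 to sidestep the inexact float 0.001.

```python
import datetime as dt
import re
from typing import NamedTuple

# Assumed layout: root (1-6 chars), yymmdd expiry, C/P flag, 8-digit strike.
OPTION_RE = re.compile(
    r'(?P<root>.{1,6})(?P<expiry>[0-9]{6})(?P<kind>[PC])(?P<strike>[0-9]{8})'
)


class OptionSymbol(NamedTuple):
    root: str
    expiry: dt.date
    kind: str
    strike: float


def parse_option_symbol(symbol: str) -> OptionSymbol:
    """Split an option symbol into its four parts (hypothetical helper)."""
    # fullmatch requires the pattern to cover the entire string.
    match = OPTION_RE.fullmatch(symbol)
    if match is None:
        raise ValueError(f'not a valid option symbol: {symbol!r}')
    expiry = dt.datetime.strptime(match.group('expiry'), '%y%m%d').date()
    # The strike is stored in thousandths, as in the question.
    strike = int(match.group('strike')) / 1000
    return OptionSymbol(match.group('root'), expiry, match.group('kind'), strike)
```

With this sketch, parse_option_symbol('SBC020216C00030000').root is 'SBC', and a string that does not fit the layout raises ValueError instead of failing later on a None match.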
Context
StackExchange Code Review Q#121013, answer score: 6