Regex for finding After Effects files in scene code format
Problem
I haven't used regex a lot and I needed to set up a script that can gather a list of file paths that should adhere to a strict formatting convention, so I thought that sounded like a good opportunity to use them.
To explain a bit, there's a set of sequence folders inside a root folder, and in those sequences is a set of scene folders. In those is a set of constantly named folders, the relevant one being "AEP Files". In there is a set of After Effects files, of which I want to get the highest-numbered version, which is denoted by `_v##.aep` at the end of the file name. A sample path might look like this:
`P:\ProjectName\Scenes\e10_q04\e10_q04_s12\AEP Files\proj_e10_q04_s12_v04.aep`
I'm particularly interested to know:
- If I'm using regex correctly,
- Whether I should use something other than `if re.match(...)`,
- Whether I could make it more efficient (particularly the list comprehensions); and
- Given the complexity, how is the current readability?
I had ideas on more efficiency but they involved collapsing list comprehensions down and I thought that would make an unreadable mess, so I opted not to. Any other feedback you want to give is also welcome!
```
import os
import re

root = r'P:\ProjectName\Scenes'
IGNORE = re.IGNORECASE

folders = [os.path.join(root, f) for f in os.listdir(root)
           if re.match(r'e\dq\d', f, IGNORE)]
folders.sort()

scene_folders = [os.path.join(folder, f) for folder in folders
                 for f in os.listdir(folder)
                 if re.match(r'e\d_q\d_s\d*', f, IGNORE)]
scene_folders.sort()

scenes = []
missing_scenes = []
for folder in scene_folders:
    matches = [re.match(r'proj_q\d_s\d_v(\d*)\.aep', f, IGNORE)
               for f in os.listdir(os.path.join(folder, "AEP Files"))
               if re.match(r'proj_q\d_s\d_v(\d*)\.aep', f, IGNORE)]
    matches = [(match.group(), match.groups()[0]) for match in matches]
    if matches:
        scenes.append(os.path.join(folder, sorted(matches)[-1][0]))
```
Solution
- `root` should be capitalized, as it is a constant.
- You can shorten the code by using `sorted`, which returns a new list instead of sorting in place.
- `IGNORE`, in my opinion, reduces readability, as it is less obvious than `IGNORECASE`.
- Your regexes are pretty hard to read; I would extract them into constants and give them names.
- You make good use of list comprehensions; they shorten and simplify the code, good job!
- Performance may not be a primary concern here, but it is a good habit to use lazy generators: putting round parentheses instead of square brackets around the `matches` comprehension will not build the list, saving memory and time. The same goes for `scene_folders`.
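As a rough illustration of the points above, here is a minimal sketch with named, precompiled patterns and a lazy pipeline. The constant names and the `latest_version` helper are hypothetical, and the patterns assume the naming seen in the sample path (`e10_q04`, `e10_q04_s12`, `proj_e10_q04_s12_v04.aep`), so they are not identical to the regexes in the question. One caveat when switching to generators: a generator object is always truthy, so a test like `if matches:` no longer detects an empty result, and `.sort()` is not available on it. The sketch sidesteps both by using `max()` with a default, and it compares versions as integers so that `v10` ranks above `v9`.

```python
import re

# Hypothetical constant names; patterns are assumed from the sample path.
SEQUENCE_FOLDER = re.compile(r'e\d+_q\d+$', re.IGNORECASE)
SCENE_FOLDER = re.compile(r'e\d+_q\d+_s\d+$', re.IGNORECASE)
AEP_FILE = re.compile(r'proj_e\d+_q\d+_s\d+_v(\d+)\.aep$', re.IGNORECASE)


def latest_version(filenames):
    """Return the file name with the highest version number, or None."""
    # Generator expression: match objects are produced lazily, no list is built.
    matches = (AEP_FILE.match(name) for name in filenames)
    # Pair each name with its version as an int, skipping non-matching names.
    versioned = ((int(m.group(1)), m.group()) for m in matches if m)
    # max() with a default handles the "no matching file" case without
    # relying on the truthiness of a generator.
    return max(versioned, default=(None, None))[1]


names = [
    'proj_e10_q04_s12_v04.aep',
    'proj_e10_q04_s12_v10.aep',
    'proj_e10_q04_s12_v9.aep',
    'notes.txt',
]
print(latest_version(names))  # proj_e10_q04_s12_v10.aep
```

In the original script, `latest_version` could be called with `os.listdir(os.path.join(folder, "AEP Files"))` for each scene folder, and a `None` result would mark the scene as missing.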
Context
StackExchange Code Review Q#102484, answer score: 4