"Compare" program for Eclipse preference files
Problem
I am trying to write a simple (trivial?) "compare" program for Eclipse preferences files.
Eclipse preferences files take more or less this form:

    # optional comment line
    /a/sequence/of/path/elements=string
    /another/sequence/of/path/elements=42
    # ... (possibly repeated)

Let's call the path sequences "keys" and what follows the = sign "values".

The rules of the program should be:
- Exit upon detecting an invalid number of arguments (must be 2)
- argument1 and argument2 are the files to compare
- Exit if any of the input files are empty
- One preference entry per line
- Lines with a path key will have a path value, guaranteed

The output should be as follows:

Part 1

All lines from argument1 which have keys not present in argument2, like so:

    /path 42
    /path2 banana

Blank line

Part 2

All keys that are present in both argument1 and argument2, like so:

    /shared (valueInArgument1, valueInArgument2)
    /shared2 (valueInArgument1, valueInArgument2)

Blank line

Part 3

All lines from argument2 which have keys not present in argument1, like so:

    /path3 24
    /path4 ananab

Notice that Part 1, Part 2, and Part 3 must all be sorted.

I would like to get something:
- Pythonic
- Efficient (do as little work as possible computationally, use as little space as possible)
- Clear and readable from a logical point of view
- Correct (gets the right result even in edge cases)
- Instructive (uses data structures and algorithms properly)

Here's my attempt:
```
#!/usr/bin/env /opt/local/bin/python2.7

import sys
import os
import re
import pprint

COMMAND_SYNTAX_ERROR = 2
EMPTY_PREFS_FILE_ERROR = 3

PREFS_REGEX_PATTERN = '^(.*?)=(.*)$'
PREFS_REGEX = re.compile(PREFS_REGEX_PATTERN)


def parse_prefs_line(line):
    regex_result = PREFS_REGEX.match(line)
    if not regex_result:
        return None, None
    return regex_result.group(1), regex_result.group(2)

arguments = sys.argv
n_arguments = len(arguments)
if n_arguments != 3:
    print 'usage: e
```
Solution
Don't optimize unless your profiler says so.
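One way to follow that advice is to measure before changing anything. Here is a minimal sketch using `cProfile` from the standard library. It assumes Python 3, and `compare_keys` and `profile_call` are hypothetical names used only for illustration; `compare_keys` is a stand-in for whichever step you want to measure.

```python
import cProfile
import io
import pstats


def compare_keys(d1, d2):
    """Hypothetical stand-in for the comparison step being measured."""
    only_1 = sorted(d1.keys() - d2.keys())
    both = sorted(d1.keys() & d2.keys())
    only_2 = sorted(d2.keys() - d1.keys())
    return only_1, both, only_2


def profile_call(func, *args):
    """Run func(*args) under cProfile; return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Show the five entries with the largest cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


if __name__ == "__main__":
    # Synthetic data, only to give the profiler something to measure.
    d1 = {"/key%d" % i: str(i) for i in range(0, 20000)}
    d2 = {"/key%d" % i: str(i) for i in range(10000, 30000)}
    (only_1, both, only_2), report = profile_call(compare_keys, d1, d2)
    print(len(only_1), len(both), len(only_2))
    print(report)
```

If the report shows the time going to file reading or sorting rather than to the dictionary work, that is where any optimization effort belongs.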
You could start with the simplest code that works, e.g., here's a straightforward translation of your requirements:

```
import sys

def get_entries(filename):
    with open(filename) as file:
        # extract 'key = value' entries
        entries = (map(str.strip, line.partition('=')[::2]) for line in file)
        # note: if keys are repeated the last value wins
        # enforce non-empty values, skip comments
        return {key: value for key, value in entries
                if value and not key.startswith('#')}

if len(sys.argv) != 3:
    sys.exit(2) # wrong number of arguments

d1, d2 = map(get_entries, sys.argv[1:])
if not (d1 and d2):
    sys.exit(1) # no entries in a file

def print_entries(keys, d, d2=None):
    for k in sorted(keys):
        value = d[k] if d2 is None else "(%s, %s)" % (d[k], d2[k])
        print k, value
    print

print_entries(d1.viewkeys() - d2.viewkeys(), d1)
print_entries(d1.viewkeys() & d2.viewkeys(), d1, d2)
print_entries(d2.viewkeys() - d1.viewkeys(), d2)
```

You could compare the results and the performance with your code.

You could also compare it with the output of the comm command from coreutils:

    $ comm <(sort file1) <(sort file2)
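The code above targets Python 2.7 (`print` statement, `viewkeys()`). If you ever move to Python 3, the same idea could be sketched roughly as follows. This is an illustration under that assumption, not a drop-in replacement: `parse_entries` and `compare` are hypothetical names, and the functions take and return plain lists of lines so they can be checked without touching files.

```python
def parse_entries(lines):
    """Build {key: value} from 'key=value' lines, skipping comments.

    Mirrors the code above: if a key repeats, the last value wins,
    and lines without a non-empty value are dropped.
    """
    entries = {}
    for line in lines:
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if value and not key.startswith("#"):
            entries[key] = value
    return entries


def compare(d1, d2):
    """Return the three sorted output parts as lists of text lines."""
    only_1 = ["%s %s" % (k, d1[k]) for k in sorted(d1.keys() - d2.keys())]
    shared = ["%s (%s, %s)" % (k, d1[k], d2[k])
              for k in sorted(d1.keys() & d2.keys())]
    only_2 = ["%s %s" % (k, d2[k]) for k in sorted(d2.keys() - d1.keys())]
    return only_1, shared, only_2


if __name__ == "__main__":
    # Inline sample data instead of files, to keep the sketch self-contained.
    file1 = ["# comment", "/path=42", "/shared=a"]
    file2 = ["/shared=b", "/path3=24"]
    for part in compare(parse_entries(file1), parse_entries(file2)):
        for line in part:
            print(line)
        print()  # blank line after each part
```

In Python 3, `dict.keys()` already returns a set-like view, so the `-` and `&` operators work on it directly. To read real files you would pass an open file object to `parse_entries`, since iterating over it yields lines.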
Context
StackExchange Code Review Q#14774, answer score: 5