patternpythonMinor
Replacing the second word of some matching lines in a file
Viewed 0 times
thefilewordreplacingsomesecondmatchinglines
Problem
I have a file that at some point contains a line of the following form:
I want to write a Python function that replaces the text after "OutputFilename" and saves to a new file.
I have made the following attempt, which works, but I'm sure there must be a more efficient, Pythonic way to do this.
OutputFilename output.rootI want to write a Python function that replaces the text after "OutputFilename" and saves to a new file.
I have made the following attempt, which works, but I'm sure there must be a more efficient, Pythonic way to do this.
def set_cuts_configuration_output_filename_configuration(
cuts_filename = "cuts.txt",
output_filename_configuration = "output_new.root",
output_cuts_filename = "cuts_new.txt"
):
# Access the cuts configuration.
with open (cuts_filename, "r") as input_file:
cuts_file = input_file.readlines()
# Find the output filename line.
modified_file = []
for line in cuts_file:
if "OutputFilename" in line:
modified_file.append(line.split(" ")[0] + " " + output_filename_configuration + "\n")
else:
modified_file.append(line)
# Save the new file content.
output_cuts_file = open(output_cuts_filename, "w")
for line in modified_file:
output_cuts_file.write(line)
output_cuts_file.close()Solution
Here are some comments to your solution:
Refactored code
Here is two functions (within a program) to illustrate a better approach:
The
Also note that in the multiple variant, I've changed the matching on the line to be
Test run
I tested this using the following
And after running, I got the file
Notice the duplicated line near start, due to falsely changing the
And finally the file
This file has only the intended changes. The latter
- Try avoiding reading the entire files into memory – Using
readlines()means you read the entire file into memory, which can be eat a lot of memory. Usually it is better to read and process the file line by line
- Use
with open(...)for pythonic file reading/writing – This can be used both for reading and writing files, and can also be combined into a doublewithin your case to read and write at the same time
- Make functions a little more generic – To have a function to change just one explicitly named config value seems a little strange. Two natural extensions: 1) Allow for an arbitrary config value to be changed, 2) Allow for multiple config values to be changed in one go.
- Note that
"OutputFilename" in linematches in the entire line – If you write "OutputFilename" at end of line it matches, or at start, or as part of another word. This could lead to a lot of failing changes.
- Set the new config line directly – If you have a positive hit on the line you want to change, you know how the line is supposed to look, so no need to split, append and change it. Set it directly to the correct value.
- Add some blank lines in your script to make it easier to read – A lot of people are afraid of having blank lines in script, and that is said. It can make the script a lot easier to read.
Refactored code
Here is two functions (within a program) to illustrate a better approach:
def change_configuration_option(
option,
new_value,
sourcefile = "cuts.txt",
destinationfile = "cuts_new.txt"):
"""Change option_name in sourcefile to new_value in destinationfile."""
with open(sourcefile) as source, open(destinationfile, "w") as destination:
for line in source:
if option in line: # Buggy match on line...
destination.write('{} {}\n'.format(option, new_value))
else:
destination.write(line)
def change_configuration_options(
change_options,
sourcefile="cuts.txt",
destinationfile = "cuts_multiple.txt"):
"""From sourcefile to destinationfile do change_options."""
with open(sourcefile) as source, open(destinationfile, "w") as destination:
for line in source:
# For every line, match against new_options
for option in change_options:
if line.startswith(option + ' '):
destination.write('{} {}\n'.format(option, change_options[option]))
break
else:
destination.write(line)
def main():
"""Replace output.root with new_output.root in config file."""
change_configuration_option("OutputFilename", "new_output_file.txt")
change_configuration_options(
{ "OutputFilename" : "new_output_file.txt",
"YetAnother" : "bites the dust",
"AnotherConfig" : "And another one",
})
if __name__ == '__main__':
main()The
change_configuration_options() takes a dictionary of new configuration option changes, and it uses a little trick within the inner loop to check write non-matching options to the output file. That trick is the for ... else loop. If the for loop completes normally, the else will execute, but if a matching option is found and it break's out of the loop then the else: will not be executed.Also note that in the multiple variant, I've changed the matching on the line to be
line.startswith(option + " "), which limits the option to be at the start of the line, and a complete word followed by a space. This avoids false hits, as are present in your original code and in my change_configuration_option() for changing the single option. (Left as proof of concept, see tests below)Test run
I tested this using the following
cuts.txt file:MyOutputFilename Not_this_file
OutputFilename this_file_should_change
# Next line should have value "no change"
OutputFilenameAndSome no change
# Multiple
AnotherConfig AnotherValue
YetAnother config value with multiple spaces
# Flawed
FinalFlaw What if the value has OutputFilename as part of text?
And after running, I got the file
cuts_new.txt:OutputFilename new_output_file.txt
OutputFilename new_output_file.txt
# Next line should have value "no change"
OutputFilename new_output_file.txt
# Multiple
AnotherConfig AnotherValue
YetAnother config value with multiple spaces
# Flawed
OutputFilename new_output_file.txt
Notice the duplicated line near start, due to falsely changing the
OutputFilenameAndSome option, and the changed bottom line when comparing to original file.And finally the file
cuts_multiple.txt:MyOutputFilename Not_this_file
OutputFilename new_output_file.txt
# Next line should have value "no change"
OutputFilenameAndSome no change
# Multiple
AnotherConfig And another one
YetAnother bites the dust
# Flawed
FinalFlaw What if the value has OutputFilename as part of text?
This file has only the intended changes. The latter
Code Snippets
def change_configuration_option(
option,
new_value,
sourcefile = "cuts.txt",
destinationfile = "cuts_new.txt"):
"""Change option_name in sourcefile to new_value in destinationfile."""
with open(sourcefile) as source, open(destinationfile, "w") as destination:
for line in source:
if option in line: # Buggy match on line...
destination.write('{} {}\n'.format(option, new_value))
else:
destination.write(line)
def change_configuration_options(
change_options,
sourcefile="cuts.txt",
destinationfile = "cuts_multiple.txt"):
"""From sourcefile to destinationfile do change_options."""
with open(sourcefile) as source, open(destinationfile, "w") as destination:
for line in source:
# For every line, match against new_options
for option in change_options:
if line.startswith(option + ' '):
destination.write('{} {}\n'.format(option, change_options[option]))
break
else:
destination.write(line)
def main():
"""Replace output.root with new_output.root in config file."""
change_configuration_option("OutputFilename", "new_output_file.txt")
change_configuration_options(
{ "OutputFilename" : "new_output_file.txt",
"YetAnother" : "bites the dust",
"AnotherConfig" : "And another one",
})
if __name__ == '__main__':
main()Context
StackExchange Code Review Q#111639, answer score: 4
Revisions (0)
No revisions yet.