HiveBrain v1.2.0
Get Started
← Back to all entries
patternpythonMinor

Replacing the second word of some matching lines in a file

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
thefilewordreplacingsomesecondmatchinglines

Problem

I have a file that at some point contains a line of the following form:

OutputFilename output.root


I want to write a Python function that replaces the text after "OutputFilename" and saves to a new file.

I have made the following attempt, which works, but I'm sure there must be a more efficient, Pythonic way to do this.

def set_cuts_configuration_output_filename_configuration(
    cuts_filename                 = "cuts.txt",
    output_filename_configuration = "output_new.root",
    output_cuts_filename          = "cuts_new.txt"
    ):
    # Access the cuts configuration.
    with open (cuts_filename, "r") as input_file:
        cuts_file = input_file.readlines()
    # Find the output filename line.
    modified_file = []
    for line in cuts_file:
        if "OutputFilename" in line:
            modified_file.append(line.split(" ")[0] + " " + output_filename_configuration + "\n")
        else:
            modified_file.append(line)
    # Save the new file content.
    output_cuts_file = open(output_cuts_filename, "w")
    for line in modified_file:
        output_cuts_file.write(line)
    output_cuts_file.close()

Solution

Here are some comments to your solution:

  • Try avoiding reading the entire files into memory – Using readlines() means you read the entire file into memory, which can be eat a lot of memory. Usually it is better to read and process the file line by line



  • Use with open(...) for pythonic file reading/writing – This can be used both for reading and writing files, and can also be combined into a double with in your case to read and write at the same time



  • Make functions a little more generic – To have a function to change just one explicitly named config value seems a little strange. Two natural extensions: 1) Allow for an arbitrary config value to be changed, 2) Allow for multiple config values to be changed in one go.



  • Note that "OutputFilename" in line matches in the entire line – If you write "OutputFilename" at end of line it matches, or at start, or as part of another word. This could lead to a lot of failing changes.



  • Set the new config line directly – If you have a positive hit on the line you want to change, you know how the line is supposed to look, so no need to split, append and change it. Set it directly to the correct value.



  • Add some blank lines in your script to make it easier to read – A lot of people are afraid of having blank lines in script, and that is said. It can make the script a lot easier to read.



Refactored code

Here is two functions (within a program) to illustrate a better approach:

def change_configuration_option(
    option,
    new_value,
    sourcefile = "cuts.txt",
    destinationfile = "cuts_new.txt"):
    """Change option_name in sourcefile to new_value in destinationfile."""

    with open(sourcefile) as source, open(destinationfile, "w") as destination:
        for line in source:

            if option in line: # Buggy match on line...
                destination.write('{} {}\n'.format(option, new_value))
            else:
                destination.write(line)

def change_configuration_options(
    change_options,
    sourcefile="cuts.txt",
    destinationfile = "cuts_multiple.txt"):
    """From sourcefile to destinationfile do change_options."""

    with open(sourcefile) as source, open(destinationfile, "w") as destination:
        for line in source:

            # For every line, match against new_options
            for option in change_options:
                if line.startswith(option + ' '):
                    destination.write('{} {}\n'.format(option, change_options[option]))
                    break

            else:
                destination.write(line)

def main():
    """Replace output.root with new_output.root in config file."""

    change_configuration_option("OutputFilename", "new_output_file.txt")

    change_configuration_options(
        { "OutputFilename" : "new_output_file.txt",
          "YetAnother" : "bites the dust",
          "AnotherConfig" : "And another one",
         })

if __name__ == '__main__':
    main()


The change_configuration_options() takes a dictionary of new configuration option changes, and it uses a little trick within the inner loop to check write non-matching options to the output file. That trick is the for ... else loop. If the for loop completes normally, the else will execute, but if a matching option is found and it break's out of the loop then the else: will not be executed.

Also note that in the multiple variant, I've changed the matching on the line to be line.startswith(option + " "), which limits the option to be at the start of the line, and a complete word followed by a space. This avoids false hits, as are present in your original code and in my change_configuration_option() for changing the single option. (Left as proof of concept, see tests below)

Test run

I tested this using the following cuts.txt file:

MyOutputFilename Not_this_file
OutputFilename this_file_should_change
# Next line should have value "no change"
OutputFilenameAndSome no change
# Multiple
AnotherConfig AnotherValue
YetAnother config value with multiple spaces
# Flawed
FinalFlaw What if the value has OutputFilename as part of text?


And after running, I got the file cuts_new.txt:

OutputFilename new_output_file.txt
OutputFilename new_output_file.txt
# Next line should have value "no change"
OutputFilename new_output_file.txt
# Multiple
AnotherConfig AnotherValue
YetAnother config value with multiple spaces
# Flawed
OutputFilename new_output_file.txt


Notice the duplicated line near start, due to falsely changing the OutputFilenameAndSome option, and the changed bottom line when comparing to original file.

And finally the file cuts_multiple.txt:

MyOutputFilename Not_this_file
OutputFilename new_output_file.txt
# Next line should have value "no change"
OutputFilenameAndSome no change
# Multiple
AnotherConfig And another one
YetAnother bites the dust
# Flawed
FinalFlaw What if the value has OutputFilename as part of text?


This file has only the intended changes. The latter

Code Snippets

def change_configuration_option(
    option,
    new_value,
    sourcefile = "cuts.txt",
    destinationfile = "cuts_new.txt"):
    """Change option_name in sourcefile to new_value in destinationfile."""

    with open(sourcefile) as source, open(destinationfile, "w") as destination:
        for line in source:

            if option in line: # Buggy match on line...
                destination.write('{} {}\n'.format(option, new_value))
            else:
                destination.write(line)


def change_configuration_options(
    change_options,
    sourcefile="cuts.txt",
    destinationfile = "cuts_multiple.txt"):
    """From sourcefile to destinationfile do change_options."""

    with open(sourcefile) as source, open(destinationfile, "w") as destination:
        for line in source:

            # For every line, match against new_options
            for option in change_options:
                if line.startswith(option + ' '):
                    destination.write('{} {}\n'.format(option, change_options[option]))
                    break

            else:
                destination.write(line)


def main():
    """Replace output.root with new_output.root in config file."""

    change_configuration_option("OutputFilename", "new_output_file.txt")

    change_configuration_options(
        { "OutputFilename" : "new_output_file.txt",
          "YetAnother" : "bites the dust",
          "AnotherConfig" : "And another one",
         })

if __name__ == '__main__':
    main()

Context

StackExchange Code Review Q#111639, answer score: 4

Revisions (0)

No revisions yet.