Python path-breaking script for use in R

Submitted by: @import:stackexchange-codereview··

Problem

R does not support line continuation the way Python does, and it's annoying to have a long file path like


/media/user/data/something-not-very-important-but-super-long/some-curious-secret-file.pdf

that can't be broken across multiple lines.

The workaround is to use the file.path function:

file.path(
  "/root",
  "folder",
  "file.ods")


Manually editing is tedious, so here is a Python script that does the job for you:

#!/usr/bin/env python3

import sys

x = sys.argv[1]
# split by unix path separator
y = x.split("/")
# get rid of empty strings
y = [i for i in y if i]

# preserve / if it's an absolute path
if x.startswith("/"):
    y[0] = "/" + y[0]

# quote everything
y = ["'" + i + "'" for i in y]

res = "file.path(" + ",\n    ".join(y) + ")"
print(res)

Solution

Always throw things in main:

def main():
    ...  # code

main()


It might seem pointless, but it keeps variables out of the global namespace, which pays off as soon as you add another function somewhere.
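As a minimal sketch of what this buys you, the splitting logic can live in functions so that no intermediate variables end up at module level. The `if __name__ == "__main__":` guard is a common Python convention added here as an assumption, and `split_path` is a hypothetical helper name:

```python
import sys


def split_path(path):
    """Split a Unix-style path into its non-empty components."""
    # `components` is local to this function, so it never becomes a global
    components = [part for part in path.split("/") if part]
    return components


def main(argv):
    # argv[0] is the script name; argv[1] is the path given by the user
    if len(argv) < 2:
        print("usage: prog.py <pathname>", file=sys.stderr)
        return 1
    print(split_path(argv[1]))
    return 0


# Assumption: only run main() when executed as a script, not when imported
if __name__ == "__main__":
    main(sys.argv)
```

With this layout, `split_path` can be imported and reused without triggering any argument parsing or printing.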

Using

x = sys.argv[1]


is acceptable for a trivial script, but it's not much harder to use docopt and implement a proper command-line parser:

"""
Name Of Program.

Usage: prog.py <pathname>
"""

import docopt

def main():
    args = docopt.docopt(__doc__)

    x = args["<pathname>"]


Trivial, but it gives you --help, error reporting and sane handling of bad input, nearly for free.
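If installing the third-party docopt package is not an option, the standard library's argparse offers comparable `--help` output and error reporting. The following is a sketch of that swapped-in alternative, not the approach recommended above; `build_parser` and the `pathname` argument name are illustrative choices:

```python
import argparse


def build_parser():
    """Build a parser that accepts exactly one positional path argument."""
    parser = argparse.ArgumentParser(
        description="Convert a Unix path into an R file.path() call."
    )
    # Hypothetical argument name chosen for this sketch
    parser.add_argument("pathname", help="Unix-style path to split")
    return parser


def main(argv=None):
    # argv=None makes argparse read sys.argv[1:]; a list can be passed for testing
    args = build_parser().parse_args(argv)
    return args.pathname
```

Calling `main([])` makes argparse print a usage message and exit with status 2, which is the "sane handling of bad input" part.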

The path splitting can be simplified a little with pathlib:

parts = pathlib.PurePosixPath(args["<pathname>"]).parts

if len(parts) >= 2 and parts[0] == "/":
    parts = ("/" + parts[1],) + parts[2:]


Plain pathlib.Path would also work, but on Windows it parses the argument with Windows path rules, so PurePosixPath is the safer choice.
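To see what that snippet operates on, here is a small sketch of how PurePosixPath splits absolute and relative paths, and how the root is folded into the first component. `r_components` is a hypothetical helper name:

```python
import pathlib


def r_components(path):
    """Split a POSIX path, folding a leading "/" into the first component."""
    parts = pathlib.PurePosixPath(path).parts
    # For an absolute path, parts starts with "/" as its own element
    if len(parts) >= 2 and parts[0] == "/":
        parts = ("/" + parts[1],) + parts[2:]
    return parts


print(pathlib.PurePosixPath("/root/folder/file.ods").parts)
# ('/', 'root', 'folder', 'file.ods')
print(r_components("/root/folder/file.ods"))
# ('/root', 'folder', 'file.ods')
print(r_components("folder/file.ods"))
# ('folder', 'file.ods')
```

Because PurePosixPath never touches the file system and always applies POSIX rules, the result is the same on every platform. A bare `/` is returned unchanged as `('/',)`, a case where the `y[0]` indexing in the question's script would raise an IndexError.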

Your quoting:

parts = ["'" + i + "'" for i in parts]


is better done with repr, which also copes with quotes and backslashes inside a component:

parts = map(repr, parts)


and the formatting should use .format:

res = "file.path({})".format(",\n    ".join(parts))


This gives

#!/usr/bin/env python3

"""
Name Of Program.

Usage: prog.py <pathname>
"""

import docopt
import pathlib

def main():
    args = docopt.docopt(__doc__)
    parts = pathlib.PurePosixPath(args["<pathname>"]).parts

    if len(parts) >= 2 and parts[0] == "/":
        parts = ("/" + parts[1],) + parts[2:]

    args = ",\n    ".join(map(repr, parts))
    print("file.path({})".format(args))

main()


Technically, PEP 8 says the third-party docopt import should sit in its own group after the standard-library pathlib import, but it looks nicer this way for now.
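To try the final logic without installing docopt, the core can be pulled into a pure function. This is a sketch under that assumption; `to_file_path` is a hypothetical name, and the second call uses a made-up path to show how repr handles an apostrophe:

```python
import pathlib


def to_file_path(path):
    """Return an R file.path(...) call, one component per line."""
    parts = pathlib.PurePosixPath(path).parts
    if len(parts) >= 2 and parts[0] == "/":
        parts = ("/" + parts[1],) + parts[2:]
    # repr() switches to double quotes when a part contains a single quote
    return "file.path({})".format(",\n    ".join(map(repr, parts)))


print(to_file_path("/root/folder/file.ods"))
# file.path('/root',
#     'folder',
#     'file.ods')
print(to_file_path("/home/user/it's here.txt"))
# file.path('/home',
#     'user',
#     "it's here.txt")
```

R accepts both single-quoted and double-quoted string literals, so either form works for ordinary names. Python and R escape rules are not identical, though, so paths with unusual characters are worth checking by hand.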

Context

StackExchange Code Review Q#77548, answer score: 4
