setup.py with data_files and __author__ parsing

Submitted by: @import:stackexchange-codereview

Problem

All my projects have a fairly similar package layout:

package_name/
package_name/setup.py
package_name/package_name/__init__.py
package_name/package_name/data/foo.txt


package_name/package_name/__init__.py contains:

__version__ = '0.0.1'
__author__ = 'Alec Taylor '


With a setup.py like:

from setuptools import setup, find_packages
from os import path, listdir
from functools import partial
from itertools import imap, ifilter
from ast import parse
from pip import __file__ as pip_loc

if __name__ == '__main__':
    package_name = 'foooooo'

    f_for = partial(path.join, path.dirname(__file__), package_name)
    d_for = partial(path.join,path.dirname(path.dirname(pip_loc)), package_name)
    to_funcs = lambda name: (partial(path.join, f_for(name)),
                             partial(path.join, d_for(name)))

    _data_join, _data_install_dir = to_funcs('data')

    get_vals = lambda var0, var1: imap(
                   lambda buf: next(imap(lambda e: e.value.s, parse(buf).body)),
                       ifilter(lambda line: line.startswith(var0) or
                                            line.startswith(var1), f))

    with open(path.join(package_name, '__init__.py')) as f:
        __author__, __version__ = get_vals('__version__', '__author__')

    setup(name=package_name,
        author=__author__,
        version=__version__,
        test_suite=package_name + '.tests',
        packages=find_packages(),
        package_dir={package_name: package_name},
        data_files=[
            (_data_install_dir(), map(_data_join, listdir(_data_join())))
        ]
    )


This setup.py has been tested on Windows and Linux with Python 2.7.

PS: I'm planning to upgrade all my packages to work with 2.7 and 3+.

Solution

Your use of lambdas and partials is really hard to understand. If you need them to be functions, just make them functions; or, even better, just write the code out. Some of them are one-off helpers that add no value and should be inlined. The others are very hard to read and should be defined as normal functions.

First let's look at how you get the values from the file. I'm pretty sure I know what the general idea of your algorithm is.

for line in source file
    if the line has one of the values we want
        parse the line, and save the value


This is much easier to implement like this:

import ast
from os import path

def extract_values(source_file, desired_vars):
    with open(source_file) as f:
        for line in f:
            if any(line.startswith(var) for var in desired_vars):
                # Each matching line is a simple assignment: name = 'string'
                parsed = ast.parse(line).body[0]
                yield parsed.targets[0].id, parsed.value.s

filename = path.join(package_name, '__init__.py')
metadata_names = '__author__', '__version__'
values = dict(extract_values(filename, metadata_names))


Using all of your lambdas and iterables doesn't seem to give much value. I assume you were doing this because the file is potentially big and you don't want it all in memory? If so, don't worry about it: iterating with for line in f reads the file lazily, one line at a time, with buffered I/O. Then we can use a generator and construct a dict from it. This is much more readable, and is unlikely to make much of a difference in performance. More importantly, unless you're doing something ridiculous in setup.py, performance shouldn't really matter.
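
Since you plan to support Python 3 as well, here is a hedged sketch of the same idea that reads the value with ast.literal_eval instead of the .s attribute, which is deprecated in recent Python 3 releases. The name extract_metadata and the sample file contents are made up for illustration; treat this as one possible port, not as a drop-in replacement.

```python
import ast
import os
import tempfile


def extract_metadata(source_file, desired_vars):
    """Yield (name, value) for simple `name = literal` lines in source_file."""
    with open(source_file) as f:
        for line in f:
            if any(line.startswith(var) for var in desired_vars):
                node = ast.parse(line).body[0]
                # Only handle plain assignments such as: __version__ = '0.0.1'
                if isinstance(node, ast.Assign) and isinstance(node.targets[0], ast.Name):
                    # literal_eval accepts an AST node as well as a string
                    yield node.targets[0].id, ast.literal_eval(node.value)


# Hypothetical __init__.py contents, written to a temporary file for the demo
with tempfile.NamedTemporaryFile('w', suffix='.py', delete=False) as tmp:
    tmp.write("__version__ = '0.0.1'\n__author__ = 'Alec Taylor'\nimport os\n")

metadata = dict(extract_metadata(tmp.name, ('__author__', '__version__')))
os.remove(tmp.name)
print(metadata)
```

Looking values up by name in the resulting dict also means the order of the assignments in __init__.py no longer matters.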

Next let's look at how you get the data files. I cannot figure out why you did it this way. This line:

data_files=[
    (_data_install_dir(), map(_data_join, listdir(_data_join())))
]


is basically nonsensical. It calls _data_join() with no arguments to get the data directory, lists that directory, and then maps _data_join over the result to turn each filename back into a full path. What we really have here are two constants: a data install directory and a data source directory. We want to list all of the files in the data source directory and pair them with the data install directory. We can just do something like this:

data_files = [(data_install_folder, [path.join(data_file_source, filename) for filename in listdir(data_file_source)])]
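
To make the pairing concrete, here is a minimal sketch of a helper that builds the data_files value from the two directories. The function name collect_data_files, the throwaway demo directory, and the decision to skip subdirectories are assumptions for illustration, not something taken from your setup.py.

```python
import os
import tempfile


def collect_data_files(source_dir, install_dir):
    """Return a one-entry data_files list: [(install_dir, [path, ...])]."""
    files = [
        os.path.join(source_dir, name)
        for name in sorted(os.listdir(source_dir))
        # Skip subdirectories; only regular files are listed
        if os.path.isfile(os.path.join(source_dir, name))
    ]
    return [(install_dir, files)]


# Demo with a temporary directory standing in for package_name/data
demo_dir = tempfile.mkdtemp()
for name in ('foo.txt', 'bar.txt'):
    open(os.path.join(demo_dir, name), 'w').close()
os.mkdir(os.path.join(demo_dir, 'nested'))

entries = collect_data_files(demo_dir, 'foooooo/data')
print(entries)
```

Sorting the listing is optional; it only keeps the result stable, because os.listdir returns names in arbitrary order.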


Putting this all together, we have something like this:

from setuptools import setup, find_packages
import os
import ast

from pip import __file__ as pip_loc


def extract_values(source_file, desired_vars):
    # Yield (name, value) pairs for lines such as: __version__ = '0.0.1'
    with open(source_file) as f:
        for line in f:
            if any(line.startswith(var) for var in desired_vars):
                parsed = ast.parse(line).body[0]
                yield parsed.targets[0].id, parsed.value.s


if __name__ == '__main__':
    package_name = 'foooooo'

    filename = os.path.join(package_name, '__init__.py')
    metadata_names = '__author__', '__version__'
    values = dict(extract_values(filename, metadata_names))

    # Source is <setup.py dir>/<package>/data; target sits next to pip's own package directory
    data_file_source = os.path.join(os.path.dirname(__file__), package_name, 'data')
    data_install_folder = os.path.join(os.path.dirname(os.path.dirname(pip_loc)), package_name, 'data')

    all_data_files = [os.path.join(data_file_source, data_file) for data_file in os.listdir(data_file_source)]
    data_files = [(data_install_folder, all_data_files)]

    setup(name=package_name,
        author=values['__author__'],
        version=values['__version__'],
        test_suite=package_name + '.tests',
        packages=find_packages(),
        package_dir={package_name: package_name},
        data_files=data_files
    )


As a quick note, if you want to get things working for both Python 2 and Python 3, imap and ifilter don't exist in Python 3, where the built-in map and filter are already lazy. I usually like to make a file like compatibility.py that I put things like that into, then just import from that.

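try:
    from itertools import imap as map, ifilter as filter
except ImportError:
    # Python 3: re-export the built-ins so that
    # "from compatibility import filter, map" still works
    from builtins import map, filter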
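
A usage sketch of this kind of shim, written inline so it runs as a single script. The names lazy_map and lazy_filter are hypothetical, chosen here so the built-ins are not shadowed; on Python 3 they simply alias the built-in map and filter.

```python
try:
    # Python 2: itertools provides the lazy variants
    from itertools import imap as lazy_map, ifilter as lazy_filter
except ImportError:
    # Python 3: the built-in map and filter already return iterators
    lazy_map, lazy_filter = map, filter

lines = ["__version__ = '0.0.1'", "import os", "__author__ = 'Alec Taylor'"]

# Nothing is evaluated until the result is consumed by list()
wanted = lazy_filter(lambda line: line.startswith('__'), lines)
names = list(lazy_map(lambda line: line.split(' = ')[0], wanted))
print(names)
```

Binding the fallback explicitly in the except branch means the two names exist on either interpreter, so callers never need their own version check.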

Context

StackExchange Code Review Q#138495, answer score: 4
