Creating a Python namedtuple class that forces some of its fields to be floats

Submitted by: @import:stackexchange-codereview

Problem

I've got a use case for namedtuple that requires some of the fields to be floats. I could require the code that's constructing the namedtuple to supply floats, but I'd like for the namedtuple class to be smart enough to convert strings it gets into floats for me (but only for a specified subset of fields). I have code to do this, but it's ugly:

from collections import namedtuple, OrderedDict

# This list is built at runtime in my application
_FIELDS_AND_TYPES = [("num1", True), ("num2", True), ("label", False)]
FIELDS_FORCE_FLOAT = OrderedDict(_FIELDS_AND_TYPES)

_MyNamedTuple = namedtuple("_MyNamedTuple", FIELDS_FORCE_FLOAT.keys())
class TypedNamedTuple(_MyNamedTuple):
    fields_force_float = FIELDS_FORCE_FLOAT
    def __new__(cls, *values_in):
        super_obj = super(_MyNamedTuple, cls)
        superclass = super_obj.__thisclass__
        values = [float(value) if cls.fields_force_float[fieldname] else value
                    for fieldname, value in zip(superclass._fields, values_in)]
        self = super_obj.__new__(cls, values)
        return self

print TypedNamedTuple("1.0", "2.0", "3.0")
# _MyNamedTuple(num1=1.0, num2=2.0, label='3.0')


Things I don't like about this code:

  • The output of print TypedNamedTuple("1.0", "2.0", "3.0") starts with _MyNamedTuple rather than TypedNamedTuple. I think this is a general problem with subclassing namedtuples, so fair enough. I could give both classes the same name to solve this.

  • My code has to pull in FIELDS_FORCE_FLOAT from a global variable.

  • TypedNamedTuple is inefficient (it runs zip and does a bunch of dictionary lookups). This is not so bad in my context but I'd like to know how to handle this "right."

  • If I wanted to create another namedtuple subclass that forces some of its args to be float, I'd basically be starting from scratch. My TypedNamedTuple is not reusable.

Is there a cleaner way to do this?

As for why I'm attempting to use namedtuples here at all (H/T @jonrsharpe),

Solution

Subclassing namedtuple is a reasonable way to reuse its code, so this
approach should be okay. It might also make things harder in the long
run, though, so a regular class, a metaclass, or a decorator could be an
easier place to enforce this kind of logic. I also see a problem with
converting values automatically: the behaviour has to be communicated
to callers and is one more thing to debug later on. But if it fits the
use case, why not. Since the tuple is immutable there's no normal way to
circumvent the conversion after construction, so I guess it is less of
an issue than it would be with a mutable class.

Also, have you looked at the collections.py source? I mean it's good
that it works, but honestly that code generation is a bit scary. It is, however, a way to generate all this stuff at runtime. If you don't need separate Python classes, you could also use a single class, pass in the valid fields as a separate parameter, and then use functions to create the separate tuple "types".
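
One possible reading of the "single class plus functions" suggestion is sketched below. It assumes Python 3, and the names Record, make_record, make_typed and make_untyped are hypothetical; only the field list comes from the question. The set of fields to convert is passed as an ordinary parameter, and each tuple "type" is just a constructor function with that parameter filled in.

```python
from collections import namedtuple
from functools import partial

# Hypothetical single class shared by every "type" of record.
Record = namedtuple("Record", ["num1", "num2", "label"])


def make_record(float_fields, *values):
    """Build a Record, converting the fields named in float_fields to float."""
    record = Record(*values)  # raises TypeError on a wrong number of values
    # _replace returns a new tuple with only the named fields swapped out.
    return record._replace(
        **{name: float(getattr(record, name)) for name in float_fields}
    )


# Each "type" is a constructor function with its own conversion rules.
make_typed = partial(make_record, frozenset(["num1", "num2"]))
make_untyped = partial(make_record, frozenset())

print(make_typed("1.0", "2.0", "3.0"))
# Record(num1=1.0, num2=2.0, label='3.0')
```

The trade-off is that every record prints as Record and shares one class, so this only fits if the calling code never needs to tell the "types" apart with isinstance.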

For the presented code / your questions I have some other suggestions:

  • You can override __repr__ to fix the "wrong" printed representation. I'm not quite sure exactly why it is kind of hardcoded in collections.py, but generally I'd prefer things to print their own class name instead, as written below. The formatting is copied from the standard namedtuple.

  • If it's happening once, that is not nice, but certainly not the biggest issue. Again, you could follow the code generation model and pass it in as a parameter, e.g. typednamedtuple('TypedNamedTuple', FIELDS_AND_TYPES).

  • Use izip instead of zip so that no intermediate list of pairs is allocated. Otherwise looks good? I can't see how you would otherwise do the conversion.

  • Subclass and override the field? Note the AnotherTypedNamedTuple below. Otherwise see above.
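
The typednamedtuple(...) call mentioned in the list above is not defined anywhere in this answer. A minimal sketch of what such a factory could look like follows, assuming Python 3 (the listings in this entry are Python 2) and assuming the second argument is a list of (field_name, force_float) pairs like the one in the question. The function name and its signature are hypothetical.

```python
from collections import namedtuple


def typednamedtuple(typename, fields_and_types):
    """Return a namedtuple subclass that converts the flagged fields to float.

    Hypothetical helper: fields_and_types is a list of
    (field_name, force_float) pairs.
    """
    field_names = [name for name, _ in fields_and_types]
    # One converter per position, computed once, so __new__ does no lookups.
    converters = tuple(float if force else None for _, force in fields_and_types)
    base = namedtuple(typename, field_names)

    def __new__(cls, *values):
        if len(values) != len(converters):
            raise TypeError("{} takes {} values, got {}".format(
                typename, len(converters), len(values)))
        converted = [value if convert is None else convert(value)
                     for convert, value in zip(converters, values)]
        return base.__new__(cls, *converted)

    # The subclass reuses typename, so the printed name matches as well.
    return type(typename, (base,), {"__slots__": (), "__new__": __new__})


TypedNamedTuple = typednamedtuple(
    "TypedNamedTuple", [("num1", True), ("num2", True), ("label", False)])
print(TypedNamedTuple("1.0", "2.0", "3.0"))
# TypedNamedTuple(num1=1.0, num2=2.0, label='3.0')
```

Because the field list is an argument, nothing is read from a module-level global, and a second typed tuple is one more call to the factory rather than a new hand-written class.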

Also:

  • Don't assign to self. That looks super wrong.



from collections import namedtuple, OrderedDict
from itertools import izip

# This list is built at runtime in my application
_FIELDS_AND_TYPES = [("num1", True), ("num2", True), ("label", False)]
FIELDS_FORCE_FLOAT = OrderedDict(_FIELDS_AND_TYPES)

_MyNamedTuple = namedtuple("_MyNamedTuple", FIELDS_FORCE_FLOAT.keys())
class TypedNamedTuple(_MyNamedTuple):
    fields_force_float = FIELDS_FORCE_FLOAT

    def __new__(cls, *values_in):
        # super(_MyNamedTuple, cls) skips _MyNamedTuple.__new__ and resolves to
        # tuple.__new__, which takes a single iterable of values.
        super_obj = super(_MyNamedTuple, cls)
        superclass = super_obj.__thisclass__
        values = [float(value) if cls.fields_force_float[fieldname] else value
                  for fieldname, value in izip(superclass._fields, values_in)]
        return super_obj.__new__(cls, values)

    def __repr__(self):
        # Same format as the standard namedtuple, but with the actual class name.
        return "{}({})".format(
            self.__class__.__name__,
            ', '.join("{}=%r".format(name) for name in self._fields) % self)

print TypedNamedTuple("1.0", "2.0", "3.0")
# TypedNamedTuple(num1=1.0, num2=2.0, label='3.0')

class AnotherTypedNamedTuple(TypedNamedTuple):
    # Overriding the class attribute changes which fields get converted.
    fields_force_float = OrderedDict([("num1", False), ("num2", False), ("label", False)])

print AnotherTypedNamedTuple("1.0", "2.0", "3.0")
# AnotherTypedNamedTuple(num1='1.0', num2='2.0', label='3.0')
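
The listing above targets Python 2 (print statement, itertools.izip). For readers on Python 3, a rough equivalent might look like the sketch below. It is an adaptation, not the answer's own code: the built-in zip replaces izip, and a frozenset named float_fields (a name chosen here for brevity) stands in for the OrderedDict of flags.

```python
from collections import namedtuple

_MyNamedTuple = namedtuple("_MyNamedTuple", ["num1", "num2", "label"])


class TypedNamedTuple(_MyNamedTuple):
    __slots__ = ()  # no per-instance __dict__, like a plain namedtuple
    # Assumption: a set of names replaces the OrderedDict of True/False flags.
    float_fields = frozenset(["num1", "num2"])

    def __new__(cls, *values_in):
        # zip is lazy in Python 3, so izip is no longer needed. It stops at
        # the shorter input, so surplus values are dropped, as in the listing
        # above; too few values raise TypeError in the base __new__.
        values = [float(value) if name in cls.float_fields else value
                  for name, value in zip(cls._fields, values_in)]
        return super().__new__(cls, *values)

    def __repr__(self):
        pairs = ", ".join("{}={!r}".format(name, value)
                          for name, value in zip(self._fields, self))
        return "{}({})".format(type(self).__name__, pairs)


class AnotherTypedNamedTuple(TypedNamedTuple):
    __slots__ = ()
    float_fields = frozenset()  # convert nothing


print(TypedNamedTuple("1.0", "2.0", "3.0"))
# TypedNamedTuple(num1=1.0, num2=2.0, label='3.0')
print(AnotherTypedNamedTuple("1.0", "2.0", "3.0"))
# AnotherTypedNamedTuple(num1='1.0', num2='2.0', label='3.0')
```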

Context

StackExchange Code Review Q#84492, answer score: 2
