Working with namedtuple in Python

A quick refresher on Python namedtuples, what makes them special, and how to properly implement the methods and attributes they provide.

Background

Python's namedtuple is a useful tool that extends the built-in type tuple with features that make working with simple data structures easy. While I encounter and implement namedtuple often, it's been a while since I've taken a peek under the hood. This discussion serves as a refresher (mostly for me), and hopefully you learn something new.

Examples presented assume Python 3.12.

What is namedtuple Anyway?

It might be useful to clarify exactly what namedtuple is. Perhaps our experience with namedtuple is purely practical and we haven't considered what it is we're importing from collections. Let's consider the following example, which we'll use throughout the discussion.

from collections import namedtuple


Point = namedtuple('Point', ['x', 'y'])

Here I'm using namedtuple to create a data structure that represents a Cartesian Coordinate with fields x and y and labeling the resulting class "Point."

So what's happening in this code snippet? I know the built-in type tuple is a class and my resulting Point appears to be one too.

import inspect
from collections import namedtuple


Point = namedtuple('Point', ['x', 'y'])

print(inspect.isclass(tuple))
print(inspect.isclass(Point))

# output
True
True

So then namedtuple is a subclass of tuple? Actually, no.

import inspect
from collections import namedtuple


print(inspect.isclass(namedtuple))

# output
False

A closer look at the naming offers a hint: under PEP 8, a lowercase name like namedtuple suggests a function rather than a class (although exceptions to the convention exist: think str, int, and tuple itself). More specifically, namedtuple is a class factory. This factory function pattern is how namedtuple can start with the tuple class, extend it, and return a new custom class.

However, when you hear namedtuple conversationally, the intended meaning is likely the resulting class that the factory function creates. Just be aware of the distinction.
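As a quick check of that distinction, here is a small sketch that reuses the Point example and only relies on the inspect module: the factory should report as a plain function, while what it returns is a class derived from tuple.

```python
import inspect
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])

# The factory itself is an ordinary function...
print(inspect.isfunction(namedtuple))
# ...while the object it returns is a class that subclasses tuple.
print(issubclass(Point, tuple))

# output
# True
# True
```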

Now let's see how to use namedtuple.

How to Use namedtuple

Let's examine the factory function's signature to understand how to use it.

import inspect
from collections import namedtuple


inspect.signature(namedtuple)

# output
<Signature (typename, field_names, *, rename=False, defaults=None, module=None)>

Above we see we have two positional arguments typename and field_names.

The first argument typename is what we want to call the resulting class. The typename should match the variable name you're assigning to: Point = namedtuple('Point', ...). Ultimately, this value will become the resulting class's attribute value for __name__.

from collections import namedtuple


Point = namedtuple('Point', ['x', 'y'])

print(Point.__name__)

# output
Point

The second required positional argument is field_names. In my example, I provided a list of strings representing the field names I desired: 'x' and 'y'. For convenience, namedtuple actually gives us a bit of flexibility in how we pass the field_names argument. In addition to a sequence, you can pass either a comma or space separated string.

The field_names are a sequence of strings such as ['x', 'y']. Alternatively, field_names can be a single string with each fieldname separated by whitespace and/or commas, for example 'x y' or 'x, y'.

(docs)

We can see this implementation in the source.

# LOC 387 in the namedtuple factory function, see permalink above as "source"

if isinstance(field_names, str):
    field_names = field_names.replace(',', ' ').split()
field_names = list(map(str, field_names))

Using these options, I could have written my example as Point = namedtuple('Point', 'x, y') or even Point = namedtuple('Point', 'x y') with only spaces. Pretty cool.
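To convince ourselves the three spellings are interchangeable, the sketch below builds the class each way and compares the _fields attribute (covered later in this article). The variable names are mine and purely illustrative.

```python
from collections import namedtuple

# Three spellings of field_names that should yield identical fields.
from_list = namedtuple('Point', ['x', 'y'])
from_commas = namedtuple('Point', 'x, y')
from_spaces = namedtuple('Point', 'x y')

print(from_list._fields)
print(from_list._fields == from_commas._fields == from_spaces._fields)

# output
# ('x', 'y')
# True
```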

Next up, we have three keyword-only parameters for rename, defaults, and module.

The keyword argument rename, if set to True, instructs the function to replace invalid field names with positional names of the form _[index] (for example, _1) instead of raising an exception. If you're generating a namedtuple dynamically, you might consider setting this to True. Otherwise, I'd leave the default of False so bugs have greater visibility.
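Here is a minimal sketch of rename in action. The column list is made up for illustration: 'class' is a Python keyword and 'id' appears twice, so both count as invalid field names.

```python
from collections import namedtuple

# Hypothetical column names: 'class' is a keyword and 'id' is repeated.
columns = ['id', 'class', 'id']

Row = namedtuple('Row', columns, rename=True)
print(Row._fields)

try:
    namedtuple('Row', columns)  # rename defaults to False
except ValueError as exc:
    print(type(exc).__name__)

# output
# ('id', '_1', '_2')
# ValueError
```

Note that the replacement names are based on position, so the second and third fields become _1 and _2.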

Then there's the keyword-only parameter defaults. The defaults keyword I'll touch on in the section "Providing Defaults." Given there's more than one way to approach default configurations with namedtuple, diving in here would be an incomplete analysis.

Last, the module parameter is useful for avoiding problems with pickling. If you define a namedtuple within a nested scope (like a function) and plan to pickle and unpickle it, you might need to set module manually; otherwise (using our example) Point.__module__ could be incorrect for pickling. In most cases the default will be appropriate and you can leave this alone.
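A small sketch of what the module argument does: it simply sets __module__ on the resulting class. The module name 'geometry' and the helper make_point_class are invented for this example; pickling would only work in practice if a module by that name were importable and exposed Point.

```python
from collections import namedtuple


def make_point_class():
    # 'geometry' is a made-up module name used only for illustration.
    return namedtuple('Point', ['x', 'y'], module='geometry')


Point = make_point_class()
print(Point.__module__)

# output
# geometry
```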

Now let's look at why to use namedtuple.

Benefits of namedtuple

Python's namedtuple possesses all of the benefits of the built-in sequence type tuple. It's a lightweight data structure containing an immutable sequence of often heterogeneous data. And that's to be expected, since every class the namedtuple factory produces is a subclass of tuple.
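To make "all of the benefits of tuple" concrete, here is a short sketch using the Point example. It exercises unpacking, indexing, equality with a plain tuple, and hashing, all of which are inherited rather than added by namedtuple.

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
p = Point(10, 20)

x, y = p                          # unpacking
print(x, y)
print(p[0], len(p))               # indexing and len()
print(p == (10, 20))              # compares equal to a plain tuple
print({p: 'visited'}[(10, 20)])   # hashable, usable as a dict key

# output
# 10 20
# 10 2
# True
# visited
```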

Inheritance

If we look to the source, we can see how the inheritance occurs.

# LOC 508 in the namedtuple factory function, see permalink below

result = type(typename, (tuple,), class_namespace)

(source code)

On this line the built-in function type is used to create a new class named typename (the first argument to namedtuple), which inherits from the bases (tuple,) and is populated from the namespace class_namespace.

That may look a little complex if you're not used to working with the type function, so let's show this in action with a simple example. I'm going to make a simple subclass of tuple called "MyTuple" with a single method in the namespace called "first" which will return the first item in the sequence, if one exists.

def first(self):
    # simplest
    # return self[0] if self else None
    # -or- use tuple methods for demonstration
    if self.__len__() >= 1:
        # item in first index
        return self.__getitem__(0)
    return None

namespace = {
    'first': first,
    }

MyTuple = type('MyTuple', (tuple,), namespace)

mt = MyTuple((1,2,3))

print(type(mt))
print(issubclass(mt.__class__, tuple))
print(mt.first())

# output
<class '__main__.MyTuple'>
True
1

From the output you can see that we indeed have an instance that inherits from tuple and has access to all its methods. In our case, we simply extended the base class tuple a bit to add functionality. This is what namedtuple accomplishes.

Extension

What does namedtuple add to tuple? While it provides several useful methods like _make, _asdict, _replace (which we'll explore later), the most valuable extension is named attribute access—the ability to use dot notation instead of numeric indices.

For example:

from collections import namedtuple


Point = namedtuple('Point', ['x', 'y'])


p = Point(10, 20)
print(p.x)
print(p.y)

# output
10
20

Whereas, without this ability, we'd be restricted to accessing tuple elements by index only.

p = (10, 20)

print(p[0])
print(p[1])

# output
10
20

While valid, this approach lacks semantic meaning. You must remember what each index represents, which is error-prone compared to namedtuple's self-documenting fields.

Accessing values through named attributes may still not seem too special; it's how Python objects normally work, after all:

obj = type('MyObject', (object,), {})()  # create the class, then instantiate it
obj.x = 10

print(obj.x)

# output
10

But namedtuple fields have special constraints: they're immutable yet dynamically created based on the field names provided.

from collections import namedtuple


Point = namedtuple('Point', ['x', 'y'])
p = Point(10, 20)

p.x = 30

# output
AttributeError: can't set attribute

This controlled behavior, allowing reads but preventing writes, indicates that something more sophisticated is happening behind the scenes. These aren't simple attributes; they're descriptors. As a reminder, descriptors are:

Any object which defines the methods __get__(), __set__(), or __delete__().

(docs)

Why must these be descriptors? In Python, regular attributes are typically mutable: you can reassign them freely. But namedtuple fields must remain read-only to preserve tuple semantics. A descriptor makes that possible because its __set__ method can refuse the assignment. A property defined without a setter behaves this way out of the box, raising AttributeError on assignment.
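We can poke at the field objects to confirm this. The sketch below looks up the raw class attribute for x and checks it for the descriptor methods; it also shows that instances carry no __dict__ of their own, so there is nowhere to stash a reassigned value. The exact type of the field object is an implementation detail, so I only test for the protocol methods.

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
p = Point(10, 20)

field = Point.__dict__['x']   # the raw class attribute, not the value 10
print(hasattr(type(field), '__get__'))
print(hasattr(type(field), '__set__'))

# Instances have no per-instance __dict__ because __slots__ is empty.
print(Point.__slots__)
print(hasattr(p, '__dict__'))

# output
# True
# True
# ()
# False
```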

Let's visualize how this would be if we defined our own logic with static attributes so we can see descriptors in action.

class Point:
    
    def __init__(self, x, y):
        self._x = x
        self._y = y
        
    @property
    def x(self):
        return self._x
    
    @x.setter
    def x(self, _):
        raise AttributeError("Unable to set attr 'x'")
                             
    @property
    def y(self):
        return self._y

    @y.setter
    def y(self, _):
        raise AttributeError("Unable to set attr 'y'")

p = Point(10, 20)
print(p.x)
p.x = 30

# output
10
AttributeError: Unable to set attr 'x'

We can further refine this example to get closer to the namedtuple implementation.

class Point(tuple):
    
    def __new__(cls, x, y):
        return tuple.__new__(cls, (x, y))
        
    @property
    def x(self):
        return self[0]
    
    @x.setter
    def x(self, _):
        raise AttributeError("Unable to set attr 'x'")
                             
    @property
    def y(self):
        return self[1]

    @y.setter
    def y(self, _):
        raise AttributeError("Unable to set attr 'y'")


print(issubclass(Point, tuple))
p = Point(10, 20)
print(p.x)
p.x = 30

# output
True
10
AttributeError: Unable to set attr 'x'

Indeed, the standard library's pure-Python code path takes the same property approach detailed above. Here's how each field is dynamically added to the namespace:

# LOC 504 in the namedtuple factory function, see permalink below

for index, name in enumerate(field_names):
    doc = _sys.intern(f'Alias for field number {index}')
    class_namespace[name] = _tuplegetter(index, doc)

(source code)

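Then if we inspect the pure-Python definition of _tuplegetter (CPython first tries to import a C implementation from _collections and falls back to this lambda only if that import fails), we see: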

# LOC 353 just prior to the namedtuple factory function in the collections modules
# see permalink below

_tuplegetter = lambda index, doc: property(_itemgetter(index), doc=doc)

(source code)

While my example is neither dynamic nor equipped with the methods of namedtuple, it gives us a tuple subclass with read-only properties x and y implemented with the built-in property function. This should provide a basic understanding of what's happening in simple terms.
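To close the gap between my static class and the real thing, here is a rough sketch of a dynamic factory. The name tiny_namedtuple is hypothetical and this is a deliberately stripped-down illustration, not the standard library's implementation: it skips name validation, keyword arguments, defaults, a readable repr, and the helper methods.

```python
from operator import itemgetter


def tiny_namedtuple(typename, field_names):
    """Hypothetical, simplified stand-in for collections.namedtuple."""
    field_names = tuple(field_names)

    def __new__(cls, *args):
        if len(args) != len(field_names):
            raise TypeError(
                f'Expected {len(field_names)} arguments, got {len(args)}')
        return tuple.__new__(cls, args)

    namespace = {
        '__slots__': (),       # no per-instance __dict__
        '__new__': __new__,
        '_fields': field_names,
    }
    # One read-only property per field, each reading a fixed index.
    for index, name in enumerate(field_names):
        namespace[name] = property(
            itemgetter(index), doc=f'Alias for field number {index}')
    return type(typename, (tuple,), namespace)


Point = tiny_namedtuple('Point', ['x', 'y'])
p = Point(10, 20)
print(p.x, p.y)
print(isinstance(p, tuple))

try:
    p.x = 30
except AttributeError as exc:
    print(type(exc).__name__)

# output
# 10 20
# True
# AttributeError
```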

Providing Defaults

Often it's beneficial to provide defaults when working with namedtuples. Consider situations where you're using namedtuples to capture data retrieved from a database table. The namedtuple is an excellent choice for simple data structures. It's lightweight, immutable, and you have a handful of methods at your disposal. However, if any columns in the table are nullable, you have an issue: when you attempt to create an instance of your namedtuple class without a value for every field, a TypeError is raised for the missing arguments.

To accommodate this, we have three main approaches: the defaults parameter, updating the __new__.__defaults__ attribute directly on the class, or creating a prototype namedtuple class instance.

First, let's take a look at the defaults keyword-only parameter. This argument can either be an iterable of default values or None (the default value). As an example, I'm going to update my Cartesian Coordinate example to have a default position of (0, 0). This way, if x or y is missing, or no arguments are provided at all, an exception won't be raised.

from collections import namedtuple


Point = namedtuple('Point', ['x', 'y'], defaults=(0, 0))


p = Point()
print(p.x)
print(p.y)

# output
0
0

Simple enough. Even without any arguments supplied, a Point instance was still created with coordinates at the origin and no exceptions were raised.

Note, when you supply defaults, you need to be particularly careful in understanding what gets applied to where. Default values are applied to the rightmost parameters. This is consistent with Python's default behavior (no pun intended) in handling defaults.

Default values are applied to the rightmost parameters first.

For example:

Point = namedtuple('Point', ['x', 'y', 'z'], defaults=(0, 0))

In this configuration, I have three field names and two defaults. Which fields get the defaults? Only y and z receive defaults. The field x will still be required and an exception will be raised if a value isn't supplied.
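A short sketch of that three-field configuration shows the rightmost rule in practice: x must always be supplied, while y and z fall back to 0.

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y', 'z'], defaults=(0, 0))

print(Point(1))
print(Point(1, 2))

try:
    Point()  # x has no default, so this fails
except TypeError as exc:
    print(type(exc).__name__)

# output
# Point(x=1, y=0, z=0)
# Point(x=1, y=2, z=0)
# TypeError
```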

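The next way we can supply default values is by modifying the __new__.__defaults__ attribute directly. This is something of a shortcut to the previous example, since assigning __new__.__defaults__ is ultimately what the factory function does with the defaults argument. You're just bypassing the factory function's handling of defaults, which also means the _field_defaults mapping (covered at the end of this section) is not updated to match.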

from collections import namedtuple


Point = namedtuple('Point', ['x', 'y'])
Point.__new__.__defaults__ = (0, 0)


p = Point()
print(p.x)
print(p.y)

# output
0
0

In fact, if we supply defaults both ways at the same time, you'll see that the latter method overrides the former.

from collections import namedtuple


Point = namedtuple('Point', ['x', 'y'], defaults=(0, 0))
Point.__new__.__defaults__ = (1, 1)


p = Point()
print(p.x)
print(p.y)

# output
1
1
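One caveat worth seeing for yourself: assigning __new__.__defaults__ changes what the constructor does, but the _field_defaults mapping (introduced at the end of this section) is built once by the factory and is left as it was. A minimal sketch:

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'], defaults=(0, 0))
Point.__new__.__defaults__ = (1, 1)

print(Point())                 # the constructor uses the new defaults
print(Point._field_defaults)   # the mapping still reports the old ones

# output
# Point(x=1, y=1)
# {'x': 0, 'y': 0}
```

This mismatch is one more reason to prefer the defaults parameter when you can.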

The last way to create a namedtuple with defaults is to combine a prototype namedtuple instance with the _replace method. We know that tuples are immutable, so _replace can't actually be replacing elements. And of course, it isn't. Instead, the _replace method returns a new namedtuple instance in which the supplied values replace those of the instance the method was called on. The new values are passed as keyword arguments named after the fields; a dictionary mapping field names to values can be unpacked with ** to the same effect.

from collections import namedtuple


Point = namedtuple('Point', ['x', 'y'])
prototype_point = Point(0, 0)

p = prototype_point._replace(**{'x': 1, 'y': 1})
print(p.x)
print(p.y)

# output
1
1

I should say this is my least favorite of the options available. It's less semantic than the earlier options. I prefer using either the defaults parameter or updating the __new__.__defaults__ attribute directly. And if we're being honest the former of the two. Python's developers put that parameter there for a reason.
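For completeness, here is a sketch of _replace with plain keyword arguments, which I find easier to read than unpacking a dictionary. It also shows that the prototype is untouched and that unknown field names are rejected. The exception type for an unknown field has differed between Python versions, so the example catches both candidates rather than assuming one.

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
origin = Point(0, 0)

moved = origin._replace(x=5)   # only x changes; y is carried over
print(moved)
print(origin)                  # the prototype itself is untouched

try:
    origin._replace(z=1)       # z is not a field of Point
except (TypeError, ValueError):
    print('rejected')

# output
# Point(x=5, y=0)
# Point(x=0, y=0)
# rejected
```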

Last, if you'd like to retrieve what the default values are from the class or instance, you can do so by accessing the _field_defaults attribute.

from collections import namedtuple


Point = namedtuple('Point', ['x', 'y'], defaults=(0, 0))

p = Point()

print(Point._field_defaults)
print(p._field_defaults)

# output
{'x': 0, 'y': 0}
{'x': 0, 'y': 0}

Other Useful Stuff

A couple of other methods and attributes I haven't discussed yet are _make, _asdict, and _fields.

The first method, _make, is useful for converting an existing tuple into a namedtuple when its elements already line up with the fields. For example, say you have a data API that returns a tuple that happens to align with our coordinate example. We can easily convert the simpler tuple into a namedtuple with _make.

The method expects a single argument containing an iterable.

from collections import namedtuple


Point = namedtuple('Point', ['x', 'y'], defaults=(0, 0))

data = (10, 20)

p = Point._make(data)

print(p)

# output
Point(x=10, y=20)
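Two things worth knowing about _make, sketched below with made-up rows standing in for API results: it maps nicely over a batch of tuples, and it does not consult defaults, so an iterable that is too short is rejected even when defaults are configured.

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'], defaults=(0, 0))

rows = [(1, 2), (3, 4)]          # stand-in for rows from a data API
points = [Point._make(row) for row in rows]
print(points)

try:
    Point._make((10,))           # too short: defaults are not consulted
except TypeError as exc:
    print(type(exc).__name__)

# output
# [Point(x=1, y=2), Point(x=3, y=4)]
# TypeError
```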

Next, we have the _asdict method. This is one of the more often used methods you'll see in production. As the name implies, this method simply maps field names with their corresponding values and constructs a new dictionary.

from collections import namedtuple


Point = namedtuple('Point', ['x', 'y'])


p = Point(10, 20)

print(p._asdict())

# output
{'x': 10, 'y': 20}

Note: prior to Python 3.8, _asdict returned an OrderedDict. Since regular dictionaries have guaranteed insertion order from Python 3.7 onward, the return type was changed to a plain dict.
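A common reason to reach for _asdict is serialization. As a small sketch with the standard json module: a namedtuple passed directly is encoded like any other tuple, as an array, while the dictionary form keeps the field names.

```python
import json
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
p = Point(10, 20)

print(json.dumps(p))            # treated as a plain sequence
print(json.dumps(p._asdict()))  # field names preserved

# output
# [10, 20]
# {"x": 10, "y": 20}
```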

And last, we have the _fields attribute. This attribute becomes particularly useful when you want to extend an already structured namedtuple class and you don't want to redefine it from scratch.

from collections import namedtuple


Point = namedtuple('Point', ['x', 'y'])
Point3D = namedtuple('Point3D', Point._fields + ('z',))

p = Point3D(10, 20, 30)

print(p)

# output
Point3D(x=10, y=20, z=30)

This might seem unnecessary for such a simple example, but imagine a namedtuple class with many fields that would be cumbersome to redefine in addition to a DRY violation. This approach allows you to gracefully extend what's already been established.
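One nuance of extending through _fields, sketched below: the new class is a sibling of the original, not a subclass of it, so isinstance checks against Point won't match. The default for z is my own addition to show that the extended class can take its own defaults.

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
Point3D = namedtuple('Point3D', Point._fields + ('z',), defaults=(0,))

print(Point3D(10, 20))
print(issubclass(Point3D, Point))   # a sibling class, not a subclass
print(Point3D(*Point(1, 2), z=3))   # lift an existing Point into 3D

# output
# Point3D(x=10, y=20, z=0)
# False
# Point3D(x=1, y=2, z=3)
```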

One last thing to note, while _make is a classmethod and _fields is a class attribute (both accessible without an instance), _asdict and _replace (discussed in the section earlier) are instance methods.

Final Thoughts

While namedtuple is a data structure you see often, it's easy to take for granted what's happening behind the curtains. This discussion peeled back many of the layers to expose some of the internals of namedtuple as well as provide examples for common use cases.

I enjoy working with namedtuples. It's one of my favorite tools in the Python standard library toolkit. I hope this article has left you with a greater appreciation for this powerful feature and that you're better equipped to utilize it in your own work.
