Background
Python's namedtuple is a useful tool that extends the built-in type tuple with features that make working with simple data structures easy. While I encounter and implement namedtuple often, it's been a while since I've taken a peek under the hood. This discussion serves as a refresher (mostly for me), and hopefully you'll learn something new.
Examples presented assume Python 3.12.
What is namedtuple Anyway?
It might be useful to clarify exactly what namedtuple is. Perhaps our experience with namedtuple is purely practical and we haven't considered exactly what it is we're importing from collections. Let's consider the following example, which we'll use throughout the discussion.
from collections import namedtuple
Point = namedtuple('Point', ['x', 'y'])
Here I'm using namedtuple to create a data structure that represents a Cartesian coordinate with fields x and y, and labeling the resulting class "Point."
So what's happening in this code snippet? I know the built-in type tuple is a class and my resulting Point appears to be one too.
import inspect
from collections import namedtuple
Point = namedtuple('Point', ['x', 'y'])
print(inspect.isclass(tuple))
print(inspect.isclass(Point))
# output
True
True
So then namedtuple is a subclass of tuple? Actually, no.
import inspect
from collections import namedtuple
print(inspect.isclass(namedtuple))
# output
False
When you take a closer look at the naming convention, the lowercase name hints that namedtuple is likely a function rather than a class, per PEP 8 (although exceptions to the convention exist: think str, int, and tuple itself). More specifically, we have a class factory. This factory function pattern is how namedtuple can start with the tuple class, extend it, and return a new custom class.
However, when you hear namedtuple used conversationally, the intended meaning is likely the resulting class that the factory function creates. Just be aware of the distinction.
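To make the distinction concrete, here is a small check using only the standard library. It is a sketch rather than anything exhaustive: it shows that namedtuple itself is an ordinary function, while the object it returns is a class whose direct base is tuple.

```python
import inspect
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])

# The factory itself is an ordinary function, not a class.
print(inspect.isfunction(namedtuple))  # True

# What it returns is a brand-new class whose direct base is tuple.
print(issubclass(Point, tuple))  # True
print(Point.__bases__)           # (<class 'tuple'>,)
```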
Now let's see how to use namedtuple.
How to Use namedtuple
Let's examine the factory function's signature to understand how to use it.
import inspect
from collections import namedtuple
inspect.signature(namedtuple)
# output
<Signature (typename, field_names, *, rename=False, defaults=None, module=None)>
Above we see two required positional parameters, typename and field_names.
The first, typename, is what we want to call the resulting class. The typename should match the variable name you're assigning to: Point = namedtuple('Point', ...).
Ultimately, this value becomes the resulting class's __name__ attribute.
from collections import namedtuple
Point = namedtuple('Point', ['x', 'y'])
print(Point.__name__)
# output
Point
The second required positional argument is field_names. In my example, I provided a list of strings representing the field names I desired: 'x' and 'y'. For convenience, namedtuple gives us a bit of flexibility in how we pass the field_names argument. In addition to a sequence, you can pass a single comma- or space-separated string. As the documentation puts it:
The field_names are a sequence of strings such as ['x', 'y']. Alternatively, field_names can be a single string with each fieldname separated by whitespace and/or commas, for example 'x y' or 'x, y'.
We can see this implementation in the source.
# LOC 387 in the namedtuple factory function, see permalink above as "source"
if isinstance(field_names, str):
    field_names = field_names.replace(',', ' ').split()
field_names = list(map(str, field_names))
Using these options, I could have written my example as Point = namedtuple('Point', 'x, y') or even Point = namedtuple('Point', 'x y') with only spaces. Pretty cool.
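As a quick sanity check, the three spellings can be compared side by side. This is a minimal sketch; the variable names as_list, as_commas, and as_spaces are only for illustration.

```python
from collections import namedtuple

as_list = namedtuple('Point', ['x', 'y'])
as_commas = namedtuple('Point', 'x, y')
as_spaces = namedtuple('Point', 'x y')

# Each spelling is normalized to the same tuple of field names.
print(as_list._fields)    # ('x', 'y')
print(as_commas._fields)  # ('x', 'y')
print(as_spaces._fields)  # ('x', 'y')
```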
Next up, we have three keyword-only parameters: rename, defaults, and module.
When set to True, rename tells the function to replace invalid field names (such as keywords, duplicates, and names that aren't valid identifiers) with positional names of the form _[index], so an exception isn't raised. If you're generating a namedtuple dynamically, you might consider setting this to True. Otherwise, I'd leave the default of False so bugs have greater visibility.
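Here is a small sketch of rename in action. The column names are made up for illustration: 'class' is a Python keyword, the second 'id' is a duplicate, and '1st' is not a valid identifier.

```python
from collections import namedtuple

# Hypothetical column names containing three kinds of invalid field name.
columns = ['id', 'class', 'id', '1st']

Row = namedtuple('Row', columns, rename=True)
print(Row._fields)  # ('id', '_1', '_2', '_3')

# With the default rename=False, the same input is rejected.
try:
    namedtuple('Row', columns)
except ValueError as exc:
    print(f'ValueError: {exc}')
```

Note that each replacement name is based on the field's position, so the valid 'id' at index 0 is left alone while the offenders become _1, _2, and _3.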
Then there's the keyword-only parameter defaults, which I'll touch on in the section "Providing Defaults." Given there's more than one way to approach default configurations with namedtuple, diving in here would be an incomplete analysis.
And last, the module parameter is useful for avoiding problems related to pickling. If you define a namedtuple within a nested scope (like a function) and plan to pickle and unpickle it, you might need to set module manually, or (using our example) Point.__module__ could be incorrect for pickling. But in most cases the default will be appropriate and you can leave this alone.
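The effect of module is easy to observe. In this sketch, 'myapp.geometry' is a made-up module path used purely for illustration. For pickling to actually work, the named module would need to exist and expose Point, since pickle locates a class by its module and qualified name.

```python
from collections import namedtuple

# 'myapp.geometry' is a hypothetical module path, not a real package.
Point = namedtuple('Point', ['x', 'y'], module='myapp.geometry')
print(Point.__module__)  # myapp.geometry
```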
Now let's look at why to use namedtuple.
Benefits of namedtuple
A class created with namedtuple possesses all of the benefits of the built-in sequence type tuple. It's a lightweight data structure containing an immutable sequence of often heterogeneous data. And that's to be expected, since the class that namedtuple returns is a subclass of tuple.
Inheritance
If we look to the source, we can see how the inheritance occurs.
# LOC 508 in the namedtuple factory function, see permalink below
result = type(typename, (tuple,), class_namespace)
On this line the built-in function type is being used to create a new class named after typename (the first argument of namedtuple), which inherits from the bases (tuple,) and is populated from the namespace class_namespace.
That may look a little complex if you're not used to working with the type function, so let's show this in action with a simple example. I'm going to make a simple subclass of tuple called "MyTuple" with a single method in the namespace called "first", which returns the first item in the sequence if one exists.
def first(self):
    # simplest
    # return self[0] if self else None
    # -or- use tuple methods for demonstration
    if self.__len__() >= 1:
        # item in first index
        return self.__getitem__(0)
    return None
namespace = {
    'first': first,
}
MyTuple = type('MyTuple', (tuple,), namespace)
mt = MyTuple((1,2,3))
print(type(mt))
print(issubclass(mt.__class__, tuple))
print(mt.first())
# output
<class '__main__.MyTuple'>
True
1
From the output you can see that we indeed have an instance that inherits from tuple and has access to all its methods. In our case, we simply extended the base class tuple a bit to add functionality. This is what namedtuple accomplishes.
Extension
What does namedtuple add to tuple? While it provides several useful methods like _make, _asdict, and _replace (which we'll explore later), the most valuable extension is named attribute access: the ability to use dot notation instead of numeric indices.
For example:
from collections import namedtuple
Point = namedtuple('Point', ['x', 'y'])
p = Point(10, 20)
print(p.x)
print(p.y)
# output
10
20
Without this ability, we'd be restricted to accessing tuple elements by index only.
p = (10, 20)
print(p[0])
print(p[1])
# output
10
20
While valid, this approach lacks semantic meaning. You must remember what each index represents, which is error-prone compared to namedtuple's self-documenting fields.
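It's worth noting that named access is an addition, not a replacement. The following sketch shows that a Point instance still behaves like a plain tuple for indexing, unpacking, comparison, and length.

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
p = Point(10, 20)

print(p[0] == p.x)    # True: index and name reach the same element
x, y = p              # unpacking works as with any tuple
print(x, y)           # 10 20
print(p == (10, 20))  # True: compares equal to a plain tuple
print(len(p))         # 2
```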
Even with the benefit of named attributes in mind, this may still not seem too special. It's how Python objects normally work, after all:
# create a bare class on the fly, then instantiate it
obj = type('MyObject', (object,), {})()
obj.x = 10
print(obj.x)
# output
10
But namedtuple fields have special constraints: they're read-only, yet created dynamically from the field names provided.
from collections import namedtuple
Point = namedtuple('Point', ['x', 'y'])
p = Point(10, 20)
p.x = 30
# output
AttributeError: can't set attribute
This controlled behavior, allowing reads but preventing writes, is an indication that something more sophisticated is happening behind the scenes. These aren't just simple attributes, they're descriptors. As a reminder, descriptors are:
Any object which defines the methods __get__(), __set__(), or __delete__().
Why descriptors? In Python, regular attributes are typically mutable: you can reassign them freely. But namedtuple fields must remain read-only to preserve tuple semantics. One way to get this behavior is a descriptor whose __set__ method rejects assignment, which is what the built-in property does when no setter is defined.
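We can poke at one of these descriptors directly. The sketch below assumes CPython 3.12, as the rest of the article does. The name descriptor is just a local variable for illustration.

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
p = Point(10, 20)

# Looking the field up on the class returns the descriptor object itself.
descriptor = Point.x
print(hasattr(type(descriptor), '__get__'))  # True
print(hasattr(type(descriptor), '__set__'))  # True

# Attribute access on an instance is routed through __get__.
print(descriptor.__get__(p, Point))  # 10
print(descriptor.__doc__)            # Alias for field number 0
```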
Let's visualize this by writing the class by hand with hard-coded attributes, so we can see descriptors in action.
class Point:
    def __init__(self, x, y):
        self._x = x
        self._y = y
    @property
    def x(self):
        return self._x
    @x.setter
    def x(self, _):
        raise AttributeError("Unable to set attr 'x'")
    @property
    def y(self):
        return self._y
    @y.setter
    def y(self, _):
        raise AttributeError("Unable to set attr 'y'")
p = Point(10, 20)
print(p.x)
p.x = 30
# output
10
AttributeError: Unable to set attr 'x'
We can further refine this example to get closer to the namedtuple implementation.
class Point(tuple):
    def __new__(cls, x, y):
        return tuple.__new__(cls, (x, y))
    @property
    def x(self):
        return self[0]
    @x.setter
    def x(self, _):
        raise AttributeError("Unable to set attr 'x'")
    @property
    def y(self):
        return self[1]
    @y.setter
    def y(self, _):
        raise AttributeError("Unable to set attr 'y'")
print(issubclass(Point, tuple))
p = Point(10, 20)
print(p.x)
p.x = 30
# output
True
10
AttributeError: Unable to set attr 'x'
Indeed, the Python implementation uses the same property approach detailed above. Here's how each field is dynamically added to the namespace:
# LOC 504 in the namedtuple factory function, see permalink below
for index, name in enumerate(field_names):
    doc = _sys.intern(f'Alias for field number {index}')
    class_namespace[name] = _tuplegetter(index, doc)
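Then if we inspect _tuplegetter, we see the following pure-Python definition (a fallback: CPython normally imports a C-accelerated equivalent from the _collections module):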
# LOC 353 just prior to the namedtuple factory function in the collections modules
# see permalink below
_tuplegetter = lambda index, doc: property(_itemgetter(index), doc=doc)
While my example isn't dynamic and lacks the methods of namedtuple, we have a tuple instance with read-only properties x and y implemented with the property built-in function. This should provide a basic understanding of what's generally happening in simple terms.
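To connect the pieces, here is a minimal sketch of a dynamic version that combines the type call with one property per field. The helper make_tuple_class is hypothetical and heavily simplified: it does no name validation and provides none of the helper methods, so it is not how the standard library is written, only an illustration of the idea.

```python
from operator import itemgetter


def make_tuple_class(typename, field_names):
    """Hypothetical, stripped-down stand-in for namedtuple (illustration only)."""
    field_names = tuple(field_names)

    def __new__(cls, *values):
        if len(values) != len(field_names):
            raise TypeError(
                f'Expected {len(field_names)} arguments, got {len(values)}'
            )
        return tuple.__new__(cls, values)

    namespace = {
        '__slots__': (),  # no per-instance __dict__
        '__new__': __new__,
        '_fields': field_names,
    }
    # One read-only property per field, each delegating to an index lookup.
    for index, name in enumerate(field_names):
        namespace[name] = property(
            itemgetter(index), doc=f'Alias for field number {index}'
        )
    return type(typename, (tuple,), namespace)


Point = make_tuple_class('Point', ['x', 'y'])
p = Point(10, 20)
print(p.x, p.y)  # 10 20

try:
    p.x = 30
except AttributeError as exc:
    print(f'AttributeError: {exc}')
```

Because no setter is defined on the properties, assignment fails without any extra code, which is the same trick the one-line fallback above relies on.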
Providing Defaults
Often it's beneficial to provide defaults when working with namedtuples. Consider situations where you're using namedtuples to capture data retrieved from a database table. The namedtuple is an excellent choice for simple data structures. It's lightweight, immutable, and you have a handful of methods at your disposal. However, if any columns in the table are nullable and their values are missing from the data, you have an issue. When you attempt to create an instance of your namedtuple class, a TypeError will be raised for the missing arguments.
To accommodate this, we have three main approaches: the defaults parameter, updating the __new__.__defaults__ attribute directly on the class, or creating a prototype instance of the namedtuple class.
First, let's take a look at the defaults keyword-only parameter. This argument can either be an iterable of default values or None (the default). As an example, I'm going to update my Cartesian coordinate example to have a default position of (0, 0). This way, if x or y is omitted, or no arguments are provided at all, an exception won't be raised.
from collections import namedtuple
Point = namedtuple('Point', ['x', 'y'], defaults=(0, 0))
p = Point()
print(p.x)
print(p.y)
# output
0
0
Simple enough. Even without any arguments supplied, a Point instance was still created with coordinates at the origin and no exceptions were raised.
Note that when you supply defaults, you need to be careful about which fields they apply to. When there are fewer defaults than fields, they are assigned to the rightmost fields. This is consistent with how Python handles default arguments in general (no pun intended).
Default values are applied to the rightmost parameters first.
For example:
Point = namedtuple('Point', ['x', 'y', 'z'], defaults=(0, 0))
In this configuration, I have three field names and two defaults. Which fields get the defaults? Only y and z receive defaults. The field x will still be required and an exception will be raised if a value isn't supplied.
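Running the three-field configuration confirms this. The sketch also peeks at the _field_defaults attribute, which maps each defaulted field to its value.

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y', 'z'], defaults=(0, 0))

print(Point(1))               # Point(x=1, y=0, z=0)
print(Point._field_defaults)  # {'y': 0, 'z': 0}

# x has no default, so omitting it is an error.
try:
    Point()
except TypeError as exc:
    print(f'TypeError: {exc}')
```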
The next way we can supply default values is by modifying the __new__.__defaults__ attribute directly. This is sort of a shortcut to the previous example, since ultimately this is what the factory function does anyway. You're just bypassing its handling of the defaults parameter.
from collections import namedtuple
Point = namedtuple('Point', ['x', 'y',])
Point.__new__.__defaults__ = (0, 0)
p = Point()
print(p.x)
print(p.y)
# output
0
0
In fact, if we supply defaults both ways at the same time, you'll see that the latter method overrides the former.
from collections import namedtuple
Point = namedtuple('Point', ['x', 'y'], defaults=(0, 0))
Point.__new__.__defaults__ = (1, 1)
p = Point()
print(p.x)
print(p.y)
# output
1
1
The last way to create a namedtuple with defaults is to combine a prototype namedtuple instance with the _replace method. We know that tuples are immutable, so _replace can't actually be replacing elements. And of course, it isn't. Instead, the _replace method returns a new namedtuple instance in which the supplied values replace those of the instance the method was called on. The new values are passed as keyword arguments, which here come from unpacking a dictionary that maps field names to values.
from collections import namedtuple
Point = namedtuple('Point', ['x', 'y'])
prototype_point = Point(0, 0)
p = prototype_point._replace(**{'x': 1, 'y': 1})
print(p.x)
print(p.y)
# output
1
1
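A short sketch makes the "new instance" point explicit, this time passing a plain keyword argument instead of an unpacked dictionary. The names origin and moved are only illustrative.

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
origin = Point(0, 0)

# Keyword arguments name the fields to change; the rest are carried over.
moved = origin._replace(x=5)

print(moved)            # Point(x=5, y=0)
print(origin)           # Point(x=0, y=0), the prototype is untouched
print(moved is origin)  # False
```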
I should say this is my least favorite of the options available. It's less semantic than the earlier options. I prefer using either the defaults parameter or updating the __new__.__defaults__ attribute directly. And if we're being honest, the former of the two. Python's developers put that parameter there for a reason.
Last, if you'd like to retrieve the default values from the class or an instance, you can do so by accessing the _field_defaults attribute.
from collections import namedtuple
Point = namedtuple('Point', ['x', 'y'], defaults=(0, 0))
p = Point()
print(Point._field_defaults)
print(p._field_defaults)
# output
{'x': 0, 'y': 0}
{'x': 0, 'y': 0}
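One caveat I'd suggest verifying on your own Python version: _field_defaults appears to be computed once by the factory function, so defaults assigned afterwards through __new__.__defaults__ take effect at construction time but are not reflected in it. A minimal sketch:

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
Point.__new__.__defaults__ = (0, 0)

print(Point())                # Point(x=0, y=0)
print(Point._field_defaults)  # {}
```

This is one more reason to favor the defaults parameter when other code relies on introspecting the class.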
Other Useful Stuff
A few other methods and attributes I haven't discussed yet are _make, _asdict, and _fields.
The first, _make, is useful for converting an existing tuple whose elements already line up with the fields into a namedtuple. For example, say you have a data API that returns a tuple that happens to align with our coordinate example. We can easily convert the plain tuple into a namedtuple with _make.
The method expects a single iterable argument.
from collections import namedtuple
Point = namedtuple('Point', ['x', 'y'], defaults=(0, 0))
data = (10, 20)
p = Point._make(data)
print(p)
# output
Point(x=10, y=20)
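Two details are worth a quick sketch: _make accepts any iterable, not just a tuple, and it checks the length against the number of fields without consulting defaults, so a short iterable is rejected even when defaults are configured.

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'], defaults=(0, 0))

# Any iterable works, not just a tuple.
print(Point._make(iter([10, 20])))  # Point(x=10, y=20)

# Defaults are not consulted: a short iterable is rejected.
try:
    Point._make([10])
except TypeError as exc:
    print(f'TypeError: {exc}')
```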
Next, we have the _asdict method. This is one of the methods you'll see most often in production. As the name implies, it builds a new dictionary that maps field names to their corresponding values.
from collections import namedtuple
Point = namedtuple('Point', ['x', 'y'])
p = Point(10, 20)
print(p._asdict())
# output
{'x': 10, 'y': 20}
Note: prior to Python 3.8, the output was an OrderedDict. Because the ordering of keys in regular dictionaries has been guaranteed since Python 3.7, OrderedDict was replaced by dict.
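A common reason to reach for _asdict is serialization. The sketch below uses the standard json module as one example: serializing the instance directly treats it as a plain tuple and drops the field names, while going through _asdict keeps them.

```python
import json
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
p = Point(10, 20)

as_dict = p._asdict()
print(type(as_dict) is dict)  # True on Python 3.8+
print(json.dumps(as_dict))    # {"x": 10, "y": 20}

# Serializing the instance directly treats it as a plain tuple.
print(json.dumps(p))          # [10, 20]
```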
And last, we have the _fields attribute. This attribute becomes particularly useful when you want to extend an already structured namedtuple class and you don't want to redefine it from scratch.
from collections import namedtuple
Point = namedtuple('Point', ['x', 'y'])
Point3D = namedtuple('Point3D', Point._fields + ('z',))
p = Point3D(10, 20, 30)
print(p)
# output
Point3D(x=10, y=20, z=30)
This might seem unnecessary for such a simple example, but imagine a namedtuple class with many fields: redefining them all would be cumbersome and a DRY violation. This approach allows you to gracefully extend what's already been established.
One last thing to note: while _make is a classmethod and _fields is a class attribute (both accessible without an instance), _asdict and _replace (discussed in the previous section) are instance methods.
Final Thoughts
While namedtuple is a data structure you see often, it's easy to take for granted what's happening behind the curtains. This discussion peeled back many of the layers to expose some of the internals of namedtuple and provided examples for common use cases.
I enjoy working with namedtuples. It's one of my favorite tools in the Python standard library. I hope this article has left you with a greater appreciation for this powerful feature and better equipped to use it in your own work.