Background
A closure can be understood as the combination of a function and the variables it references from an enclosing function's scope. We'll see what that means in a moment. Closures are a common pattern in functional programming that can provide some of the conveniences of object-oriented programming, often with less overhead, so you'll frequently encounter and use them. Let's take a deeper look at what closures are, as well as a few common implementations.
A Simple Closure
You might see a closure defined as a function which retains its lexical scope. While technically correct, that probably doesn't mean a whole lot when you're first learning about closures. Simply put, a closure is a function returned from another that references one or more free variables—variables from an enclosing function's scope. Let's see this with a simple example.
def outer():
    audience = 'world'
    def inner():
        return f'Hello {audience}!'
    return inner
This function outer is a higher-order function: a function that accepts a function, returns a function, or both. In our case, outer returns inner. Additionally, outer provides a variable audience which inner references. When outer is called and returns inner, the reference to audience is what establishes the closure. This can be demonstrated with a little code introspection.
greeting = outer()
print(greeting.__code__.co_freevars)
print(greeting.__closure__)
print(greeting())
# output
('audience',)
(<cell at 0x16425b7f0: str object at 0x10f445c20>,)
Hello world!
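To make the cell in that output a little less opaque, here is a minimal sketch that reads the captured value through the cell's cell_contents attribute. It repeats the outer function from above so it can run on its own, and the variable name cell is only for illustration.

```python
def outer():
    audience = 'world'
    def inner():
        return f'Hello {audience}!'
    return inner

greeting = outer()

# __closure__ holds one cell per name listed in __code__.co_freevars
cell = greeting.__closure__[0]
print(cell.cell_contents)  # world
```

The cell is how inner keeps access to audience after outer has already returned.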
If this pattern didn't result in a closure, __code__.co_freevars and __closure__ wouldn't be populated. If we move audience into inner, we'll see this is the case.
def outer():
    def inner():
        audience = 'world'
        return f'Hello {audience}!'
    return inner
greeting = outer()
print(greeting.__code__.co_freevars)
print(greeting.__closure__)
print(greeting())
# output
()
None
Hello world!
Now that we understand the basic pattern of closures, let's take a look at a few common uses that demonstrate their power and utility.
Common Uses
While there's no exhaustive list of closure patterns, some appear more often than others. These include:
- Constructor
- State Encapsulation
- Caching
Other patterns you'll see include decorators, partials, and callbacks. However, I consider these extensions of closures rather than simple implementations of them, so I'll save those for narrower discussions.
Constructor
The constructor or prototype closure is generally structured as an outer function that accepts arguments used to initialize the inner function that's returned.
We can restructure our earlier example to demonstrate this pattern.
def outer(audience: str = 'world'):
    def inner():
        return f'Hello {audience}!'
    return inner
greeting = outer()
print(greeting.__code__.co_freevars)
print(greeting.__closure__)
print(greeting())
# output
('audience',)
(<cell at 0x1642a32e0: str object at 0x10f445c20>,)
Hello world!
Although we didn't define the variable within the enclosing scope, the outer function's parameters are still locally scoped variables just as before. We can see from the output that a closure does indeed exist.
But this example isn't too practical as a constructor, considering a closure isn't necessary at all. The inner function could just as easily define the audience parameter and we'd be in the same spot without a closure.
Let's take a look at a situation next where the enclosing scope does provide value in constructing other functions.
def power(n):
    def inner(x):
        return x ** n
    return inner
squared = power(2)
cubed = power(3)
print(squared(2))
print(cubed(2))
# output
4
8
Here we have an outer function power that accepts an argument n used to set up the inner function's mathematical logic. While this implementation is quite simple, you can get an idea of how this pattern can be used to reduce or eliminate duplicate code.
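As one more illustration of the same constructor pattern, here is a small sketch outside of math. The make_tag factory and the names bold and italic are hypothetical, made up for this example; the point is only that one outer function can stamp out several similar functions without repeating the formatting logic.

```python
def make_tag(tag):
    # tag is captured by the returned function
    def inner(text):
        return f'<{tag}>{text}</{tag}>'
    return inner

bold = make_tag('b')
italic = make_tag('i')

print(bold('hi'))    # <b>hi</b>
print(italic('hi'))  # <i>hi</i>
```

Each returned function carries its own tag, just as squared and cubed each carry their own n.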
Before moving on, I want to highlight a couple of key points.
The outermost function is generally given a semantic name in practice rather than the generic "outer" I've been using up until this point. Choose something meaningful that communicates what the higher-order function is used for.
In contrast, the inner function is generally named something generic like "inner" because its name will never be accessed downstream. In the case of decorators or other wrapping functions, you'll often see the name "wrapper" used in place of "inner." To demonstrate the insignificance of this name, let's restructure the previous example to use an anonymous lambda function.
def power(n):
    return lambda x: x ** n
squared = power(2)
cubed = power(3)
print(squared(2))
print(cubed(2))
# output
4
8
Even without a name, the functionality is completely unaffected.
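A quick way to confirm this is to compare the two versions side by side. In this sketch, power_named and power_lambda are illustrative names for the two implementations shown above; only the __name__ attribute of the returned function differs.

```python
def power_named(n):
    def inner(x):
        return x ** n
    return inner

def power_lambda(n):
    return lambda x: x ** n

print(power_named(2).__name__)   # inner
print(power_lambda(2).__name__)  # <lambda>

# Both variants capture n and compute the same result
print(power_named(2)(3) == power_lambda(2)(3))  # True
```

The name mostly matters for debugging and tracebacks, where "inner" or "<lambda>" is what you will see.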
State Encapsulation
Next, we'll take a look at how we can maintain state between function calls. Consider a scenario where we want to count how many times a function has been called. We could certainly use a global variable. Note that to do so we need to let the compiler know count is a global by using the global statement, or we'll get an UnboundLocalError because count would be treated as a local that has no value yet.
count = 0
def counter():
    global count
    count += 1
    print(f"I've been called {count} time{'s' if count > 1 else ''}!")
counter()
counter()
counter()
# output
I've been called 1 time!
I've been called 2 times!
I've been called 3 times!
This works and is technically valid. However, it has many downsides.
First, it isn't reusable. We have a single counter and would have to define a separate function, with its own global variable, if we needed another.
count1 = 0
count2 = 0
def counter1():
    global count1
    count1 += 1
    print(f"I've been called {count1} time{'s' if count1 > 1 else ''}!")
def counter2():
    global count2
    count2 += 1
    print(f"I've been called {count2} time{'s' if count2 > 1 else ''}!")
Even if we did define a factory function, state would be shared between these two functions.
count = 0
def make_counter():
    def inner():
        global count
        count += 1
        print(f"I've been called {count} time{'s' if count > 1 else ''}!")
    return inner
counter1 = make_counter()
counter2 = make_counter()
counter1()
counter2()
That's probably not desirable since these are independent functions.
Lastly, we're polluting the global/module namespace.
print(globals())
# output
{...
'count': 2,
'make_counter': <function __main__.make_counter()>,
'counter1': <function __main__.make_counter.<locals>.inner()>,
'counter2': <function __main__.make_counter.<locals>.inner()>}
This isn't an exhaustive list of downsides, but for brevity we'll move on. You get a sense of the issues at play.
Instead, we can write a closure to contain our count variable, appropriately maintain state where expected, and prevent pollution in the broader namespace.
def make_counter():
    count = 0
    def inner():
        nonlocal count
        count += 1
        print(f"I've been called {count} time{'s' if count > 1 else ''}!")
    return inner
counter1 = make_counter()
counter2 = make_counter()
print('Counter 1:')
counter1()
counter1()
print('\nCounter 2:')
counter2()
counter2()
# output
Counter 1:
I've been called 1 time!
I've been called 2 times!
Counter 2:
I've been called 1 time!
I've been called 2 times!
State is maintained between function calls but remains independent between functions.
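To see where that independent state lives, the following sketch inspects each counter's closure cell. It uses a variant of make_counter that returns the count instead of printing it, which is an adjustment made here purely so the values are easy to check.

```python
def make_counter():
    count = 0
    def inner():
        nonlocal count
        count += 1
        return count  # returned rather than printed, for easy checking
    return inner

counter1 = make_counter()
counter2 = make_counter()

counter1()
counter1()
counter2()

# Each call to make_counter created a separate cell for count
print(counter1.__closure__[0].cell_contents)  # 2
print(counter2.__closure__[0].cell_contents)  # 1
```

Every call to make_counter creates a fresh count, so the two counters never interfere with each other.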
Note that instead of the global statement, here we need to use the nonlocal statement. What's different in this example compared to the earlier examples that used neither statement? The earlier examples in the "A Simple Closure" and "Constructor" sections only read the variables. The compiler allows reading outer-scope variables without issue. However, writes are an entirely different story. Without one of these statements, an assignment ("=") creates a new local variable instead of rebinding the outer one, and an augmented assignment such as count += 1 raises an UnboundLocalError because it reads that local before it has a value.
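The failure mode is easy to reproduce. Below is a minimal sketch of the counter with the nonlocal statement left out; make_broken_counter is a name invented for this example, and the exception raised is an UnboundLocalError.

```python
def make_broken_counter():
    count = 0
    def inner():
        # No nonlocal statement: the augmented assignment makes count
        # a local of inner, so it is read before it has a value.
        count += 1
        return count
    return inner

broken = make_broken_counter()

error_name = None
try:
    broken()
except UnboundLocalError as exc:
    error_name = type(exc).__name__

print(error_name)  # UnboundLocalError
```

Adding nonlocal count as the first line of inner is all it takes to make this version work.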
Caching
The last application of closures we'll take a look at is function result caching, also called "memoization." To demonstrate, consider the mathematical factorial function, which can be defined recursively as n! = n * (n - 1)!, with 1! = 1.
def fact(n):
    print(f'Calculating factorial for {n}.')
    if n == 1:
        return 1
    return n * fact(n - 1)
print(fact(2))
print(fact(3))
# output
Calculating factorial for 2.
Calculating factorial for 1.
2
Calculating factorial for 3.
Calculating factorial for 2.
Calculating factorial for 1.
6
Because the function is recursive, calculations are unnecessarily repeated between the factorial of 2 and the factorial of 3. If we were to cache, or "memoize," the outputs of the earlier calculation, we'd only need one more calculation for n = 3. Let's see how to do that.
def get_fact():
    cache = {1: 1}
    def inner(n):
        if n in cache:
            return cache[n]
        print(f'Calculating factorial for {n}.')
        cache[n] = n * inner(n - 1)
        return cache[n]
    return inner
fact = get_fact()
print(fact(2))
print(fact(3))
# output
Calculating factorial for 2.
2
Calculating factorial for 3.
6
You can see from the steps printed to the console that we've reduced the number of calculations from 5 down to 2 just by implementing a cache! Pretty cool! The remaining calls to inner, for n = 1 and n = 2, are answered straight from the cache, so we've traded a little memory for less repeated work. Now let's discuss what's happening.
We nest our function within a factory function named "get_fact." When called, it returns our modified fact function, now named "inner" for the reasons addressed earlier.
Next, the cache is structured as a simple dictionary where the key is the value of n and the value is the output of the inner function call for that n. The cache object is pre-populated with our exit condition of n=1 so we can remove this check from our recursive function, saving a calculation.
Then within inner, we first check if n exists in the cache. If so, we return that value instead of calculating it. If n is not present in the cache dictionary, we proceed with the calculation.
Finally, we return the inner function like with earlier demonstrations of closures.
Note that the nonlocal statement isn't necessary here. That's because no assignment to cache is occurring; we only mutate the dictionary it refers to. Remember that dictionary objects are mutable. If we were to check IDs, we'd see that we start and end with the exact same object. If we were to rebind the name with something like cache = {**cache, n: ...}, then we would need to use the nonlocal statement. However, that would copy the whole dictionary on every new calculation, so please don't do that.
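The ID check mentioned above can be sketched as follows. The print call is dropped from inner to keep the output short, and the cells lookup by name is a convenience chosen for this example, since inner has two free variables (cache and inner itself, because of the recursion).

```python
def get_fact():
    cache = {1: 1}
    def inner(n):
        if n in cache:
            return cache[n]
        cache[n] = n * inner(n - 1)
        return cache[n]
    return inner

fact = get_fact()

# Pair each free variable name with its closure cell
cells = dict(zip(fact.__code__.co_freevars, fact.__closure__))
id_before = id(cells['cache'].cell_contents)

result = fact(5)
cache_after = cells['cache'].cell_contents

print(result)                        # 120
print(id_before == id(cache_after))  # True
print(cache_after)                   # {1: 1, 2: 2, 3: 6, 4: 24, 5: 120}
```

The dictionary grew, but it is still the same object, which is why no nonlocal statement was needed.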
Memory Risks
This cache is structured without any upper limit, which is potentially dangerous. The result for every n is stored, so the dictionary keeps growing and can consume too much memory if n becomes very large. You should consider limiting how many items are cached. Even better, use the functools.lru_cache decorator, which is part of the standard library and has a default upper limit of 128 items at the time of this writing. There's also the functools.cache decorator (Python 3.9 and later), which has no upper limit.
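As a rough sketch of that suggestion, here is the factorial function wrapped with functools.lru_cache. The maxsize value is spelled out only to make the limit visible, and the base case is written as n <= 1, a small deviation from the earlier code so that fact(0) also terminates.

```python
from functools import lru_cache

@lru_cache(maxsize=128)  # 128 is also the default
def fact(n):
    if n <= 1:
        return 1
    return n * fact(n - 1)

first = fact(3)   # three misses: n = 3, 2, 1
second = fact(4)  # one miss (n = 4) and one hit (n = 3)

info = fact.cache_info()
print(first, second)  # 6 24
print(info)           # CacheInfo(hits=1, misses=4, maxsize=128, currsize=4)
```

Once more than maxsize distinct arguments have been seen, the least recently used entries are evicted, which keeps memory use bounded.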
Final Thoughts
Closures are a valuable tool for any software engineer's toolkit. Even if you've never written one or knowingly seen one, you've probably used one if you've ever used a decorator, which is where most of us first encounter closures. In this article we explored the basic concepts as well as three common closure patterns: constructing functions from a prototype, maintaining state, and caching function calls. With this knowledge, you'll now be able to dig into more complex patterns like decorators!