Python Training by Dan Bader

Memoization in Python: How to Cache Function Results

Speed up your Python programs with a powerful, yet convenient, caching technique called “memoization.”

Python Memoization

In this article, I’m going to introduce you to a convenient way to speed up your Python code called memoization (also sometimes spelled memoisation):

Memoization is a specific type of caching that is used as a software optimization technique.

A cache stores the results of an operation for later use. For example, your web browser will most likely use a cache to load this tutorial web page faster if you visit it again in the future.

So, when I talk about memoization and Python, I am talking about remembering or caching a function’s output based on its inputs. Memoization finds its root word in “memorandum”, which means “to be remembered.”

Memoization allows you to optimize a Python function by caching its output based on the parameters you supply to it. Once you memoize a function, it will only compute its output once for each set of parameters you call it with. Every call after the first will be quickly retrieved from a cache.

In this tutorial, you’ll see how and when to wield this simple but powerful concept with Python, so you can use it to optimize your own programs and make them run much faster in some cases.

Why and When Should You Use Memoization in Your Python Programs?

The answer is expensive code:

When I am analyzing code, I look at it in terms of how long it takes to run and how much memory it uses. If I’m looking at code that takes a long time to run or uses a lot of memory, I call the code expensive.

It’s expensive code because it costs a lot of resources, space and time, to run. When you run expensive code, it takes resources away from other programs on your machine.

If you want to speed up the parts in your Python application that are expensive, memoization can be a great technique to use. Let’s take a deeper look at memoization before we get our hands dirty and implement it ourselves!

All code examples I use in this tutorial were written in Python 3, but of course the general technique and patterns demonstrated here apply just as well to Python 2.

The Memoization Algorithm Explained

The basic memoization algorithm looks as follows:

  1. Set up a cache data structure for function results
  2. Every time the function is called, do one of the following:
    • Return the cached result, if any; or
    • Call the function to compute the missing result, and then update the cache before returning the result to the caller

Given enough cache storage this virtually guarantees that function results for a specific set of function arguments will only be computed once.

As soon as we have a cached result we won’t have to re-run the memoized function for the same set of inputs. Instead, we can just fetch the cached result and return it right away.

Let’s Write a Memoization Decorator From Scratch

Next, I’m going to implement the above memoization algorithm as a Python decorator, which is a convenient way to implement generic function wrappers in Python:

A decorator is a function that takes another function as an input and has a function as its output.

This allows us to implement our memoization algorithm in a generic and reusable way. Sounds a little confusing? No worries, we’ll take this step-by-step and it will all become clearer when you see some real code.

Here’s the memoize() decorator that implements the above caching algorithm:

def memoize(func):
    cache = dict()

    def memoized_func(*args):
        if args in cache:
            return cache[args]
        result = func(*args)
        cache[args] = result
        return result

    return memoized_func

This decorator takes a function and returns a wrapped version of the same function that implements the caching logic (memoized_func).

I’m using a Python dictionary as a cache here. In Python, using a key to look-up a value in a dictionary is quick. This makes dict a good choice as the data structure for the function result cache.

Whenever the decorated function gets called, we check if the parameters are already in the cache. If they are, then the cached result is returned. So, instead of re-computing the result, we quickly return it from the cache.

Bam, memoization!

If the result isn’t in the cache, we must update the cache so we can save some time in the future. Therefore, we first compute the missing result, store it in the cache, and then return it to the caller.

[ As I mentioned, decorators are an important concept to master for any intermediate or advanced Python developer. Check out my Python decorators tutorial for a step-by-step introduction if you’d like to know more. ]

Let’s test our memoization decorator out on a recursive Fibonacci sequence function. First, I’ll define a Python function that calculates the n-th Fibonacci number:

def fibonacci(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    return fibonacci(n - 1) + fibonacci(n - 2)

This fibonacci function will serve as an example of an “expensive” computation. Calculating the n-th Fibonacci number this way has O(2^n) time complexity—it takes exponential time to complete.

This makes it quite an expensive function indeed.

Next up, I’m going to do some benchmarking in order to get a feel for how computationally expensive this function is. Python’s built-in timeit module lets me measure the execution time in seconds of an arbitrary Python statement.

Here’s how I’ll measure the execution time of the fibonacci function I just defined using Python’s built-in timeit module:

>>> import timeit
>>> timeit.timeit('fibonacci(35)', globals=globals(), number=1)
5.1729652720096055

As you can see, on my machine, it takes about five seconds to compute the 35th number in the Fibonacci sequence. That’s a pretty slow and expensive operation right there.

⏰ Sidebar: timeit.timeit Arguments

Python’s built-in timeit module lets me measure the execution time in seconds of an arbitrary Python statement. Here’s a quick note on the arguments I’m passing to timeit.timeit in the above example:

  • Because I’m running this benchmark in a Python interpreter (REPL) session I need to set up the environment for this benchmark run by setting globals to the current set of global variables retrieved with the globals() built-in.

  • By default timeit() will repeat the benchmark several times to make the measured execution time more accurate. But because a single fibonacci(35) call already takes a few seconds to execute I’m limiting the number of executions to one with the number argument. For this experiment I’m interested in ballpark timing figures and millisecond accuracy isn’t needed.

Let’s see if we can speed it up by leveraging the function result caching provided by our memoization decorator:

>>> memoized_fibonacci = memoize(fibonacci)
>>> timeit.timeit('memoized_fibonacci(35)', globals=globals(), number=1)
4.941958484007046

The memoized function still takes about five seconds to return on the first run. So far, so underwhelming…

We’ll get a similar execution time because the first time I ran the memoized function the result cache was cold—we started out with an empty cache which means there were no pre-computed results that could help speed up this function call.

Let’s run our benchmark a second time:

>>> timeit.timeit('memoized_fibonacci(35)', globals=globals(), number=1)
1.9930012058466673e-06

Now we’re talking!

Notice the e-06 suffix at the end of that floating point number? The second run of memoized_fibonacci took only about 2 microseconds to complete. That’s 0.0000019930012058466673 seconds—quite a nice speedup indeed!

Instead of recursively calculating the 35th Fibonacci number our memoize decorator simply fetched the cached result and returned it immediately, and this is what led to the incredible speedup in the second benchmarking run.

Inspecting the Function Results Cache

To really drive home how memoization works “behind the scenes” I want to show you the contents of the function result cache used in the previous example:

>>> memoized_fibonacci.__closure__[0].cell_contents
{(35,): 9227465}

To inspect the cache I reached “inside” the memoized_fibonacci function using its __closure__ attribute. The cache dict is the first local variable and stored in cell 0. I wouldn’t recommend that you use this technique in production code—but here it makes for a nice little debugging trick 🙂

As you can see, the cache dictionary maps the argument tuples for each memoized_fibonacci function call that happened so far to the function result (the n-th Fibonacci number.)

So, for example, (35,) is the argument tuple for the memoized_fibonacci(35) function call and it’s associated with 9227465 which is the 35th Fibonacci number:

>>> fibonacci(35)
9227465

Let’s do a nother little experiment to demonstrate how the function result cache works. I’ll call memoized_fibonacci a few more times to populate the cache and then we’ll inspect its contents again:

>>> memoized_fibonacci(1)
1
>>> memoized_fibonacci(2)
1
>>> memoized_fibonacci(3)
2
>>> memoized_fibonacci(4)
3
>>> memoized_fibonacci(5)
5

>>> memoized_fibonacci.__closure__[0].cell_contents
{(35,): 9227465, (1,): 1, (2,): 1, (3,): 2, (4,): 3, (5,): 5}

As you can see, the cache dictionary now also contains cached results for several other inputs to the memoized_fibonacci function. This allows us to retrieve these results quickly from the cache instead of slowly re-computing them from scratch.

A quick word of warning on the naive caching implementation in our memoize decorator: In this example the cache size is unbounded, which means the cache can grow at will. This is usually not a good idea because it can lead to memory exhaustion bugs in your programs.

With any kind of caching that you use in your programs, it makes sense to put a limit on the amount of data that’s kept in the cache at the same time. This is typically achieved either by having a hard limit on the cache size or by defining an expiration policy that evicts old items from the cache at some point.

Please keep in mind that the memoize function we wrote earlier is a simplified implementation for demonstration purposes. In the next section in this tutorial you’ll see how to use a “production-ready” implementation of the memoization algorithm in your Python programs.

Python Memoization with functools.lru_cache

Now that you’ve seen how to implement a memoization function yourself, I’ll show you you can achieve the same result using Python’s functools.lru_cache decorator for added convenience.

One of the things I love the most about Python is that the simplicity and beauty of its syntax goes hand in hand with beauty and simplicity of its philosophy. Python is “batteries included”, which means that Python is bundled with loads of commonly used libraries and modules which are only an import statement away!

I find functools.lru_cache to be a great example of this philosophy. The lru_cache decorator is the Python’s easy to use memoization implementation from the standard library. Once you recognize when to use lru_cache, you can quickly speed up your application with just a few lines of code.

Let’s revisit our Fibonacci sequence example. This time I’ll show you how to add memoization using the functools.lru_cache decorator:

import functools

@functools.lru_cache(maxsize=128)
def fibonacci(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    return fibonacci(n - 1) + fibonacci(n - 2)

Note the maxsize argument I’m passing to lru_cache to limit the number of items stored in the cache at the same time.

Once again I’m using the timeit module to run a simple benchmark so I can get a sense of the performance impact of this optimization:

>>> import timeit
>>> timeit.timeit('fibonacci(35)', globals=globals(), number=1)
3.056201967410743e-05
>>> timeit.timeit('fibonacci(35)', globals=globals(), number=1)
1.554988557472825e-06

You may be wondering why we’re getting the result of the first run so much faster this time around. Shouldn’t the cache be “cold” on the first run as well?

The difference is that, in this example, I applied the @lru_cache decorator at function definition time. This means that recursive calls to fibonacci() are also looked up in the cache this time around.

By decorating the fibonacci() function with the @lru_cache decorator I basically turned it into a dynamic programming solution, where each subproblem is solved just once by storing the subproblem solutions and looking them up from the cache the next time.

This is just a side-effect in this case—but I’m sure you can begin to see the beauty and the power of using a memoization decorator and how helpful a tool it can be to implement other dynamic programming algorithms as well.

Why You Should Prefer functools.lru_cache

In general, Python’s memoization implementation provided by functools.lru_cache is much more comprehensive than our ad hoc memoize function, as you can see in the CPython source code.

For example, it provides a handy feature that allows you to retrieve caching statistics with the cache_info method:

>>> fibonacci.cache_info()
CacheInfo(hits=34, misses=36, maxsize=None, currsize=36)

Again, as you can see in the CacheInfo output, Python’s lru_cache() memoized the recursive calls to fibonacci(). When we look at the cache information for the memoized function, you’ll recognize why it is faster than our version on the first run—the cache was hit 34 times.

As I hinted at earlier, functools.lru_cache also allows you to limit the number of cached results with the maxsize parameter. By setting maxsize=None you can force the cache to be unbounded, which I would usually recommend against.

There’s also a typed boolean parameter you can set to True in order to tell the cache that function arguments of different types should be cached separately. For example, fibonacci(35) and fibonacci(35.0) would be treated as distinct calls with distinct results.

Another useful feature is the ability to reset the result cache at any time with the cache_clear method:

>>> fibonacci.cache_clear()
>>> fibonacci.cache_info()
CacheInfo(hits=0, misses=0, maxsize=128, currsize=0)

If you want to learn more about the intricacies of using the lru_cache decorator I recommend that you consult the Python standard library documentation.

In summary, you should never need to roll your own memoizing function. Python’s built-in lru_cache() is readily-available, more comprehensive, and battle-tested.

Caching Caveats – What Can Be Memoized?

Ideally, you will want to memoize functions that are deterministic.

def deterministic_adder(x, y):
    return x + y

Here deterministic_adder() is a deterministic function because it will always return the same result for the same pair of parameters. For example, if you pass 2 and 3 into the function, it will always return 5.

Compare this behavior with the following nondeterministic function:

from datetime import datetime

def nondeterministic_adder(x, y):
    # Check to see if today is Monday (weekday 0)
    if datetime.now().weekday() == 0:
        return x + y + x
    return x + y

This function is nondeterministic because its output for a given input will vary depending on the day of the week: If you run this function on Monday, the cache will return stale data any other day of the week.

Generally I find that any function that updates a record or returns information that changes over time is a poor choice to memoize.

Or, as Phil Karlton puts it:

There are only two hard things in Computer Science: cache invalidation and naming things.

— Phil Karlton

🙂

Memoization in Python: Quick Summary

In this Python tutorial you saw how memoization allows you to optimize a function by caching its output based on the parameters you supply to it.

Once you memoize a function, it will only compute its output once for each set of parameters you call it with. Every call after the first will be quickly retrieved from a cache.

You saw how to write your own memoization decorator from scratch, and why you probably want to use Python’s built-in lru_cache() battle-tested implementation in your production code:

  • Memoization is a software optimization technique that stores and return the result of a function call based on its parameters.
  • If your code meets a certain criteria, memoization can be a great method to speed up your application.
  • You can import a comprehensive memoization function, lru_cache(), from Python’s standard library in the functools module.

<strong><em>Improve Your Python</em></strong> with a fresh 🐍 <strong>Python Trick</strong> 💌 every couple of days

Improve Your Python with a fresh 🐍 Python Trick 💌 every couple of days

🔒 No spam ever. Unsubscribe any time.

This article was filed under: optimization, and python.

Related Articles:
Python T-Shirts &amp; Hoodies

Python T-Shirts & Hoodies
Your Coworkers Wil Be Mad With Envy
» Browse Python Apparel at Nerdlettering.com

Latest Articles:
Improve Your Python with 🐍 Python Tricks 💌

Improve Your Python with 🐍 Python Tricks 💌
Get a short & sweet Python code snippet delivered to your inbox every couple of days:
» Click here to see examples

← Browse All Articles