Debugging memory usage in a live Python web app

By Dan Bader — Get free updates of new posts here.

I worked on a Python web app a while ago that was struggling with using too much memory in production. A helpful technique for debugging this issue was adding a simple API endpoint that exposed memory stats while the app was running.

Enter Pympler

There’s a great module called Pympler for debugging memory stats in CPython. It walks your process heap and reports the object types, number of objects, and their size in bytes for all allocated Python objects.

The following function generates a memory summary using Pympler and returns it as a string:

def memory_summary():
    # Only import Pympler when we need it. We don't want it to
    # affect our process if we never call memory_summary.
    from pympler import summary, muppy
    mem_summary = summary.summarize(muppy.get_objects())
    rows = summary.format_(mem_summary)
    return '\n'.join(rows)

Let’s plug this into an example app that allocates some memory and then calls memory_summary:

"""
Don't forget to $ pip install pympler.
"""
import sys
from StringIO import StringIO

def memory_summary():
    # ... (see above)

# Allocate some memory
my_str = 'a' * 2**26
class MyObject(object):
    def __init__(self):
        self.memory = str(id(self)) * 2**10
my_objs = [MyObject() for _ in xrange(2**16)]

print(memory_summary())

Running this example will result in a printout like the one below, which should give you a rough idea which objects are taking up the most space in your app:

                       types |   # objects |   total size
============================ | =========== | ============
                         str |        6727 |     64.61 MB
   <class '__main__.MyObject |       65536 |      4.00 MB
                        dict |         596 |    950.84 KB
                        list |         251 |    601.54 KB
                        code |        1872 |    234.00 KB
          wrapper_descriptor |        1094 |     85.47 KB
                        type |          96 |     85.45 KB
  builtin_function_or_method |         726 |     51.05 KB
           method_descriptor |         586 |     41.20 KB
                         set |         135 |     36.59 KB
                     weakref |         386 |     33.17 KB
                       tuple |         384 |     28.27 KB
            _sre.SRE_Pattern |          42 |     19.31 KB
         <class 'abc.ABCMeta |          20 |     17.66 KB
           member_descriptor |         231 |     16.24 KB

For example, we see that the str objects we allocated take up the biggest chunk of memory at around 65 MB. And as expected, there are also 2^16 = 65536 MyObject instances, taking up 4 MB of space in total.

But how can we access this information in a production web app?

I ended up just exposing the output of memory_summary() as a /debug/memory plaintext endpoint secured with HTTP basic auth. This allowed us to access the allocation stats for the app while it was running in production.

A more advanced way to track these stats in a production web app would be to feed them into a service like DataDog to plot and track them over time. However, in many cases a simple solution like printing the stats to the application log might suffice.

Please also note that these stats are per interpreter process. If you’re running your web app as multiple CPython processes behind a load balancer (like you should) then you’ve got to be sure to take that into account when making sense of these memory stats.

Still, I found that just getting a rough sample of which objects are taking up the most space gave me a better idea of the memory usage pattern of the app and helped reduce memory consumption with some follow up work.

<strong><em>Improve Your Python</em></strong> with a fresh 🐍 <strong>Python Trick</strong> 💌 every couple of days

Improve Your Python with a fresh 🐍 Python Trick 💌 every couple of days

🔒 No spam ever. Unsubscribe any time.

This article was filed under: optimization, programming, python, and web-development.

Related Articles:

Extending Python With C Libraries and the “ctypes” Module – An end-to-end tutorial of how to extend your Python programs with libraries written in C, using the built-in “ctypes” module.
6 things you’re missing out on by never using classes in your Python code – Maybe you’ve been using Python for a while now and you’re starting to feel like you’re getting the hang of it. But one day you catch yourself thinking: “What about classes?”
The Difference Between “is” and “==” in Python – Python has two operators for equality comparisons, “is” and “==” (equals). In this article I’m going to teach you the difference between the two and when to use each with a few simple examples.
Python’s Functions Are First-Class – Python’s functions are first-class objects. You can assign them to variables, store them in data structures, pass them as arguments to other functions, and even return them as values from other functions.
Fundamental Data Structures in Python – In this article series we’ll take a tour of some fundamental data structures and implementations of abstract data types (ADTs) available in Python’s standard library.

Latest Articles:

Interfacing Python and C: The CFFI Module – How to use Python’s built-in CFFI module for interfacing Python with native libraries as an alternative to the “ctypes” approach.
Write More Pythonic Code by Applying the Things You Already Know – There’s a mistake I frequently make when I learn new things about Python… Here’s how you can avoid this pitfall and learn something about Python’s “enumerate()” function at the same time.
Working With File I/O in Python – Learn the basics of working with files in Python. How to read from files, how to write data to them, what file seeks are, and why files should be closed.
How to Reverse a String in Python – An overview of the three main ways to reverse a Python string: “slicing”, reverse iteration, and the classic in-place reversal algorithm. Also includes performance benchmarks.
Mastering Click: Writing Advanced Python Command-Line Apps – How to improve your existing Click Python CLIs with advanced features like sub-commands, user input, parameter types, contexts, and more.
Working with Random Numbers in Python – An overview for working with randomness in Python, using only functionality built into the standard library and CPython itself.

← Browse All Articles

Enter Pympler

But how can we access this information in a production web app?

🐍 Python Tricks 💌