How to use Python’s min() and max() with nested lists

By Dan Bader — Get free updates of new posts here.

Let’s talk about using Python’s min and max functions on a list containing other lists. Sometimes this is referred to as a nested list or a lists of lists.

Finding the minimum or maximum element of a list of lists¹ based on a specific property of the inner lists is a common situation that can be challenging for someone new to Python.

To give us a more concrete example to work with, let’s say we have the following list of item, weight pairs²:

nested_list = [['cherry', 7], ['apple', 100], ['anaconda', 1360]]

We want Python to select the minimum and maximum element based on each item’s weight stored at index 1. We expect min and max to return the following elements:

min(nested_list) should be ['cherry', 7]
max(nested_list) should be ['anaconda', 1360]

But if we simply call min and max on that nested list we don’t get the results we expected.

The ordering we get seems to be based on the item’s name, stored at index 0:

>>> min(nested_list)
['anaconda', 1360]  # Not what we expected!

>>> max(nested_list)
['cherry', 7]  # Not what we expected!

Alright, why does it pick the wrong elements?

Let’s stop for a moment to think about how Python’s max function works internally. The algorithm looks something like this:

def my_max(sequence):
    """Return the maximum element of a sequence"""
    if not sequence:
        raise ValueError('empty sequence')

    maximum = sequence[0]

    for item in sequence:
        if item > maximum:
            maximum = item

    return maximum

The interesting bit of behavior here can be found in the condition that selects a new maximum: if item > maximum:.

This condition works nicely if sequence only contains primitive types like int or float because comparing those is straightforward (in the sense that it’ll give an answer that we intuitively expect; like 3 > 2).

However, if sequence contains other sequences then things get a little more complex. Let’s look at the Python docs to learn how Python compares sequences:

Sequence objects may be compared to other objects with the same sequence type. The comparison uses lexicographical ordering: first the first two items are compared, and if they differ this determines the outcome of the comparison; if they are equal, the next two items are compared, and so on, until either sequence is exhausted.

When max needs to compare two sequences to find the “larger” element then Python’s default comparison behavior might not be what we want³.

Now that we understand why we get an unexpected result we can think about ways to fix our code.

How can we change the comparison behavior?

We need to tell max to compare the items differently.

In our example, Python’s max looks at the first item in each inner list (the string cherry, apple, or anaconda) and compares it with the current maximum element. That’s why it returns cherry as the maximum element if we just call max(nested_list).

How do we tell max to compare the second item of each inner list?

Let’s imagine we had an updated version of my_max called my_max_by_weight that uses the second element of each inner list for comparison:

def my_max_by_weight(sequence):
    if not sequence:
        raise ValueError('empty sequence')

    maximum = sequence[0]

    for item in sequence:
        # Compare elements by their weight stored
        # in their second element.
        if item[1] > maximum[1]:
            maximum = item

    return maximum

That would do the trick! We can see that my_max_by_weight selects the maximum element we expected:

>>> my_max_by_weight(nested_list)
['anaconda', 1360]

Now imagine we needed to find the maximum of different kinds of lists.

Perhaps the index (or key) we’re interested in won’t always be the second item. Maybe sometimes it’ll be the third or fourth item, or a different kind of lookup is necessary all together.

Wouldn’t it be great if we could reuse the bulk of the code in our implementation of my_max? Some parts of it will always work the same, for example checking if an empty sequence was passed to the function.

How can we make max() more flexible?

Because Python allows us to treat functions as data we can extract the code selecting the comparison key into its own function. We’ll call that the key func. We can write different kinds of key funcs and pass them to my_max as necessary.

This gives us complete flexibility! Instead of just being able to choose a specific list index for the comparison, like index 1 or 2, we can tell our function to select something else entirely – for example, the length of the item’s name.

Let’s have a look at some code that implements this idea:

def identity(x):
    return x

def my_max(sequence, key_func=None):
    """
    Return the maximum element of a sequence.
    key_func is an optional one-argument ordering function.
    """
    if not sequence:
        raise ValueError('empty sequence')

    if not key_func:
        key_func = identity

    maximum = sequence[0]

    for item in sequence:
        # Ask the key func which property to compare
        if key_func(item) > key_func(maximum):
            maximum = item

    return maximum

In the code example you can see how by default we let my_max use a key func we called identity, which just uses the whole, unmodified item to do the comparison.

With identity as the key func we expect my_max to behave the same way max behaves.

nested_list = [['cherry', 7], ['apple', 100], ['anaconda', 1360]]

>>> my_max(nested_list)
['cherry', 7]

And we can confirm that we’re still getting the same (incorrect) result as before, which is a pretty good indication that we didn’t screw up the implementation completely 😃.

Now comes the cool part – we’re going to override the comparison behavior by writing a key_func that returns the second sub-element instead of the element itself⁴:

def weight(x):
    return x[1]

>>> my_max(nested_list, key_func=weight)
['anaconda', 1360]

And voilà, this is the maximum element we expected to get!

Just to demonstrate the amount of flexibility this refactoring gave us, here’s a key_func that selects the maximum element based on the length of the item’s name:

def name_length(x):
    return len(x[0])

>>> my_max(nested_list, key_func=name_length)
['anaconda', 1360]

Is there a shorthand for this stuff?

Instead of defining the key func explicitly with def and giving it a name we can also use Python’s lambda keyword to define a function anonymously. This shortens the code quite a bit (and won’t create a named function):

my_max(nested_list, key_func=lambda x: x[1])
>>> ['anaconda', 1360]

To make the naming a little slicker (albeit less expressive) imagine we’ll shorten the key_func arg to key and we’ve arrived at a code snippet that works with the max function in vanilla Python.

This means we’ll no longer need our own re-implementation of Python’s max function to find the “correct” maximum element:

# This is pure, vanilla Python:
>>> max(nested_list, key=lambda x: x[1])
['anaconda', 1360]

The same also works for Python’s built-in min:

>>> min(nested_list, key=lambda x: x[1])
['cherry', 7]

It even works for Python’s sorted function, making the “key func” concept really valuable in a number of situations you might face as a Python developer:

>>> sorted(nested_list, key=lambda x: x[1])
[['cherry', 7], ['apple', 100], ['anaconda', 1360]]

<strong><em>Improve Your Python</em></strong> with a fresh 🐍 <strong>Python Trick</strong> 💌 every couple of days

Improve Your Python with a fresh 🐍 Python Trick 💌 every couple of days

🔒 No spam ever. Unsubscribe any time.

This article was filed under: programming, and python.

Related Articles:

Using get() to return a default value from a Python dict – Python’s dictionaries have a “get” method to look up a key while providing a fallback value. This short screencast tutorial gives you a real-world example where this might come in handy.
The Difference Between “is” and “==” in Python – Python has two operators for equality comparisons, “is” and “==” (equals). In this article I’m going to teach you the difference between the two and when to use each with a few simple examples.
Python’s Functions Are First-Class – Python’s functions are first-class objects. You can assign them to variables, store them in data structures, pass them as arguments to other functions, and even return them as values from other functions.
Python Decorators: A Step-By-Step Introduction – Understanding decorators is a milestone for any serious Python programmer. Here’s your step-by-step guide to how decorators can help you become a more efficient and productive Python developer.
Linked Lists in Python – Learn how to implement a linked list data structure in Python, using only built-in data types and functionality from the standard library.

Latest Articles:

Interfacing Python and C: The CFFI Module – How to use Python’s built-in CFFI module for interfacing Python with native libraries as an alternative to the “ctypes” approach.
Write More Pythonic Code by Applying the Things You Already Know – There’s a mistake I frequently make when I learn new things about Python… Here’s how you can avoid this pitfall and learn something about Python’s “enumerate()” function at the same time.
Working With File I/O in Python – Learn the basics of working with files in Python. How to read from files, how to write data to them, what file seeks are, and why files should be closed.
How to Reverse a String in Python – An overview of the three main ways to reverse a Python string: “slicing”, reverse iteration, and the classic in-place reversal algorithm. Also includes performance benchmarks.
Mastering Click: Writing Advanced Python Command-Line Apps – How to improve your existing Click Python CLIs with advanced features like sub-commands, user input, parameter types, contexts, and more.
Working with Random Numbers in Python – An overview for working with randomness in Python, using only functionality built into the standard library and CPython itself.

← Browse All Articles

Alright, why does it pick the wrong elements?

How can we change the comparison behavior?

How can we make max() more flexible?

Is there a shorthand for this stuff?

🐍 Python Tricks 💌