# How to use Python’s min() and max() with nested lists

Let’s talk about using Python’s min and max functions on a list containing other lists. Sometimes this is referred to as a nested list or a lists of lists.

Finding the minimum or maximum element of a list of lists1 based on a specific property of the inner lists is a common situation that can be challenging for someone new to Python.

To give us a more concrete example to work with, let’s say we have the following list of item, weight pairs2:

```nested_list = [['cherry', 7], ['apple', 100], ['anaconda', 1360]]
```

We want Python to select the minimum and maximum element based on each item’s weight stored at index 1. We expect `min` and `max` to return the following elements:

• `min(nested_list)` should be `['cherry', 7]`
• `max(nested_list)` should be `['anaconda', 1360]`

But if we simply call `min` and `max` on that nested list we don’t get the results we expected.

The ordering we get seems to be based on the item’s name, stored at index 0:

```>>> min(nested_list)
['anaconda', 1360]  # Not what we expected!

>>> max(nested_list)
['cherry', 7]  # Not what we expected!
```

## Alright, why does it pick the wrong elements?

Let’s stop for a moment to think about how Python’s `max` function works internally. The algorithm looks something like this:

```def my_max(sequence):
"""Return the maximum element of a sequence"""
if not sequence:
raise ValueError('empty sequence')

maximum = sequence[0]

for item in sequence:
if item > maximum:
maximum = item

return maximum
```

The interesting bit of behavior here can be found in the condition that selects a new maximum: `if item > maximum:`.

This condition works nicely if `sequence` only contains primitive types like `int` or `float` because comparing those is straightforward (in the sense that it’ll give an answer that we intuitively expect; like `3 > 2`).

However, if `sequence` contains other sequences then things get a little more complex. Let’s look at the Python docs to learn how Python compares sequences:

Sequence objects may be compared to other objects with the same sequence type. The comparison uses lexicographical ordering: first the first two items are compared, and if they differ this determines the outcome of the comparison; if they are equal, the next two items are compared, and so on, until either sequence is exhausted.

When `max` needs to compare two sequences to find the “larger” element then Python’s default comparison behavior might not be what we want3.

Now that we understand why we get an unexpected result we can think about ways to fix our code.

## How can we change the comparison behavior?

We need to tell `max` to compare the items differently.

In our example, Python’s `max` looks at the first item in each inner list (the string `cherry`, `apple`, or `anaconda`) and compares it with the current maximum element. That’s why it returns `cherry` as the maximum element if we just call `max(nested_list)`.

How do we tell `max` to compare the second item of each inner list?

Let’s imagine we had an updated version of `my_max` called `my_max_by_weight` that uses the second element of each inner list for comparison:

```def my_max_by_weight(sequence):
if not sequence:
raise ValueError('empty sequence')

maximum = sequence[0]

for item in sequence:
# Compare elements by their weight stored
# in their second element.
if item[1] > maximum[1]:
maximum = item

return maximum
```

That would do the trick! We can see that `my_max_by_weight` selects the maximum element we expected:

```>>> my_max_by_weight(nested_list)
['anaconda', 1360]
```

Now imagine we needed to find the maximum of different kinds of lists.

Perhaps the index (or key) we’re interested in won’t always be the second item. Maybe sometimes it’ll be the third or fourth item, or a different kind of lookup is necessary all together.

Wouldn’t it be great if we could reuse the bulk of the code in our implementation of `my_max`? Some parts of it will always work the same, for example checking if an empty sequence was passed to the function.

## How can we make max() more flexible?

Because Python allows us to treat functions as data we can extract the code selecting the comparison key into its own function. We’ll call that the key func. We can write different kinds of key funcs and pass them to `my_max` as necessary.

This gives us complete flexibility! Instead of just being able to choose a specific list index for the comparison, like index 1 or 2, we can tell our function to select something else entirely – for example, the length of the item’s name.

Let’s have a look at some code that implements this idea:

```def identity(x):
return x

def my_max(sequence, key_func=None):
"""
Return the maximum element of a sequence.
key_func is an optional one-argument ordering function.
"""
if not sequence:
raise ValueError('empty sequence')

if not key_func:
key_func = identity

maximum = sequence[0]

for item in sequence:
# Ask the key func which property to compare
if key_func(item) > key_func(maximum):
maximum = item

return maximum
```

In the code example you can see how by default we let `my_max` use a key func we called `identity`, which just uses the whole, unmodified item to do the comparison.

With `identity` as the key func we expect `my_max` to behave the same way `max` behaves.

```nested_list = [['cherry', 7], ['apple', 100], ['anaconda', 1360]]

>>> my_max(nested_list)
['cherry', 7]
```

And we can confirm that we’re still getting the same (incorrect) result as before, which is a pretty good indication that we didn’t screw up the implementation completely 😃.

Now comes the cool part – we’re going to override the comparison behavior by writing a `key_func` that returns the second sub-element instead of the element itself4:

```def weight(x):
return x[1]

>>> my_max(nested_list, key_func=weight)
['anaconda', 1360]
```

And voilà, this is the maximum element we expected to get!

Just to demonstrate the amount of flexibility this refactoring gave us, here’s a `key_func` that selects the maximum element based on the length of the item’s name:

```def name_length(x):
return len(x[0])

>>> my_max(nested_list, key_func=name_length)
['anaconda', 1360]
```

## Is there a shorthand for this stuff?

Instead of defining the key func explicitly with `def` and giving it a name we can also use Python’s `lambda` keyword to define a function anonymously. This shortens the code quite a bit (and won’t create a named function):

```my_max(nested_list, key_func=lambda x: x[1])
>>> ['anaconda', 1360]
```

To make the naming a little slicker (albeit less expressive) imagine we’ll shorten the `key_func` arg to `key` and we’ve arrived at a code snippet that works with the `max` function in vanilla Python.

This means we’ll no longer need our own re-implementation of Python’s `max` function to find the “correct” maximum element:

```# This is pure, vanilla Python:
>>> max(nested_list, key=lambda x: x[1])
['anaconda', 1360]
```

The same also works for Python’s built-in `min`:

```>>> min(nested_list, key=lambda x: x[1])
['cherry', 7]
```

It even works for Python’s `sorted` function, making the “key func” concept really valuable in a number of situations you might face as a Python developer:

```>>> sorted(nested_list, key=lambda x: x[1])
[['cherry', 7], ['apple', 100], ['anaconda', 1360]]
```

### Try it out yourself

I hope this post helped you out. What started out as a simple question ended up being a little more involved than you may have expected. But it’s often like that when you learn about new programming concepts.

Feel free to drop me a line of Twitter or over email if you got stuck anywhere. I’d love to improve this tutorial over time :)

1. Sometimes you’ll see tuples used for the inner lists. Using tuples instead of lists doesn’t really make a difference for how `min` and `max` work, but in some cases it can bring a performance benefit. Nothing we’ll have to worry about for now. The code in this tutorial will work fine on a list of tuples.

2. I actually googled these for you. Apparently the average cherry sold in a super market weighs 7 grams. I’m not 100 per cent sure about anacondas though.

3. Note that Python strings are also sequences so when you compare two strings they will be compared character by character.

4. Because this is such a common piece of functionality the Python standard library includes often-used key func implementations in the `operator` module. You may want to check out `operator.getitem`, `operator.itemgetter`, and `operator.attrgetter`

Improve Your Python with a fresh 🐍 Python Trick 💌 every couple of days