Python Training by Dan Bader

Unpacking Nested Data Structures in Python

A tutorial on Python’s advanced data unpacking features: How to unpack data with the “=” operator and for-loops.


Have you ever seen Python’s enumerate function being used like this?

for (i, value) in enumerate(values):
   ...

In Python, you can unpack nested data structures in sophisticated ways, but the syntax might seem complicated: Why does the for statement have two variables in this example, and why are they written inside parentheses?

This article answers those questions and many more. I wrote it in two parts:

  • First, you’ll see how Python’s “=” assignment operator iterates over complex data structures. You’ll learn about the syntax of multiple assignments, recursive variable unpacking, and starred targets.

  • Second, you’ll discover how the for-statement unpacks data using the same rules as the = operator. Again, we’ll go over the syntax rules first and then dive into some hands-on examples.

Ready? Let’s start with a quick primer on the “BNF” syntax notation used in the Python language specification.

BNF Notation – A Primer for Pythonistas

This section is a bit technical, but it will help you understand the examples to come. The Python 2.7 Language Reference defines all the rules for the assignment statement using a modified form of Backus-Naur Form (BNF) notation.

The Language Reference explains how to read BNF notation. In short:

  • symbol_name ::= starts the definition of a symbol
  • ( ) is used to group symbols
  • * means appearing zero or more times
  • + means appearing one or more times
  • (a|b) means either a or b
  • [ ] means optional
  • "text" means the literal text. For example, "," means a literal comma character.

Here is the complete grammar for the assignment statement in Python 2.7. It looks a little complicated because Python allows many different forms of assignment:

An assignment statement consists of

  • one or more (target_list "=") groups
  • followed by either an expression_list or a yield_expression
assignment_stmt ::= (target_list "=")+ (expression_list | yield_expression)

A target list consists of

  • a target
  • followed by zero or more ("," target) groups
  • followed by an optional trailing comma
target_list ::= target ("," target)* [","]
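
The optional trailing comma is what makes a one-element target list possible. Here is a minimal sketch of the difference it makes (the variable names are made up for illustration):

```python
# A trailing comma turns a single name into a one-element target list,
# so the right-hand side is unpacked instead of assigned as a whole.
values = [42]

whole = values      # plain assignment: whole is the list itself
(first,) = values   # unpacking: first is the list's only item
only, = values      # same thing without the parentheses

print(whole)  # [42]
print(first)  # 42
print(only)   # 42
```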

Finally, a target consists of any of the following

  • a variable name
  • a nested target list enclosed in ( ) or [ ]
  • a class or instance attribute
  • a subscripted list or dictionary
  • a list slice
target ::= identifier
           | "(" target_list ")"
           | "[" [target_list] "]"
           | attributeref
           | subscription
           | slicing
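
To make the grammar a little more concrete, here is a rough sketch that uses each kind of target once. The Point class and all variable names are hypothetical and exist only for this illustration:

```python
# Each kind of target from the grammar, used on the left of "=".
class Point:          # hypothetical class, only here to provide an attribute target
    pass

p = Point()
scores = {}
letters = ['a', 'b', 'c', 'd']

name = 'x'                    # identifier
(left, right) = (1, 2)        # parenthesized target list
[head, tail] = [3, 4]         # bracketed target list
p.x = 5                       # attribute reference
scores['alice'] = 6           # subscription
letters[1:3] = ['B', 'C']     # slicing

# Targets of different kinds can be mixed inside one target list:
p.x, scores['bob'], letters[0] = 7, 8, 'A'
print(p.x, scores, letters)
```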

As you’ll see, this syntax allows you to take some clever shortcuts in your code. Let’s take a look at them now:

#1 – Unpacking and the “=” Assignment Operator

First, you’ll see how Python’s “=” assignment operator iterates over complex data structures. You’ll learn about the syntax of multiple assignments, recursive variable unpacking, and starred targets.

Multiple Assignments in Python:

Multiple assignment is a shorthand way of assigning the same value to many variables. An assignment statement usually assigns one value to one variable:

x = 0
y = 0
z = 0

But in Python you can combine these three assignments into one statement:

x = y = z = 0
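
One consequence is worth keeping in mind: the expression on the right is evaluated once, and every target is bound to that same object. With an immutable value like 0 this makes no practical difference, but with a mutable value it does. A small sketch:

```python
a = b = c = []      # one list object, three names
a.append(1)
print(b, c)         # [1] [1]
print(a is b is c)  # True

# Separate lists need separate expressions:
d, e = [], []
d.append(1)
print(e)            # []
```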

Recursive Variable Unpacking:

I’m sure you’ve written [ ] and ( ) on the right side of an assignment statement to pack values into a data structure. But did you know that you can literally flip the script by writing [ ] and ( ) on the left side?

Here’s the general form:

[target, target, target, ...] =
or
(target, target, target, ...) =

Remember, the grammar rules allow [ ] and ( ) characters as part of a target:

target ::= identifier
           | "(" target_list ")"
           | "[" [target_list] "]"
           | attributeref
           | subscription
           | slicing

Packing and unpacking are symmetrical and they can be nested to any level. Nested objects are unpacked recursively by iterating over the nested objects and assigning their values to the nested targets.

Here’s what this looks like in action:

(a, b) = (1, 2)
# a == 1
# b == 2

(a, b) = ([1, 2], [3, 4])
# a == [1, 2]
# b == [3, 4]

(a, [b, c]) = (1, [2, 3])
# a == 1
# b == 2
# c == 3
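
Here is a slightly deeper sketch, using made-up sample data, that also shows what happens when the shape of the targets does not match the shape of the data. The exact wording of the error message can differ between Python versions, so the example only prints it:

```python
record = ('Ada', (1815, 12, 10), ['math', 'computing'])  # made-up sample data

# One statement unpacks the outer tuple, the inner tuple, and the inner list.
name, (year, month, day), [first_field, second_field] = record
print(name, year, first_field)   # Ada 1815 math

# The shape of the targets must match the shape of the data.
try:
    (a, [b, c]) = (1, [2, 3, 4])  # inner list has one item too many
except ValueError as error:
    message = str(error)
    print(message)
```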

Unpacking in Python is powerful and works with any iterable object. You can unpack:

  • tuples
  • lists
  • dictionaries
  • strings
  • ranges
  • generators
  • comprehensions
  • file handles.
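
A few of these are worth trying out, because the result is not always obvious. For instance, unpacking a dictionary gives you its keys, not its values. The sketch below assumes Python 3.7 or later, where dictionaries keep their insertion order:

```python
# Dictionaries unpack to their keys.
x, y = {'a': 1, 'b': 2}
print(x, y)                 # a b

# Use .items() to get (key, value) pairs instead.
(k1, v1), (k2, v2) = {'a': 1, 'b': 2}.items()
print(k1, v1, k2, v2)       # a 1 b 2

# Strings unpack to characters, ranges to integers.
c1, c2, c3 = 'abc'
r1, r2, r3 = range(3)

# Generators are consumed as they are unpacked.
g1, g2 = (n * n for n in (3, 4))
print(c1, r3, g2)           # a 2 16
```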

Test Your Knowledge: Unpacking

What are the values of a, x, y, and z in the example below?

a = (x, y, z) = 1, 2, 3

Hint: this statement uses both multiple assignment and unpacking.

Starred Targets (Python 3.x Only):

In Python 2.x the number of targets and values must match exactly. This code raises a ValueError:

x, y, z = 1, 2, 3, 4   # ValueError: too many values to unpack

Python 3.x introduced starred variables. Python first assigns values to the unstarred targets. After that, it forms a list of any remaining values and assigns it to the starred variable. This code does not produce an error:

x, *y, z = 1, 2, 3, 4
# x == 1
# y == [2, 3]
# z == 4
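
The starred target does not have to sit in the middle. Here are a few more variations as a quick sketch (Python 3.x only, with made-up variable names):

```python
first, *rest = [1, 2, 3, 4]
print(first, rest)          # 1 [2, 3, 4]

*init, last = [1, 2, 3, 4]
print(init, last)           # [1, 2, 3] 4

# The starred target may end up empty, but it is always a list,
# even when the right-hand side is a tuple or a string.
head, *middle, tail = (1, 2)
print(head, middle, tail)   # 1 [] 2

letter, *others = 'abc'
print(others)               # ['b', 'c']
```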

Test Your Knowledge: Starred Variables

Is there any difference between the variables b and *b in these two statements? If so, what is it?

(a, b, c) = 1, 2, 3
(a, *b, c) = 1, 2, 3

#2 – Unpacking and for-loops

Now that you know all about target list assignment, it’s time to look at unpacking used in conjunction with for-loops.

In this section you’ll see how the for-statement unpacks data using the same rules as the = operator. Again, we’ll go over the syntax rules first and then we’ll look at a few hands-on examples.

Let’s examine the syntax of the for statement in Python:

for_stmt ::= "for" target_list "in" expression_list ":" suite
             ["else" ":" suite]

Do the symbols target_list and expression_list look familiar? You saw them earlier in the syntax of the assignment statement.

This has massive implications:

Everything you’ve just learned about assignments and nested targets also applies to for loops!

Standard Rules for Assignments:

Let’s take another look at the standard rules for assignments in Python. The Python Language Reference says:

The for statement is used to iterate over the elements of a sequence (such as a string, tuple or list) or other iterable objects … Each item, in turn, is assigned to the target list using the standard rules for assignments.

You already know the standard rules for assignments. You learned them earlier when we talked about the = operator. They are:

  • assignment to a single target
  • assignment to multiple targets
  • assignment to a nested target list
  • assignment to a starred variable (Python 3.x only)

In the introduction, I promised I would explain this code:

for (i, value) in enumerate(values):
   ...

Now you know enough to figure it out yourself:

  • enumerate returns an iterator that yields (number, item) tuples
  • on each pass through the loop, Python unpacks the next (number, item) tuple into the target list (i, value), exactly as an assignment would.
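
You can check this by doing the loop's work by hand. The sketch below uses a small sample list (my own, just for illustration) and pulls tuples out of enumerate one at a time with next:

```python
values = ['a', 'b', 'c']   # sample data for illustration

# What the for statement does on each pass, written out by hand:
pairs = enumerate(values)
(i, value) = next(pairs)
print(i, value)            # 0 a
(i, value) = next(pairs)
print(i, value)            # 1 b

# The parentheses are optional, just as in an assignment:
for i, value in enumerate(values):
    print(i, value)
```

After the loop finishes, i and value still hold the last pair that was unpacked.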

Examples:

I’ll finish by showing you a few more examples that use Python’s unpacking features with for-loops. Here’s some test data we’ll use in this section:

# Test data:
negative_numbers = (-1, -2, -3, -4, -5)
positive_numbers = (1, 2, 3, 4, 5)

The built-in zip function pairs up the items of its arguments:

>>> list(zip(negative_numbers, positive_numbers))
[(-1, 1), (-2, 2), (-3, 3), (-4, 4), (-5, 5)]

I can loop over the pairs:

for z in zip(negative_numbers, positive_numbers):
    print(z)

Which produces this output:

(-1, 1)
(-2, 2)
(-3, 3)
(-4, 4)
(-5, 5)

I can also unpack the pairs if I wish:

>>> for (neg, pos) in zip(negative_numbers, positive_numbers):
...     print(neg, pos)

-1 1
-2 2
-3 3
-4 4
-5 5

What about starred variables? This example (Python 3.x only) finds each string’s first and last character. The underscore character is often used in Python when we need a dummy placeholder variable:

>>> animals = [
...    'bird',
...    'fish',
...    'elephant',
... ]

>>> for (first_char, *_, last_char) in animals:
...    print(first_char, last_char)

b d
f h
e t
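
Nested target lists work in for-loops too. As a final sketch, here is enumerate wrapped around zip, reusing the test data from above. The products list is just a name I picked to collect the results:

```python
negative_numbers = (-1, -2, -3, -4, -5)
positive_numbers = (1, 2, 3, 4, 5)

products = []
# Each item is (index, (neg, pos)), so the target list is nested to match.
for i, (neg, pos) in enumerate(zip(negative_numbers, positive_numbers)):
    products.append((i, neg * pos))

print(products)
# [(0, -1), (1, -4), (2, -9), (3, -16), (4, -25)]
```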

Unpacking Nested Data Structures – Conclusion

In Python, you can unpack nested data structures in sophisticated ways, but the syntax might seem complicated. I hope that with this tutorial I’ve given you a clearer picture of how it all works. Here’s a quick recap of what we covered:

  • You just saw how Python’s “=” assignment operator iterates over complex data structures. You learned about the syntax of multiple assignments, recursive variable unpacking, and starred targets.

  • You also learned how Python’s for-statement unpacks data using the same rules as the = operator and worked through a number of examples.

It pays off to go back to the basics and to read the language reference closely. You might find some hidden gems there!

This article was filed under: programming, and python.

About the Author

Marc Poulin

Marc started programming about 35 years ago and has had an eclectic career: data analysis, process control, C++, Oracle, stereolithography, signal processing, real-time Linux, and teaching. He is the author of the book Mastering Python Lists and an esteemed member of PythonistaCafe.
