Unpacking Nested Data Structures in Python
A tutorial on Python’s advanced data unpacking features: How to unpack data with the “=” operator and for-loops.
Have you ever seen Python’s enumerate function being used like this?
for (i, value) in enumerate(values): ...
In Python, you can unpack nested data structures in sophisticated ways, but the syntax might seem complicated: Why does the for statement have two variables in this example, and why are they written inside parentheses?
This article answers those questions and many more. I wrote it in two parts:
- First, you’ll see how Python’s “=” assignment operator iterates over complex data structures. You’ll learn about the syntax of multiple assignments, recursive variable unpacking, and starred targets.
- Second, you’ll discover how the for-statement unpacks data using the same rules as the = operator. Again, we’ll go over the syntax rules first and then dive into some hands-on examples.
Ready? Let’s start with a quick primer on the “BNF” syntax notation used in the Python language specification.
BNF Notation – A Primer for Pythonistas
This section is a bit technical, but it will help you understand the examples to come. The Python 2.7 Language Reference defines all the rules for the assignment statement using a modified form of Backus-Naur Form (BNF) notation.
The Language Reference explains how to read BNF notation. In short:
- symbol_name ::= starts the definition of a symbol
- ( ) is used to group symbols
- * means appearing zero or more times
- + means appearing one or more times
- (a|b) means either a or b
- [ ] means optional
- "text" means the literal text. For example, "," means a literal comma character.
Here is the complete grammar for the assignment statement in Python 2.7. It looks a little complicated because Python allows many different forms of assignment:
An assignment statement consists of
- one or more (target_list "=") groups
- followed by either an expression_list or a yield_expression
assignment_stmt ::= (target_list "=")+ (expression_list | yield_expression)
A target list consists of
- a target
- followed by zero or more ("," target) groups
- followed by an optional trailing comma
target_list ::= target ("," target)* [","]
Finally, a target consists of any of the following
- a variable name
- a nested target list enclosed in ( ) or [ ]
- a class or instance attribute
- a subscripted list or dictionary
- a list slice
target ::= identifier | "(" target_list ")" | "[" [target_list] "]" | attributeref | subscription | slicing
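To make the grammar concrete, here is a minimal sketch with one assignment per kind of target. The names Point, point, scores, and letters are hypothetical and exist only for illustration:

```python
# Hypothetical names for illustration; none of them come from the grammar.
class Point:
    pass

point = Point()
scores = {}
letters = ["a", "b", "c", "d"]

# identifier: a plain variable name
count = 1

# nested target list enclosed in ( ) or [ ]
(left, [middle, right]) = (1, [2, 3])

# attributeref: an instance attribute
point.x = 10

# subscription: a dictionary key (a list index works the same way)
scores["alice"] = 42

# slicing: replace part of a list, here two items with three
letters[1:3] = ["X", "Y", "Z"]

# Different kinds of targets can be mixed in a single target list
point.y, scores["bob"], letters[0] = 20, 7, "A"
```

The last line shows that a target list is not limited to variable names: each target is assigned independently, from left to right.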
As you’ll see, this syntax allows you to take some clever shortcuts in your code. Let’s take a look at them now:
#1 – Unpacking and the “=” Assignment Operator
First, you’ll see how Python’s “=” assignment operator iterates over complex data structures. You’ll learn about the syntax of multiple assignments, recursive variable unpacking, and starred targets.
Multiple Assignments in Python:
Multiple assignment is a shorthand way of assigning the same value to many variables. An assignment statement usually assigns one value to one variable:
x = 0
y = 0
z = 0
But in Python you can combine these three assignments into one statement:
x = y = z = 0
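One detail is worth keeping in mind: the right-hand side is evaluated only once, and every target is bound to that same object. The sketch below (with illustrative variable names) shows why this is harmless for immutable values like 0 but can surprise you with mutable ones:

```python
# With an immutable value, rebinding one name does not affect the others.
x = y = z = 0
x = 5

# With a mutable value, all three names refer to the SAME list object.
a = b = c = []
a.append("shared")

# If you want three independent lists, unpack three separate objects instead.
d, e, f = [], [], []
d.append("separate")
```

After this runs, b and c both see the appended item, while e and f stay empty.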
Recursive Variable Unpacking:
I’m sure you’ve written [ ] and ( ) on the right side of an assignment statement to pack values into a data structure. But did you know that you can literally flip the script by writing [ ] and ( ) on the left side?
Here’s an example:
[target, target, target, ...] =
or
(target, target, target, ...) =
Remember, the grammar rules allow [ ] and ( ) characters as part of a target:
target ::= identifier | "(" target_list ")" | "[" [target_list] "]" | attributeref | subscription | slicing
Packing and unpacking are symmetrical and they can be nested to any level. Nested objects are unpacked recursively by iterating over the nested objects and assigning their values to the nested targets.
Here’s what this looks like in action:
(a, b) = (1, 2)
# a == 1
# b == 2

(a, b) = ([1, 2], [3, 4])
# a == [1, 2]
# b == [3, 4]

(a, [b, c]) = (1, [2, 3])
# a == 1
# b == 2
# c == 3
Unpacking in Python is powerful and works with any iterable object. You can unpack:
- tuples
- lists
- dictionaries
- strings
- ranges
- generators
- comprehensions
- file handles.
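Here is a short sketch that unpacks several of these iterable types. The variable names are illustrative, and io.StringIO stands in for a real file handle (an assumption made so the example runs without touching the disk):

```python
import io

# Dictionaries: iterating a dict yields its keys
# (in insertion order on Python 3.7 and later).
first_key, second_key = {"a": 1, "b": 2}

# Strings: one character per target
c1, c2, c3 = "abc"

# Ranges
r0, r1, r2 = range(3)

# Generators
g0, g1 = (n * n for n in (3, 4))

# Comprehensions
e0, e1 = [n + 1 for n in (0, 1)]

# File handles: iterating a file yields its lines, newline included.
# io.StringIO is used here as an in-memory stand-in for an open file.
line1, line2 = io.StringIO("first line\nsecond line\n")
```

In every case the number of targets has to match the number of items the iterable produces, otherwise Python raises a ValueError.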
Test Your Knowledge: Unpacking
What are the values of a, x, y, and z in the example below?
a = (x, y, z) = 1, 2, 3
Hint: this statement uses both multiple assignment and unpacking.
Starred Targets (Python 3.x Only):
In Python 2.x the number of targets and values must match. This code will produce an error:
x, y, z = 1, 2, 3, 4 # Too many values
Python 3.x introduced starred variables. Python first assigns values to the unstarred targets. After that, it forms a list of any remaining values and assigns it to the starred variable. This code does not produce an error:
x, *y, z = 1, 2, 3, 4 # y == [2,3]
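A few more variations, sketched with illustrative names, show how the starred target behaves in Python 3. It can appear at any position, it always receives a list (possibly an empty one), and it also works inside a nested target list:

```python
# The starred target can sit at the end ...
first, *rest = [1, 2, 3, 4]
# first == 1, rest == [2, 3, 4]

# ... or at the start. The result is a list even when unpacking a string.
*init, last = "abc"
# init == ['a', 'b'], last == 'c'

# If nothing is left over, the starred target gets an empty list.
head, *middle, tail = (1, 2)
# middle == []

# Starred targets also work inside nested target lists.
(name, (year, *tags)) = ("python", (1991, "dynamic", "readable"))
# tags == ['dynamic', 'readable']

# There must still be enough values for the unstarred targets.
too_few_values = False
try:
    p, *q, r = [1]
except ValueError:
    too_few_values = True
```

Only one starred target is allowed per target list, which is why each statement above uses a single asterisk.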
Test Your Knowledge: Starred Variables
Is there any difference between the variables b and *b in these two statements? If so, what is it?
(a, b, c) = 1, 2, 3
(a, *b, c) = 1, 2, 3
#2 – Unpacking and for-loops
Now that you know all about target list assignment, it’s time to look at unpacking used in conjunction with for-loops.
In this section you’ll see how the for-statement unpacks data using the same rules as the = operator. Again, we’ll go over the syntax rules first and then we’ll look at a few hands-on examples.
Let’s examine the syntax of the for statement in Python:
for_stmt ::= "for" target_list "in" expression_list ":" suite ["else" ":" suite]
Do the symbols target_list and expression_list look familiar? You saw them earlier in the syntax of the assignment statement.
This has massive implications:
Everything you’ve just learned about assignments and nested targets also applies to for loops!
Standard Rules for Assignments:
Let’s take another look at the standard rules for assignments in Python. The Python Language Reference says:
The for statement is used to iterate over the elements of a sequence (such as a string, tuple or list) or other iterable objects … Each item, in turn, is assigned to the target list using the standard rules for assignments.
You already know the standard rules for assignments. You learned them earlier when we talked about the = operator. They are:
- assignment to a single target
- assignment to multiple targets
- assignment to a nested target list
- assignment to a starred variable (Python 3.x only)
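Each of these four rules can be sketched as a for-loop. The records and rows data below is made up for illustration:

```python
# Hypothetical sample data: (name, (born, died)) records.
records = [("ada", (1815, 1852)), ("alan", (1912, 1954))]

# 1. Single target: each item is assigned as a whole.
whole_items = []
for record in records:
    whole_items.append(record)

# 2. Multiple targets: each item is unpacked into two variables.
names = []
for name, years in records:
    names.append(name)

# 3. Nested target list: the inner tuple is unpacked as well.
lifespans = []
for name, (born, died) in records:
    lifespans.append(died - born)

# 4. Starred variable (Python 3.x only): rows may differ in length.
rows = [(1, 2, 3), (4, 5, 6, 7)]
tails = []
for first, *rest in rows:
    tails.append(rest)
```

In each loop, the part between for and in is an ordinary target list, so it follows exactly the same rules as the left side of an = assignment.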
In the introduction, I promised I would explain this code:
for (i,value) in enumerate(values): ...
Now you know enough to figure it out yourself:
- enumerate yields a series of (number, item) tuples
- when Python sees the target list (i,value), it unpacks each (number, item) tuple into the target list.
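To see the assignment step more explicitly, the loop can be written out by hand. This is only a rough sketch of what the for-statement does, not how the interpreter actually implements it, and the values list is made up for illustration:

```python
values = ["a", "b", "c"]

# The usual form: the for-statement unpacks each tuple for us.
with_for = []
for (i, value) in enumerate(values):
    with_for.append((i, value))

# Roughly the same loop written by hand: fetch the next item,
# then assign it to the target list with a plain "=" assignment.
by_hand = []
iterator = iter(enumerate(values))
while True:
    try:
        item = next(iterator)   # item is a (number, item) tuple
    except StopIteration:
        break
    (i, value) = item           # the same unpacking the for-loop performs
    by_hand.append((i, value))
```

Both loops collect the same pairs, which shows that the target list in a for-statement is just an assignment repeated once per item.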
Examples:
I’ll finish by showing you a few more examples that use Python’s unpacking features with for-loops. Here’s some test data we’ll use in this section:
# Test data:
negative_numbers = (-1, -2, -3, -4, -5)
positive_numbers = (1, 2, 3, 4, 5)
The built-in zip function pairs up the items of the two tuples:
>>> list(zip(negative_numbers, positive_numbers))
[(-1, 1), (-2, 2), (-3, 3), (-4, 4), (-5, 5)]
I can loop over the pairs:
for z in zip(negative_numbers, positive_numbers):
    print(z)
Which produces this output:
(-1, 1)
(-2, 2)
(-3, 3)
(-4, 4)
(-5, 5)
I can also unpack the pairs if I wish:
>>> for (neg, pos) in zip(negative_numbers, positive_numbers):
...     print(neg, pos)
-1 1
-2 2
-3 3
-4 4
-5 5
What about starred variables? This example finds a string’s first and last character. The underscore character is often used in Python when we need a dummy placeholder variable:
>>> animals = [
...     'bird',
...     'fish',
...     'elephant',
... ]
>>> for (first_char, *_, last_char) in animals:
...     print(first_char, last_char)
b d
f h
e t
Unpacking Nested Data Structures – Conclusion
In Python, you can unpack nested data structures in sophisticated ways, but the syntax might seem complicated. I hope that with this tutorial I’ve given you a clearer picture of how it all works. Here’s a quick recap of what we covered:
- You just saw how Python’s “=” assignment operator iterates over complex data structures. You learned about the syntax of multiple assignments, recursive variable unpacking, and starred targets.
- You also learned how Python’s for-statement unpacks data using the same rules as the = operator and worked through a number of examples.
It pays off to go back to the basics and to read the language reference closely—you might find some hidden gems there!