Python Training by Dan Bader

Sets and Multisets in Python

How to implement mutable and immutable set and multiset (bag) data structures in Python using built-in data types and classes from the standard library.

A set is an unordered collection of objects that does not allow duplicate elements. Typically sets are used to quickly test a value for membership in the set, to insert or delete new values from a set, and to compute the union or intersection of two sets.

In a “proper” set implementation, membership tests are expected to run in O(1) time. Union, intersection, difference, and subset operations should take O(n) time on average. The set implementations included in Python’s standard library follow these performance characteristics.

Just like dictionaries, sets get special treatment in Python and have some syntactic sugar that makes it easier to create sets. For example, the curly-braces set expression syntax and set comprehensions allow you to conveniently define new set instances:

vowels = {'a', 'e', 'i', 'o', 'u'}
squares = {x * x for x in range(10)}

Careful: To create an empty set you’ll need to call the set() constructor, as using empty curly-braces ({}) is ambiguous and will create a dictionary instead.

Python and its standard library provide the following set implementations:

✅ The set Built-in

The built-in set implementation in Python. The set type in Python is mutable and allows the dynamic insertion and deletion of elements. Python’s sets are backed by the dict data type and share the same performance characteristics. Any hashable object can be stored in a set.

>>> vowels = {'a', 'e', 'i', 'o', 'u'}
>>> 'e' in vowels
True

>>> letters = set('alice')
>>> letters.intersection(vowels)
{'a', 'e', 'i'}

>>> vowels.add('x')
>>> vowels
{'i', 'a', 'u', 'o', 'x', 'e'}

>>> len(vowels)
6

✅ The frozenset Built-in

An immutable version of set that cannot be changed after it was constructed. Frozensets are static and only allow query operations on their elements (no inserts or deletions.) Because frozensets are static and hashable they can be used as dictionary keys or as elements of another set.

>>> vowels = frozenset({'a', 'e', 'i', 'o', 'u'})
>>> vowels.add('p')
AttributeError: "'frozenset' object has no attribute 'add'"

✅ The collections.Counter Class

The collections.Counter class in the Python standard library implements a multiset (or bag) type that allows elements in the set to have more than one occurrence.

This is useful if you need to keep track not only if an element is part of a set but also how many times it is included in the set.

>>> from collections import Counter
>>> inventory = Counter()

>>> loot = {'sword': 1, 'bread': 3}
>>> inventory.update(loot)
>>> inventory
Counter({'bread': 3, 'sword': 1})

>>> more_loot = {'sword': 1, 'apple': 1}
>>> inventory.update(more_loot)
>>> inventory
Counter({'bread': 3, 'sword': 2, 'apple': 1})

Careful with counting the number of elements in a Counter object. Calling len() returns the number of unique elements in the multiset, whereas the total number of elements must be retrieved slightly differently:

>>> len(inventory)
3  # Unique elements
>>> sum(inventory.values())
6  # Total no. of elements

📺🐍 Learn More With This Video Tutorial

I recorded a step-by-step video tutorial to go along with the article. See how sets work in general and how to use them in Python. Watch the video embedded below or on my YouTube channel:

» Subscribe to the dbader.org YouTube Channel for more Python tutorials.

Read the full “Fundamental Data Structures in Python” article series here. This article is missing something or you found an error? Help a brother out and leave a comment below.

<strong><em>Improve Your Python</em></strong> with a fresh 🐍 <strong>Python Trick</strong> 💌 every couple of days

Improve Your Python with a fresh 🐍 Python Trick 💌 every couple of days

🔒 No spam ever. Unsubscribe any time.

This article was filed under: data-structures, programming, and python.

Related Articles:
  • Stacks in Python – How to implement a stack data structure (LIFO) in Python using built-in types and classes from the standard library.
  • Fundamental Data Structures in Python – In this article series we’ll take a tour of some fundamental data structures and implementations of abstract data types (ADTs) available in Python’s standard library.
  • Priority Queues in Python – What are the various ways you can implement a priority queue in Python? Read on and find out what the Python standard library has to offer.
  • Dictionaries, Maps, and Hash Tables in Python – Need a dictionary, map, or hash table to implement an algorithm in your Python program? Read on to see how the Python standard library can help you.
  • Queues in Python – How to implement a FIFO queue data structure in Python using only built-in data types and classes from the standard library.
Latest Articles:
← Browse All Articles