An Introduction to Python Sets

Python supports sets, which are collections of unique elements and provide operations for computing set union, intersection, and difference.

“The world is a book and those who do not travel read only one page.”
― Augustine of Hippo

1. Introduction

A set is a collection of unique elements. A common use is to eliminate duplicate elements from a list. In addition, it supports set operations such as union, intersection, and difference.

2. Creating a Set

Brace Construction: Creating a set looks similar to creating a dictionary: you enclose a bunch of items within braces.

s = {1, 2, 3, 3, 4}
print s
# prints
set([1, 2, 3, 4])

Notice that the set contains unique elements only even though we put duplicates into it.
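
One caveat with the brace syntax, added here as a side note: empty braces create a dictionary, not a set, so an empty set has to come from set(). A minimal sketch (the variable names are illustrative, and print() is called with a single argument so the snippet runs on Python 2.7 as well as Python 3):

```python
empty_braces = {}   # empty braces give a dict, not a set
empty_set = set()   # the way to build an empty set

print(type(empty_braces) is dict)  # True
print(type(empty_set) is set)      # True
print(len(empty_set))              # 0
```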

A set need not contain elements of the same type. You can mix and match element types as you like.

# Named 'mixed' rather than 'set' to avoid shadowing the built-in set()
mixed = {1, 2, 'hello', 4, 'world'}
print mixed
# prints
set(['world', 2, 4, 'hello', 1])

Set Comprehension: As with lists and dictionaries, you can build a set with a comprehension, as in the following example of a set of squares.

a = {x*x for x in xrange(10)}
print a
# prints
set([0, 1, 4, 81, 64, 9, 16, 49, 25, 36])
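
A set comprehension can also carry a filter condition, just like a list comprehension. The sketch below uses a made-up word list and collects the distinct lengths of the words longer than three characters; the result is sorted before printing because a set has no defined order.

```python
words = ['apple', 'kiwi', 'banana', 'plum', 'melon', 'fig']

# Distinct lengths of the words longer than three characters.
lengths = {len(w) for w in words if len(w) > 3}

print(sorted(lengths))  # [4, 5, 6]
```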

3. Using the set() constructor

Create a set from a list using the set() constructor.

a = [1, 2, 2, 3]
print set(a)
# prints
set([1, 2, 3])

How about creating a set of characters comprising a string? This shortcut will work.

print set('abcd')
# prints
set(['a', 'c', 'b', 'd'])

You can also create a set of unique random numbers from a list that may contain duplicates.

import random

a = [random.randint(0, 10) for x in xrange(10)]
print a
print set(a)
# prints
[10, 2, 3, 3, 6, 6, 4, 9, 5, 0]
set([0, 2, 3, 4, 5, 6, 9, 10])
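
Converting a list to a set discards the original order of the elements. If the order matters, one common approach is to keep a set of already-seen items while walking the list. This is a sketch rather than the only way to do it, and the helper name unique_in_order is hypothetical.

```python
def unique_in_order(items):
    """Return items with duplicates dropped, keeping first occurrences."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:   # set membership test
            seen.add(item)
            result.append(item)
    return result

numbers = [10, 2, 3, 3, 6, 6, 4, 9, 5, 0]
print(unique_in_order(numbers))  # [10, 2, 3, 6, 4, 9, 5, 0]
```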

4. Methods of set

The following sections explain the most commonly used methods of sets.

4.1. Membership Testing

The boolean expressions elem in a and elem not in a test whether an element is a member of the set a.

a = {'apple', 'orange', 'banana', 'melon', 'mango'}
print a
print 'banana' in a
print 'papaya' in a
# prints
set(['melon', 'orange', 'mango', 'banana', 'apple'])
True
False
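
The not in form is handy for picking out items that a set lacks. A small sketch with made-up data (the names basket and wanted are illustrative):

```python
basket = {'apple', 'orange', 'banana', 'melon', 'mango'}
wanted = ['banana', 'papaya', 'mango', 'kiwi']

# Keep the wanted items that are missing from the basket.
missing = [fruit for fruit in wanted if fruit not in basket]

print(missing)  # ['papaya', 'kiwi']
```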

4.2. Set Size

You can obtain the size of a set (the number of elements) using the len() function.

a = {'apple', 'orange', 'banana', 'melon', 'mango'}
print a
print 'size of a:', len(a)
# prints
set(['melon', 'orange', 'mango', 'banana', 'apple'])
size of a: 5

4.3. Adding Elements to a Set

Use the add() method to add an element to the set. If the element does not exist, it is added. If the element already exists, the set is left unchanged and no error is raised.

import random

a = [random.randint(0, 10) for x in xrange(10)]
print 'list =>', a
s = set(a)
print 'set =>', s
s.add(10)
print 'after add =>', s
# prints
list => [3, 4, 7, 2, 8, 0, 4, 1, 0, 4]
set => set([0, 1, 2, 3, 4, 7, 8])
after add => set([0, 1, 2, 3, 4, 7, 8, 10])

The add() method accepts only a single argument. To add several elements, call add() in a loop or pass an iterable to the update() method.
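
As a short illustration of adding several elements, the sketch below shows both an explicit loop over add() and the built-in update() method, which takes an iterable and adds each of its elements. The values are arbitrary.

```python
s = {0, 1, 2}

# add() takes exactly one element per call.
for value in [3, 4]:
    s.add(value)

# update() adds every element of an iterable in a single call.
s.update([4, 5, 6])

print(sorted(s))  # [0, 1, 2, 3, 4, 5, 6]
```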

You cannot add a list to a set, since a list is mutable and therefore cannot be hashed.

s.add(10)
print 'after add =>', s
s.add([21, 22])
print s
# prints
TypeError                                 Traceback (most recent call last)
 in ()
      5 s.add(10)
      6 print 'after add =>', s
----> 7 s.add([21, 22])
      8 print s

TypeError: unhashable type: 'list'

However, a tuple can be added, since it is immutable and hence hashable (as long as its own elements are hashable).

s.add((21, 22))
print s
# prints
set([0, 3, 4, 5, 6, 7, (21, 22), 9, 10])
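
A tuple is hashable only when everything inside it is hashable too. The sketch below, with arbitrary values, adds a tuple of integers and then shows that a tuple holding a list is rejected with the same TypeError as a bare list.

```python
s = {1, 2}

s.add((21, 22))        # a tuple of ints is hashable

try:
    s.add((21, [22]))  # a tuple holding a list is not hashable
    added = True
except TypeError:
    added = False

print(added)           # False
print((21, 22) in s)   # True
print(len(s))          # 3
```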

4.4. Removing Elements from a Set

Remove a single element from a set using remove().

import random

a = [random.randint(0, 10) for x in xrange(10)]
print 'list =>', a
s = set(a)
print 'set =>', s
s.remove(10)
print 'after remove =>', s
# prints
list => [6, 6, 7, 6, 7, 5, 10, 3, 8, 3]
set => set([3, 5, 6, 7, 8, 10])
after remove => set([3, 5, 6, 7, 8])

A KeyError is raised if the element is not in the set. (Running the code above a few times eventually produces a random list that does not contain 10.)

# prints
list => [0, 4, 4, 4, 6, 6, 9, 5, 9, 6]
set => set([0, 9, 4, 5, 6])

KeyError                                  Traceback (most recent call last)
 in ()
      3 s = set(a)
      4 print 'set =>', s
----> 5 s.remove(10)
      6 print 'after remove =>', s

KeyError: 10

Need to remove an element from a set without the pesky KeyError? Use discard(), which does nothing if the element is absent.

print s
s.remove(0)
s.discard(20)
print s
# prints
set([0, 2, 3, 4, 8, 9])
set([2, 3, 4, 8, 9])
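
To make the contrast between the two methods explicit, here is a small side-by-side sketch with arbitrary values. The KeyError from remove() is caught so the snippet runs to completion.

```python
s = {0, 2, 3}

s.discard(20)      # absent element: nothing happens

try:
    s.remove(20)   # absent element: KeyError
    raised = False
except KeyError:
    raised = True

s.discard(0)       # present element: removed, just like remove()

print(raised)      # True
print(sorted(s))   # [2, 3]
```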

Remove all elements from a set? Use clear().

print s
s.clear()
print s
# prints
set([2, 3, 4, 8, 9])
set([])

5. Set Operations

Let us now look at the operations that sets support.

5.1. Disjoint Sets

A set is disjoint with another set if the two have no common elements. The method isdisjoint() returns True or False as appropriate.

print set([0, 3, 6]).isdisjoint(set([9, 10, 5, 7]))
# prints True

Another example:

print set([0, 1, 2, 3, 4]).isdisjoint(set([8, 1]))
# prints False
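
The argument to isdisjoint() does not have to be a set; any iterable works. A sketch with made-up names (reserved and requested are illustrative), checked against the equivalent intersection test:

```python
reserved = {'admin', 'root', 'system'}
requested = ['alice', 'bob', 'root']   # a plain list, not a set

# isdisjoint() accepts any iterable as its argument.
no_clash = reserved.isdisjoint(requested)
print(no_clash)                        # False

# Equivalent check: disjoint sets have an empty intersection.
common = reserved & set(requested)
print(len(common) == 0)                # False
```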

5.2. Checking for Subset and Superset

Check whether all elements of a set are contained in another set using the issubset() method. You can also use the operator form setA <= setB.

The form setA < setB checks whether setA is a proper subset of setB (that is, setB contains all elements of setA and at least one more).

Need to check for a superset? Use issuperset() or setA >= setB, or setA > setB for a proper superset.

a = set([1, 3, 4, 5])
b = set([1, 3, 4, 5])
c = set([1, 3, 4, 5, 6, 7])
print 'a = ', a
print 'b = ', b
print 'c = ', c
print 'a <= b', a <= b
print 'a < b', a < b
print 'issubset', a.issubset(b)
print 'a < c', a < c
# prints
a =  set([1, 3, 4, 5])
b =  set([1, 3, 4, 5])
c =  set([1, 3, 4, 5, 6, 7])
a <= b True
a < b False
issubset True
a < c True
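
The superset checks mirror the subset ones. A brief sketch using the same three sets, redefined here so the snippet runs on its own:

```python
a = set([1, 3, 4, 5])
b = set([1, 3, 4, 5])
c = set([1, 3, 4, 5, 6, 7])

print(c.issuperset(a))  # True
print(c >= a)           # True
print(c > a)            # True: c has extra elements
print(b >= a)           # True: equal sets are supersets of each other
print(b > a)            # False: but not proper supersets
```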

5.3. Set Union

Compute the union of two or more sets using the union() method. A new set containing all elements of all sets is returned.

You can also use the pipe operator (|) as shown below.

a = set([1, 2, 3])
b = set([3, 4, 5, 6])
c = set('abcd')
print a.union(b, c)
print a | b | c
# prints
set(['a', 1, 2, 3, 4, 5, 6, 'b', 'c', 'd'])
set(['a', 1, 2, 3, 4, 5, 6, 'b', 'c', 'd'])
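
One difference between the two forms is worth noting: the union() method accepts any iterable, while the | operator requires sets on both sides. The sketch below uses arbitrary values and catches the TypeError so it runs to completion.

```python
a = set([1, 2, 3])

# The method accepts any iterable.
combined = a.union([3, 4], (5, 6))
print(sorted(combined))  # [1, 2, 3, 4, 5, 6]

# The operator insists on sets on both sides.
try:
    a | [3, 4]
    operator_ok = True
except TypeError:
    operator_ok = False
print(operator_ok)       # False

print(sorted(a))         # [1, 2, 3]: a itself is unchanged
```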

5.4. Set Intersection

How about identifying elements common to two or more sets? Use the intersection() method or the & operator.

print a & b
print a & b & c
# prints
set([3])
set([])
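
The listing above uses only the & operator. A short sketch of the intersection() method on the same three sets, redefined here so the snippet runs on its own:

```python
a = set([1, 2, 3])
b = set([3, 4, 5, 6])
c = set('abcd')

print(a.intersection(b) == a & b)   # True
print(sorted(a.intersection(b)))    # [3]
print(len(a.intersection(b, c)))    # 0: nothing is common to all three
```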

5.5. Set Difference

Set difference returns a new set containing the elements of the first set that are not in any of the other sets. Use the difference() method or the - operator.

print a - b
print a - b - set([2])
# prints
set([1, 2])
set([1])
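
Unlike union and intersection, difference depends on the order of the operands. The sketch below redefines a and b so it runs on its own, and also shows the difference() method, which accepts several iterables at once.

```python
a = set([1, 2, 3])
b = set([3, 4, 5, 6])

print(sorted(a - b))                 # [1, 2]
print(sorted(b - a))                 # [4, 5, 6]
print(sorted(a.difference(b, [2])))  # [1]
```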

6. Iterating Over Sets

There are several ways of iterating over sets. The most common ones are presented here.

  • A set is an iterable and hence can be used in a for loop for iterating over the elements.
    a = set([random.randint(0, 10) for _ in xrange(10)])
    print a
    for x in a:
      print x
    # prints
    set([0, 2, 5, 7, 8, 9])
    0
    2
    5
    7
    8
    9
    
  • The ever-present enumerate() function is also available; it yields tuples of a loop index and an element. Note that the loop index has no relation to the set: a set has no concept of ordering, so the index is not an index into the set. It is just a loop counter.
    for i, v in enumerate(a):
      print i, v
    # prints (from a separate run, so the set differs from the one shown above)
    0 1
    1 2
    2 3
    3 4
    4 5
    5 6
    6 8
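
Because a set has no defined order, a predictable traversal needs an explicit sort. A minimal sketch, assuming the elements can be compared with one another (the values are arbitrary):

```python
a = {8, 2, 9, 0, 5, 7}

# sorted() returns a new list; the set itself stays unordered.
ordered = sorted(a)
print(ordered)    # [0, 2, 5, 7, 8, 9]

# Here the counter is a real position, but in the sorted list, not the set.
pairs = list(enumerate(ordered))
print(pairs[:2])  # [(0, 0), (1, 2)]
```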
    

Conclusion

And that’s it for now with sets. We learnt how to create sets using the brace notation as well as the set() constructor, and then covered the most commonly used methods, the set operations, and iteration.
