1. Introduction
A set is a collection of unique elements. A common use is to eliminate duplicate elements from a list. In addition, a set supports operations such as union, intersection, and difference.
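As a minimal sketch of the de-duplication use case mentioned above: the list `names` is made up for illustration, and `print()` is written with parentheses so the snippet runs on both Python 2.7 and Python 3.

```python
# Hypothetical input list containing duplicates.
names = ['ann', 'bob', 'ann', 'cid', 'bob']

# Passing the list through set() drops the duplicates.
# A set has no ordering, so sort the result to get a stable list back.
unique_names = sorted(set(names))
print(unique_names)  # ['ann', 'bob', 'cid']
```

Note that the original order of the list is lost along the way; sorting is just one way to get a predictable order back.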
2. Creating a Set
Brace Construction: Creating a set looks similar to creating a dictionary: you enclose a bunch of items within braces.
    s = {1, 2, 3, 3, 4}
    print s
    # prints
    set([1, 2, 3, 4])

Notice that the set contains unique elements only, even though we put duplicates into it.
A set need not contain elements of the same type. You can mix and match element types as you like.
    # named "mixed" rather than "set" to avoid shadowing the built-in set
    mixed = {1, 2, 'hello', 4, 'world'}
    print mixed
    # prints
    set(['world', 2, 4, 'hello', 1])
Set Comprehension: Similar to dictionaries and lists, you can use set comprehension as in the following example of a set of squares.
    a = {x*x for x in xrange(10)}
    print a
    # prints
    set([0, 1, 4, 81, 64, 9, 16, 49, 25, 36])
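One caveat with brace construction that the examples above do not show: empty braces create a dictionary, not a set. The short sketch below (variable names are illustrative) uses the `set()` constructor, covered in the next section, to build an empty set instead. It runs on both Python 2.7 and Python 3.

```python
empty_braces = {}   # empty braces give a dict, not a set
empty_set = set()   # the set() constructor gives an empty set

print(type(empty_braces) is dict)  # True
print(type(empty_set) is set)      # True
print(len(empty_set))              # 0
```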
3. Using the set() constructor
Create a set from a list using the set() constructor.

    a = [1, 2, 2, 3]
    print set(a)
    # prints
    set([1, 2, 3])
How about creating a set of characters comprising a string? This shortcut will work.
    print set('abcd')
    # prints
    set(['a', 'c', 'b', 'd'])
The constructor also helps when you need a set of unique random numbers from a list that may contain repeats.

    import random

    a = [random.randint(0, 10) for x in xrange(10)]
    print a
    print set(a)
    # prints
    [10, 2, 3, 3, 6, 6, 4, 9, 5, 0]
    set([0, 2, 3, 4, 5, 6, 9, 10])
4. Methods of set
The following sections explain the most commonly used methods of sets.
4.1. Membership Testing
The boolean expressions elem in a and elem not in a allow checking for membership of a set.

    a = {'apple', 'orange', 'banana', 'melon', 'mango'}
    print a
    print 'banana' in a
    print 'papaya' in a
    # prints
    set(['melon', 'orange', 'mango', 'banana', 'apple'])
    True
    False
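The example above only exercises `in`. As a small sketch of the `not in` form, here is a guard that adds an element only when it is missing. The set `fruits` is made up for illustration, and the snippet runs on both Python 2.7 and Python 3.

```python
fruits = {'apple', 'orange', 'banana'}

print('papaya' not in fruits)  # True
print('apple' not in fruits)   # False

# A typical use of "not in": act only when the element is absent.
if 'papaya' not in fruits:
    fruits.add('papaya')

print('papaya' in fruits)      # True
```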
4.2. Set Size
You can obtain the size of a set (the number of elements) using the len() function.

    a = {'apple', 'orange', 'banana', 'melon', 'mango'}
    print a
    print 'size of a:', len(a)
    # prints
    set(['melon', 'orange', 'mango', 'banana', 'apple'])
    size of a: 5
4.3. Adding Elements to a Set
Use the add() method to add an element to the set. If the element does not exist, it is added. No errors are raised if the element does exist, though.

    a = [random.randint(0, 10) for x in xrange(10)]
    print 'list =>', a
    s = set(a)
    print 'set =>', s
    s.add(10)
    print 'after add =>', s
    # prints
    list => [3, 4, 7, 2, 8, 0, 4, 1, 0, 4]
    set => set([0, 1, 2, 3, 4, 7, 8])
    after add => set([0, 1, 2, 3, 4, 7, 8, 10])
The add() method accepts only a single argument, so to add multiple elements you need either a loop or the update() method, which accepts an iterable.
You cannot add a list to a set, since a list is mutable and therefore cannot be hashed.
    s.add(10)
    print 'after add =>', s
    s.add([21, 22])
    print s
    # prints
    TypeError Traceback (most recent call last)
    in ()
          5 s.add(10)
          6 print 'after add =>', s
    ----> 7 s.add([21, 22])
          8 print s
    TypeError: unhashable type: 'list'
However, a tuple can be added since it is not mutable and hence hashable.

    s.add((21, 22))
    print s
    # prints
    set([0, 3, 4, 5, 6, 7, (21, 22), 9, 10])
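Besides looping over add(), the built-in `update()` method adds every element of an iterable in one call. The sketch below shows both approaches side by side. The values are made up for illustration, and the snippet runs on both Python 2.7 and Python 3.

```python
s = {1, 2, 3}

# Option 1: call add() once per element.
for item in [4, 5]:
    s.add(item)

# Option 2: update() takes an iterable and adds each of its elements.
# Elements already present (5 here) are simply ignored.
s.update([5, 6, 7])

print(sorted(s))  # [1, 2, 3, 4, 5, 6, 7]
```

Unlike `s.add([5, 6, 7])`, which fails because a list is unhashable, `s.update([5, 6, 7])` works because the list is only iterated over, never stored.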
4.4. Removing Elements from a Set
Remove a single element from a set using remove().

    a = [random.randint(0, 10) for x in xrange(10)]
    print 'list =>', a
    s = set(a)
    print 'set =>', s
    s.remove(10)
    print 'after remove =>', s
    # prints
    list => [6, 6, 7, 6, 7, 5, 10, 3, 8, 3]
    set => set([3, 5, 6, 7, 8, 10])
    after remove => set([3, 5, 6, 7, 8])
A KeyError is raised if the element is not in the set. (Running the same code as above a couple of times generates a random sequence without 10 in the set.)

    # prints
    list => [0, 4, 4, 4, 6, 6, 9, 5, 9, 6]
    set => set([0, 9, 4, 5, 6])
    KeyError Traceback (most recent call last)
    in ()
          3 s = set(a)
          4 print 'set =>', s
    ----> 5 s.remove(10)
          6 print 'after remove =>', s
    KeyError: 10
Need to remove an element from a set without the pesky KeyError? Use discard().

    print s
    s.remove(0)    # 0 is present, so remove() succeeds
    s.discard(20)  # 20 is absent; discard() silently does nothing
    print s
    # prints
    set([0, 2, 3, 4, 8, 9])
    set([2, 3, 4, 8, 9])
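To make the contrast between remove() and discard() explicit, here is a small sketch that tries both on an element that is not in the set. The flag `raised` is an illustrative name, and the snippet runs on both Python 2.7 and Python 3.

```python
s = {2, 3, 4}

# remove() raises KeyError for a missing element, so guard it.
try:
    s.remove(20)
    raised = False
except KeyError:
    raised = True
print(raised)     # True

# discard() does nothing for a missing element.
s.discard(20)
print(sorted(s))  # [2, 3, 4]
```

A reasonable rule of thumb: use remove() when a missing element indicates a bug you want to hear about, and discard() when absence is fine.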
Remove all elements from a set? Use clear().

    print s
    s.clear()
    print s
    # prints
    set([2, 3, 4, 8, 9])
    set([])
5. Set Operations
Let us now look at the set operations that a set supports.
5.1. Disjoint Sets
A set is disjoint with another set if the two have no common elements. The method isdisjoint() returns True or False as appropriate.

    print set([0, 3, 6]).isdisjoint(set([9, 10, 5, 7]))
    # prints
    True
Another example:
    print set([0, 1, 2, 3, 4]).isdisjoint(set([8, 1]))
    # prints
    False
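To see why the second example returns False, the sketch below collects the shared elements by hand and compares the result with isdisjoint(). The name `shared` is illustrative, and the snippet runs on both Python 2.7 and Python 3.

```python
a = {0, 1, 2, 3, 4}
b = {8, 1}

# Elements of a that also appear in b.
shared = [x for x in a if x in b]
print(shared)           # [1]

# The sets are disjoint exactly when there are no shared elements.
print(a.isdisjoint(b))  # False
print(a.isdisjoint(b) == (len(shared) == 0))  # True
```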
5.2. Checking for Subset and Superset
Check whether all elements of a set are contained in another set using the issubset() method. You can also use the boolean form setA <= setB.
Using the form setA < setB checks for setA being a proper subset of setB (that is, setB containing all elements from setA and then some more).
Need to check for a superset? Use issuperset() or setA >= setB, or setA > setB for a proper superset.

    a = set([1, 3, 4, 5])
    b = set([1, 3, 4, 5])
    c = set([1, 3, 4, 5, 6, 7])
    print 'a = ', a
    print 'b = ', b
    print 'c = ', c
    print 'a <= b', a <= b
    print 'a < b', a < b
    print 'issubset', a.issubset(b)
    print 'a < c', a < c
    # prints
    a =  set([1, 3, 4, 5])
    b =  set([1, 3, 4, 5])
    c =  set([1, 3, 4, 5, 6, 7])
    a <= b True
    a < b False
    issubset True
    a < c True
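The example above covers only the subset checks. A matching sketch for the superset forms, reusing the same three sets, might look like this (it runs on both Python 2.7 and Python 3):

```python
a = {1, 3, 4, 5}
b = {1, 3, 4, 5}
c = {1, 3, 4, 5, 6, 7}

print(c.issuperset(a))  # True
print(c >= a)           # True
print(c > a)            # True: c has extra elements, so it is a proper superset

print(b >= a)           # True: equal sets are supersets of each other
print(b > a)            # False: equal sets are never proper supersets
```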
5.3. Set Union
Compute the union of two or more sets using the union() method. A new set containing all elements of all sets is returned.
You can also use the pipe operator (|) as shown below.

    a = set([1, 2, 3])
    b = set([3, 4, 5, 6])
    c = set(list('abcd'))
    print a.union(b, c)
    print a | b | c
    # prints
    set(['a', 1, 2, 3, 4, 5, 6, 'b', 'c', 'd'])
    set(['a', 1, 2, 3, 4, 5, 6, 'b', 'c', 'd'])
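The method and the operator are not fully interchangeable: union() accepts any iterable, while | requires both operands to be sets. The sketch below illustrates the difference and also shows that the original set is left untouched. The names `merged` and `operator_ok` are illustrative, and the snippet runs on both Python 2.7 and Python 3.

```python
a = {1, 2, 3}

# union() accepts arbitrary iterables, here a list and a tuple.
merged = a.union([3, 4], (5, 6))
print(sorted(merged))  # [1, 2, 3, 4, 5, 6]

# union() returns a new set; a itself is unchanged.
print(sorted(a))       # [1, 2, 3]

# The | operator only works between sets, so a list raises TypeError.
try:
    a | [3, 4]
    operator_ok = True
except TypeError:
    operator_ok = False
print(operator_ok)     # False
```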
5.4. Set Intersection
How about identifying elements common to two or more sets? Use the intersection() method or the & operator.

    print a & b
    print a & b & c
    # prints
    set([3])
    set([])
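The example above uses only the & operator. A sketch of the equivalent intersection() method calls, with the sets from the union example redefined so the snippet is self-contained (it runs on both Python 2.7 and Python 3):

```python
a = {1, 2, 3}
b = {3, 4, 5, 6}
c = set('abcd')

# intersection() accepts one or more arguments.
print(sorted(a.intersection(b)))     # [3]
print(sorted(a.intersection(b, c)))  # []

# The method and the operator agree for set arguments.
print(a.intersection(b) == a & b)    # True
```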
5.5. Set Difference
Set difference returns a new set containing all elements of the first set that are not in any of the other sets. Use the difference() method or the - operator.

    print a - b
    print a - b - set([2])
    # prints
    set([1, 2])
    set([1])
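Unlike union and intersection, difference depends on the order of the operands. The sketch below contrasts `a - b` with `b - a` and shows the difference() method taking several arguments. The sets are redefined so the snippet is self-contained, and it runs on both Python 2.7 and Python 3.

```python
a = {1, 2, 3}
b = {3, 4, 5, 6}

print(sorted(a - b))  # [1, 2]    elements of a that are not in b
print(sorted(b - a))  # [4, 5, 6] elements of b that are not in a

# difference() with several arguments removes the elements of all of them.
print(sorted(a.difference(b, {2})))  # [1]
```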
6. Iterating Over Sets
There are several ways of iterating over sets; the most common ones are presented here.
- A set is an iterable and hence can be used in a for loop for iterating over the elements.

      a = set([random.randint(0, 10) for _ in xrange(10)])
      print a
      for x in a:
          print x
      # prints
      set([0, 2, 5, 7, 8, 9])
      0
      2
      5
      7
      8
      9

- The ever-present enumerate() function is available, which returns a tuple of loop index and the element. Note that the loop index does not have any correlation to the set; in other words, a set does not have a concept of any ordering, so the index is not an index into the set. It is just a loop counter.

      for i, v in enumerate(a):
          print i, v
      # prints (from a different random run than the set shown above)
      0 1
      1 2
      2 3
      3 4
      4 5
      5 6
      6 8
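Since a set has no ordering of its own, one option when a predictable order matters is to iterate over `sorted()` applied to the set. A minimal sketch follows. The values and the list `visited` are made up for illustration, and the snippet runs on both Python 2.7 and Python 3.

```python
a = {7, 0, 9, 2}

visited = []
# sorted() returns a new list of the set's elements in ascending order.
for x in sorted(a):
    visited.append(x)
    print(x)

# prints 0, 2, 7, 9 on separate lines
```

This works only when the elements can be compared with each other, so it is best suited to sets holding a single element type.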
Conclusion
And that’s it for now with sets. We learnt how to create sets using the brace notation as well as the set() constructor. Next up were the various commonly used methods and operations of sets.