“Laughter is poison to fear.” ― George R.R. Martin, A Game of Thrones
Python provides a module called itertools which, as the name suggests, offers a bunch of conveniences for iterating and looping. While you could spend your entire Python career without ever touching this module, trust me when I say your life will be enriched if you at least know what is available in itertools.
Have you ever wanted to start a count and keep counting without an end point in sight? If so, the count() function is for you. It accepts a value to start counting from and an optional increment (the default is 1). The following loop counts starting from 10 with an increment of 20: 10, 30, 50, and so on.
Note that since there is no end condition for the loop, it runs till the end of time. Python integers are arbitrary-precision, so not even an integer overflow will stop it.
    import itertools

    for x in itertools.count(10, 20):
        print(x)
Let us add a condition to stop it somewhere. The following counts till the counter crosses 1000.
    for x in itertools.count(10, 20):
        print(x)
        if x > 1000:
            break
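Another way to bound count() is to pair it with a finite iterable. Here is a minimal sketch, assuming Python 3, where zip() is lazy and stops as soon as its shortest input runs out. The tasks list and the numbered name are made up for illustration.

```python
import itertools

# Hypothetical data, just for illustration.
tasks = ['build', 'test', 'deploy']

# zip() stops when `tasks` is exhausted, so the endless counter is harmless.
numbered = list(zip(itertools.count(10, 20), tasks))

for number, task in numbered:
    print(number, task)
# prints
# 10 build
# 30 test
# 50 deploy
```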
Instead of just counting, what about repeating? We have you covered with the cycle() function.
A simple example, cycle over some letters:
    for x in itertools.cycle('ABCD'):
        print(x, end=' ')
    # prints A B C D A B C D A B C D A B C ...
The argument can be a list too:
    for x in itertools.cycle([1, 2, 4, 8]):
        print(x, end=' ')
    # prints 1 2 4 8 1 2 4 8 1 2 4 8 1 2 4 ...
The object returned by the cycle() function is an iterator, so you can get the next element in the cycle by passing it to the built-in next() function.

    g = itertools.cycle([1, 2, 4, 8])
    print(next(g))
    print(next(g))
    print(next(g))
    print(next(g))
    print(next(g))
    print(next(g))
    print(next(g))
    # prints 1, 2, 4, 8, 1, 2, 4 (one per line)
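Pulling values out of a cycle one at a time is handy for round-robin assignment. The sketch below assumes Python 3; the worker and job names are hypothetical and only there to show the wrap-around.

```python
import itertools

# Hypothetical worker and job names, for illustration only.
workers = itertools.cycle(['alice', 'bob', 'carol'])
jobs = ['job1', 'job2', 'job3', 'job4', 'job5']

# Each next() call hands back the next worker, wrapping around at the end.
assignments = [(job, next(workers)) for job in jobs]

for job, worker in assignments:
    print(job, '->', worker)
# prints
# job1 -> alice
# job2 -> bob
# job3 -> carol
# job4 -> alice
# job5 -> bob
```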
Instead of cycling through a list or something, what if you want to repeat the same thing over and over again? In this case, you can use the repeat() function.
    for x in itertools.repeat('abc', 10):
        print(x, end=' ')
    # prints abc abc abc abc abc abc abc abc abc abc
You might be thinking, “Big Freaking Deal! I can do that without a fancy function.”
    print('abc ' * 10)

However, the difference is that 'abc ' * 10 builds the entire repeated result in memory up front. Imagine doing this with a large object or a very large repeat count. Your computer might start having memory problems.
On the other hand, repeat() returns an iterator that hands back the same object each time you ask for the next value. So no copies are made and no extra storage is involved.
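To make the "no copies" point concrete, here is a small sketch assuming Python 3. The big list is a hypothetical stand-in for whatever you want to repeat. The second half shows a common use of repeat() without a count: feeding a constant argument to map(), which stops when its shortest input runs out.

```python
import itertools

# A hypothetical "large" object standing in for whatever you want to repeat.
big = list(range(1000))

# repeat() yields the very same object each time; nothing is copied.
same = all(item is big for item in itertools.repeat(big, 5))
print(same)  # prints True

# Supplying a constant second argument to pow() for every element.
squares = list(map(pow, range(5), itertools.repeat(2)))
print(squares)  # prints [0, 1, 4, 9, 16]
```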
The next stop on the itertools tour is the chain() function. It accepts any number of iterables (such as lists, strings, iterators, etc.) and returns their elements one after another, moving on to the next iterable when the current one is exhausted. Think of it as a cat or concat function for iterables.

    for x in itertools.chain('abc', '1234'):
        print(x, end=' ')
    # prints a b c 1 2 3 4
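The inputs to chain() do not have to be of the same type. A minimal sketch, assuming Python 3, that mixes a list, a generator expression and a string; the variable names are made up for illustration.

```python
import itertools

letters = ['a', 'b', 'c']
squares = (n * n for n in range(1, 4))  # a generator, consumed lazily

# chain() drains each input in turn: the list, then the generator, then the string.
combined = list(itertools.chain(letters, squares, 'xy'))
print(combined)  # prints ['a', 'b', 'c', 1, 4, 9, 'x', 'y']
```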
4.1. Walking a Directory Hierarchy
Here is a useful example of chain(): processing entries in multiple directories using os.walk().
As you might know, os.walk() accepts a directory and yields a tuple of (dirname, subdirs, files) for each directory in the hierarchy.
Consider for example the following hierarchy. We have a directory A with file entries a, b and c and sub-directories B and C. And so on.
    $ ls -R A
    A:
    a  b  B/  c  C/

    A/B:
    ba  bb

    A/C:
    ca
The following code will walk through this hierarchy.
    import os

    for x in os.walk('A'):
        print(x)
    # prints
    # ('A', ['B', 'C'], ['b', 'a', 'c'])
    # ('A/B', [], ['ba', 'bb'])
    # ('A/C', [], ['ca'])
That is all well and good, but what if you want to process multiple root directories this way? You have chain() to the rescue!
    for x in itertools.chain(os.walk('A'), os.walk('X')):
        print(x)
    # prints
    # ('A', ['B', 'C'], ['b', 'a', 'c'])
    # ('A/B', [], ['ba', 'bb'])
    # ('A/C', [], ['ca'])
    # ('X', ['Z', 'Y'], ['y', 'z'])
    # ('X/Z', [], ['zx', 'zy'])
    # ('X/Y', [], ['yy', 'yz', 'yx'])
That offers a pretty easy way of looking through multiple directories for, say, a config file.
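As a rough sketch of that config-file search, assuming Python 3: the find_file() helper and the app.cfg file name below are hypothetical, not part of itertools or os. The helper chains one os.walk() per root and returns the first match.

```python
import itertools
import os


def find_file(name, *roots):
    """Return the path of the first file called `name` under any root, or None."""
    # os.walk() is lazy, so nothing is scanned until the loop asks for entries.
    walks = [os.walk(root) for root in roots]
    for dirpath, dirnames, filenames in itertools.chain(*walks):
        if name in filenames:
            return os.path.join(dirpath, name)
    return None


# Hypothetical file name; prints None when no such file is found.
print(find_file('app.cfg', 'A', 'X'))
```

Because the search returns on the first hit, later roots are never walked once the file turns up in an earlier one.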
The Python itertools module offers a bunch of easy-to-use functions for common iteration tasks. Most of these return iterators, which allow you to play creatively with iterables. We have covered count, cycle, repeat and chain in this article.