The itertools module in the standard library provides a number of functions for working with and extending iterators. The main purpose of these functions is to allow efficient looping over collections of data through iterators.
Iterators are typically constructed using the iter() function and are used to loop over the items in an iterable, such as a list, tuple or dictionary. A for loop calls iter() on the iterable behind the scenes, so every loop already runs through an iterator. Iterators use lazy evaluation, which means they only move to the next element when requested. When the data is produced on demand, as with generators and the functions in itertools, this saves memory and time by fetching items as needed, rather than building the entire collection in memory at once.
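As a small illustration of lazy evaluation, the sketch below wraps a list in an iterator and pulls items one at a time with next(). The variable names are chosen here purely for illustration.

```python
numbers = [10, 20, 30]

# iter() returns an iterator over the list; nothing has been consumed yet
it = iter(numbers)

# Each call to next() advances the iterator by exactly one item
first = next(it)
second = next(it)

# Whatever has not been requested yet is still waiting inside the iterator
remaining = list(it)

print(first, second, remaining)  # 10 20 [30]
```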
To use the various itertools functions, the module needs to be imported into our program.
Use a function from the itertools module
# import the module
import itertools
def add(a, b):
    return a + b
L = [1, 2, 3, 4, 5]
# use a function: accumulate() yields the running totals of L
for i in itertools.accumulate(L, add):
    print(i)
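The second argument to accumulate() is optional. As a rough sketch, assuming the same kind of input list, the variations below show the default behaviour (addition) and two other two-argument functions. The variable names are illustrative only.

```python
from itertools import accumulate
import operator

values = [1, 2, 3, 4, 5]

# With no function given, accumulate() produces running sums
running_sum = list(accumulate(values))

# Any function of two arguments can be used instead, e.g. multiplication
running_product = list(accumulate(values, operator.mul))

# The builtin max() gives a running maximum
running_max = list(accumulate([3, 1, 4, 1, 5], max))

print(running_sum)      # [1, 3, 6, 10, 15]
print(running_product)  # [1, 2, 6, 24, 120]
print(running_max)      # [3, 3, 4, 4, 5]
```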
Merge and split iterators
chain()
When looping over the elements of multiple iterables (such as two lists), one might join the iterables before looping over them.
L1 = [1, 2, 3, 4]
L2 = [10, 20, 30, 40]
for i in L1 + L2:
    print(i)
The above approach is somewhat inefficient since it creates a larger temporary list. The itertools.chain() function can be used to implement the same logic more efficiently.
The chain() function iterates over the elements of multiple iterables without having to join them first.
chain(*iterables)
from itertools import chain
L1 = [1, 2, 3, 4]
L2 = [10, 20, 30, 40]
for i in chain(L1, L2):
    print(i)
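When the iterables are themselves stored inside another iterable, the alternative constructor chain.from_iterable() may be more convenient. The following is a minimal sketch; the sample data and variable names are made up for illustration.

```python
from itertools import chain

# chain() accepts iterables of different types
mixed = list(chain([1, 2], (3,), "ab"))

# chain.from_iterable() takes one iterable of iterables and flattens one level
nested = [[1, 2], [3, 4], [5]]
flat = list(chain.from_iterable(nested))

print(mixed)  # [1, 2, 3, 'a', 'b']
print(flat)   # [1, 2, 3, 4, 5]
```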
islice()
The islice() function accepts arguments for limiting the number of elements to be returned. It works like slicing a list, but it accepts any iterable object rather than just sequences, returns an iterator, and does not support negative values for start, stop or step.
islice(iterable, stop)
islice(iterable, start, stop, step = 1)
The iterable and stop arguments are required. If start is given, all elements preceding it are skipped. step determines the spacing between returned elements, so step - 1 values are skipped between successive items.
import itertools
my_range = range(20)
islice_result = itertools.islice(my_range, 0, 15, 2)
for i in islice_result:
    print(i)
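Because islice() works on any iterable, it can also take a bounded slice out of an endless generator, which ordinary slicing cannot do. The sketch below assumes a hypothetical generator named squares(), written here only for the example.

```python
from itertools import islice

def squares():
    """Hypothetical endless generator: yields 0, 1, 4, 9, 16, ..."""
    n = 0
    while True:
        yield n * n
        n += 1

# Only a stop value: take the first five items
first_five = list(islice(squares(), 5))

# start=2, stop=8, step=3: take the items at positions 2 and 5
spaced = list(islice(squares(), 2, 8, 3))

print(first_five)  # [0, 1, 4, 9, 16]
print(spaced)      # [4, 25]
```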
tee()
The tee() function creates multiple independent iterators from a single iterable. This is especially useful when you need to make multiple passes over the same data but the source can only be consumed once.
tee(iterable, n=2)
The function returns a tuple of n iterators over the iterable. The iterators are independent, so advancing one does not affect the others. Once tee() has been called, the original iterator should not be used directly, otherwise the returned iterators may miss items.
import itertools
data = ['Python', 'Java', 'PHP', 'C++', 'Ruby']
iter1, iter2, iter3 = itertools.tee(data, 3)
print(*iter1)
print(*iter2)
print(*iter3)
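One common use of tee() is to walk over the same one-shot iterator at two different offsets. The sketch below pairs each item with its successor; the readings data is invented for illustration.

```python
from itertools import tee

# A one-shot iterator: it can only be consumed once
readings = iter([3, 5, 4, 8])

a, b = tee(readings)

# Advance only the second iterator; the first is unaffected
next(b, None)

# Pair each item with the one that follows it
pairs = list(zip(a, b))

print(pairs)  # [(3, 5), (5, 4), (4, 8)]
```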
Transforming and creating iterator elements
starmap()
The starmap() function applies a function to each element of an iterable, unpacking the element into positional arguments with the unpacking operator (*). It is similar to the builtin map(), except that it is meant for data whose arguments are already grouped into tuples. The tuples may have varying lengths, as long as the function accepts that number of arguments.
starmap(function, iterable)
from itertools import starmap
data = [(1, 2), (3, 4), (5, 6), (7, 8)]
for i in starmap(lambda a, b: f"{a} + {b} = {a + b}", data):
    print(i)
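To make the difference from map() concrete, the sketch below computes the same result both ways, using the builtin pow() and made-up sample pairs.

```python
from itertools import starmap

pairs = [(2, 3), (3, 2), (10, 2)]

# starmap() unpacks each tuple: pow(2, 3), pow(3, 2), pow(10, 2)
powers = list(starmap(pow, pairs))

# With map(), the unpacking has to be written out by hand
same_with_map = list(map(lambda p: pow(*p), pairs))

print(powers)  # [8, 9, 100]
```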
count()
The count() function returns an endless iterator of evenly spaced numbers, beginning at start and increasing by step on each call.
count(start=0, step=1)
from itertools import count
numbers = count(0, 100)
print(next(numbers))
print(next(numbers))
print(next(numbers))
print(next(numbers))
print(next(numbers))
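Since count() never ends on its own, it is usually combined with something that does. The sketch below, using illustrative names, numbers a list with zip() and also shows a negative step.

```python
from itertools import count

languages = ['Python', 'Java', 'C++']

# zip() stops at the shortest input, so the endless counter is safe here
numbered = list(zip(count(1), languages))

# A negative step counts downwards
countdown = count(10, -2)
first_three = [next(countdown) for _ in range(3)]

print(numbered)     # [(1, 'Python'), (2, 'Java'), (3, 'C++')]
print(first_three)  # [10, 8, 6]
```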
cycle()
The cycle() function creates an iterator that goes over the elements of an iterable repeatedly and indefinitely. After the whole sequence is exhausted, the iterator starts again from the beginning.
cycle(iterable)
from itertools import cycle
data = ('Python', 'Java', 'C++')
data_cycle = cycle(data)
print(next(data_cycle))
print(next(data_cycle))
print(next(data_cycle))
print(next(data_cycle))
print(next(data_cycle))
print(next(data_cycle))
print(next(data_cycle))
Note that the cycle object has to keep a copy of every element of the iterable. This may lead to high memory consumption when the iterable is large.
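A typical use of cycle() is handing out items in round-robin fashion. The sketch below assigns hypothetical tasks to hypothetical workers; zip() ends the loop once the tasks run out.

```python
from itertools import cycle

tasks = ['t1', 't2', 't3', 't4', 't5']
workers = ['A', 'B']

# The workers repeat for as long as there are tasks left
assignment = list(zip(tasks, cycle(workers)))

print(assignment)
# [('t1', 'A'), ('t2', 'B'), ('t3', 'A'), ('t4', 'B'), ('t5', 'A')]
```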
repeat()
The repeat() function creates an iterator that produces the same value over and over again, either endlessly or for a specified number of times.
repeat(obj, times = None)
If the times argument is not given, the object will be returned endlessly.
from itertools import repeat
obj = repeat("Pynerds", 5)
print(next(obj))
print(next(obj))
print(next(obj))
print(next(obj))
print(next(obj))
print(next(obj))  # sixth call: raises StopIteration, since only 5 values are produced
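repeat() is often used to supply a constant argument to map() or zip(). The following is a small sketch with illustrative names; when times is omitted, the other input determines where the result ends.

```python
from itertools import repeat

# The endless repeat(2) supplies the exponent; range(5) ends the mapping
squares = list(map(pow, range(5), repeat(2)))

# With times given, the iterator is finite and can be turned into a list
zeros = list(repeat(0, 3))

print(squares)  # [0, 1, 4, 9, 16]
print(zeros)    # [0, 0, 0]
```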
Filtering iterator elements
dropwhile()
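The dropwhile() function creates an iterator that skips elements of the iterable for as long as the given predicate function evaluates to True. Starting from the first element for which the predicate evaluates to False, every remaining element is returned and the predicate is not checked again.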
dropwhile(predicate, iterable)
import itertools
data = range(10)
for i in itertools.dropwhile(lambda x: x < 5, data):
    print(i)
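With sorted input such as range(10), dropwhile() can look like a filter. The sketch below uses unsorted, made-up data to show that only the leading run of matching elements is dropped.

```python
from itertools import dropwhile

data = [1, 2, 7, 3, 9, 1]

# 1 and 2 are dropped; once 7 fails the test, everything after it is kept,
# including the later 3 and 1 that are also smaller than 5
result = list(dropwhile(lambda x: x < 5, data))

print(result)  # [7, 3, 9, 1]
```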
takewhile()
The takewhile() function is the opposite of dropwhile(). It creates an iterator that produces elements as long as the given predicate function evaluates to True, and stops at the first element for which it evaluates to False.
takewhile(predicate, iterable)
import itertools
data = range(10)
for i in itertools.takewhile(lambda x: x < 5, data):
    print(i)
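Because takewhile() stops at the first failing element, it can put a bound on an endless iterator. A minimal sketch, combining it with count() and a generator expression chosen for illustration:

```python
from itertools import count, takewhile

# Squares of 0, 1, 2, ... produced lazily and without end
all_squares = (n * n for n in count())

# Stop as soon as a square reaches 50
small_squares = list(takewhile(lambda x: x < 50, all_squares))

print(small_squares)  # [0, 1, 4, 9, 16, 25, 36, 49]
```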
filterfalse()
The filterfalse() function is the opposite of the builtin filter() function. It returns only those elements of an iterable for which the given function evaluates to False.
filterfalse(func, iterable)
If None is passed as the func argument, the elements are filtered by their own boolean value, so only the falsy elements are returned.
from itertools import filterfalse
def even(x):
    return not x % 2
data = range(10)
# get the odd numbers
for i in filterfalse(even, data):
    print(i)
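The case where None is passed instead of a function can be sketched as follows, with sample values invented for the example. Each element is then tested by its own truth value.

```python
from itertools import filterfalse

values = [0, 1, '', 'a', None, [], [0]]

# With None as the function, only the falsy elements are kept
falsy = list(filterfalse(None, values))

# The builtin filter() keeps the complementary set
truthy = list(filter(None, values))

print(falsy)   # [0, '', None, []]
print(truthy)  # [1, 'a', [0]]
```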
Itertools functions cheat sheet
function | usage |
---|---|
chain(*iterables) | Takes any number of iterables as input and returns a single iterator over the elements of all of them, in order. |
compress(iterable, selectors) | Returns an iterator containing only the elements of the iterable whose corresponding selector is true. |
dropwhile(predicate, iterable) | Skips elements while the predicate evaluates to True, then returns every remaining element. |
filterfalse(func, iterable) | Returns only the elements for which func evaluates to False. It is the opposite of the builtin filter() function. |
groupby(iterable, key = None) | Returns an iterator of (key, group) pairs in which consecutive elements are grouped by the key function. If the key function is not specified, the element itself is used for grouping. |
islice(iterable, start, stop, step) | Returns an iterator over the elements of the iterable from position start up to, but not including, stop, taking every step-th element. |
permutations(iterable, r = None) | Returns an iterator over all permutations of length r of the elements in the iterable. If r is not given, full-length permutations are returned. |
product(*iterables, repeat = 1) | Generates the Cartesian product of the input iterables, useful for building every possible combination of their elements. |
repeat(obj, times = None) | An iterator that produces the given object over and over for the specified number of times. If the times argument is not specified, the object is returned endlessly. |
starmap(func, iterable) | Applies a function to each element of the iterable, unpacking the element into the function's arguments. |
takewhile(predicate, iterable) | Returns elements from the iterable as long as the predicate evaluates to True. |
zip_longest(*iterables, fillvalue = None) | Creates an iterator by combining multiple iterables element-wise, up to the length of the longest one, filling in missing values with fillvalue. |
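Several functions in the cheat sheet are not demonstrated earlier. The sketch below gives one short call for each of them, using made-up sample data. Note that groupby() only groups consecutive elements, so the input is assumed to be already sorted by the key.

```python
from itertools import compress, groupby, permutations, product, zip_longest

# compress(): keep the elements whose selector is true
selected = list(compress('ABCD', [1, 0, 1, 0]))

# groupby(): group consecutive words by their first letter (input is sorted)
words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']
grouped = {key: list(group) for key, group in groupby(words, key=lambda w: w[0])}

# permutations(): all ordered arrangements of length 2
perms = list(permutations('abc', 2))

# product(): Cartesian product of [0, 1] with itself
bits = list(product([0, 1], repeat=2))

# zip_longest(): pad the shorter input with fillvalue
padded = list(zip_longest([1, 2, 3], 'ab', fillvalue='-'))

print(selected)  # ['A', 'C']
print(grouped)   # {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}
print(perms)     # [('a', 'b'), ('a', 'c'), ('b', 'a'), ('b', 'c'), ('c', 'a'), ('c', 'b')]
print(bits)      # [(0, 0), (0, 1), (1, 0), (1, 1)]
print(padded)    # [(1, 'a'), (2, 'b'), (3, '-')]
```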