Example

Counting element occurrences

#import the Counter class from collections module
from collections import Counter

#An iterable with elements to count
data = 'aabbbccccdeefff'

#create the Counter object
c = Counter(data)

print(c)

#get the count of a specific element
print(c['f'])
Output:
Counter({'c': 4, 'b': 3, 'f': 3, 'a': 2, 'e': 2, 'd': 1})
3

 

The collections module in the standard library provides the Counter class, a container that stores elements as dictionary keys and their counts as dictionary values. The class is a subclass of dict, and its objects share much of their functionality with standard dictionaries.

Example
from collections import Counter

print(issubclass(Counter, dict))
Output:
True
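Because a Counter is a dict, the usual dictionary operations apply to it. The following is a minimal sketch with made-up sample data; the variable names are only illustrative.

```python
from collections import Counter

c = Counter('aabbbc')

# Ordinary dict operations work on a Counter
distinct = len(c)            # number of distinct elements
has_a = 'a' in c             # membership test on the keys
keys = list(c.keys())        # keys in the order they were first seen
total = sum(c.values())      # total number of counted items

print(distinct, has_a, keys, total)

# Iterate over (element, count) pairs just like a regular dict
for element, count in c.items():
    print(element, count)
```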

Counter objects are typically used when you need to keep track of how many times each element appears in an iterable such as a list, tuple, or string, for example when analyzing a dataset.

Instantiating Counter objects

Counter objects can be initialized either by passing in an iterable of data, such as a list or tuple, or by setting the initial counts directly with a mapping or keyword arguments.

Syntax:
Counter(iterable = None, **kwargs)

The class only works with hashable elements. If the iterable contains any unhashable element, a TypeError is raised.
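The hashability requirement can be illustrated with a short sketch. The sample data and the tuple conversion are assumptions made for the example; converting to tuples is one possible workaround, not the only one.

```python
from collections import Counter

rows = [[1, 2], [3, 4], [1, 2]]   # the inner lists are unhashable

raised = False
try:
    Counter(rows)
except TypeError:
    raised = True
    print('TypeError: list elements cannot be counted')

# Converting each inner list to a tuple makes the elements hashable
hashable_rows = [tuple(row) for row in rows]
c = Counter(hashable_rows)
print(c)   # Counter({(1, 2): 2, (3, 4): 1})
```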

Example

Instantiate a Counter object from an iterable object.

from collections import Counter

data = ['Java', 'Python', 'Java', 'C++', 'Python', 'PHP', 'C++', 'Java', 'PHP', 'Java', 'Python']

c = Counter(data)

print(c)
Output:
Counter({'Java': 4, 'Python': 3, 'C++': 2, 'PHP': 2})

Alternatively, we can set the initial counts at instantiation by passing a dictionary, as shown below.

Example
from collections import Counter

c = Counter({'Python': 10, 'Java': 8, 'C++': 12, 'PHP': 9})

print(c)
Output:
Counter({'C++': 12, 'Python': 10, 'PHP': 9, 'Java': 8})
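The **kwargs part of the syntax means that initial counts can also be given as keyword arguments. A small sketch with illustrative values follows; note that this form only works for keys that are valid Python identifiers, so a key such as 'C++' would still need the dictionary form.

```python
from collections import Counter

# Each keyword becomes a key, and its value becomes the initial count
c = Counter(Python=10, Java=8, PHP=9)

print(c)   # Counter({'Python': 10, 'PHP': 9, 'Java': 8})
```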
Updating the dataset

The update() method adds to the counts after the counter has been instantiated.

Example

Update the counts after instantiation

from collections import Counter

c = Counter()

data = ['Python', 'C++', 'Java', 'Python', 'C++', 'Python']
#update the counter
c.update(data)

print(c)
Output:
Counter({'Python': 3, 'C++': 2, 'Java': 1})
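Unlike dict.update(), which replaces existing values, Counter.update() adds to the existing counts. The sketch below, using made-up data, shows the three input forms it accepts.

```python
from collections import Counter

c = Counter(['Python', 'C++'])

c.update(['Python', 'Java'])   # from an iterable: each item adds 1
c.update({'C++': 5})           # from a mapping: the given counts are added
c.update(Java=2)               # from keyword arguments

print(c)   # Counter({'C++': 6, 'Java': 3, 'Python': 2})
```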

Counter methods

Since the Counter class is a subclass of the dict class, it inherits nearly all of the methods of the dict class (fromkeys() is not implemented for Counter objects). It also defines other useful methods, which include:

Method Description
most_common(n) Returns a list of tuples showing the n most commonly encountered elements and their respective counts.
elements() Returns an iterator over the elements of the Counter, repeating each element as many times as its count.
subtract(iterable = None, **kwargs) Subtracts counts of elements in the given iterable or mapping from the current counts.
update(iterable = None, **kwargs) Adds counts of elements from the given iterable or mapping to the current counts.
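As a brief illustration of elements() and subtract(), consider the following sketch with sample counts chosen for the example. Note that subtract() works in place and can leave zero or negative counts behind.

```python
from collections import Counter

c = Counter({'Python': 3, 'Java': 1, 'C++': 2})

# elements() repeats each key as many times as its count
expanded = list(c.elements())
print(expanded)
# ['Python', 'Python', 'Python', 'Java', 'C++', 'C++']

# subtract() modifies the Counter in place and returns None
c.subtract(['Python', 'Java', 'Java'])
print(c)   # Counter({'Python': 2, 'C++': 2, 'Java': -1})
```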

Get the n most common elements

The most_common() method returns the n most common elements, where n is the given integer argument. It returns a list of tuples of the form (item, frequency) in descending order of frequency. If n is omitted or None, all elements are returned.

Syntax:
most_common(n = None)

Example
from collections import Counter

data = ['Java', 'Python', 'Java', 'C++', 'Python', 'PHP', 'C++', 'Java', 'PHP', 'Java', 'Python']

c = Counter(data)

print(c.most_common(3))
Output:
[('Java', 4), ('Python', 3), ('C++', 2)]

In the above example, the most_common() method is used to get the 3 most common elements from the list. 
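Calling most_common() without an argument returns every element, which also gives a way to get the least common ones by slicing from the end. This is a sketch on the same sample list; the variable names are illustrative.

```python
from collections import Counter

data = ['Java', 'Python', 'Java', 'C++', 'Python', 'PHP', 'C++', 'Java', 'PHP', 'Java', 'Python']
c = Counter(data)

# No argument: all elements, most common first
everything = c.most_common()
print(everything)
# [('Java', 4), ('Python', 3), ('C++', 2), ('PHP', 2)]

# Reverse slice: the two least common elements
least_two = c.most_common()[:-3:-1]
print(least_two)
# [('PHP', 2), ('C++', 2)]
```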

Accessing Counts

Since Counter objects are dictionaries, we can use the usual dictionary syntax to access the count of a particular element, i.e. counter['x'].

Example
from collections import Counter

data = ['Java', 'Python', 'Java', 'C++', 'Python', 'PHP', 'C++', 'Java', 'PHP', 'Java', 'Python']

c = Counter(data)

print(c['Java'])
Output:
4
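One difference from a plain dict is worth noting: indexing a Counter with an element it has never seen returns 0 instead of raising a KeyError, and the lookup does not add the key. A minimal sketch with sample data:

```python
from collections import Counter

c = Counter(['Java', 'Python', 'Java'])

missing_count = c['Ruby']      # no KeyError for an unseen element
still_absent = 'Ruby' in c     # the lookup did not insert the key

print(missing_count)   # 0
print(still_absent)    # False
```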

We can also use the get() method for the same purpose.

Example
from collections import Counter

data = ['Java', 'Python', 'Java', 'C++', 'Python', 'PHP', 'C++', 'Java', 'PHP', 'Java', 'Python']

c = Counter(data)

print(c.get('Python'))
Output:
3
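For an element that is not present, get() behaves as it does on any dict and returns None unless a default is supplied. The sketch below uses sample data to show both cases.

```python
from collections import Counter

c = Counter(['Java', 'Python', 'Java'])

without_default = c.get('Ruby')      # plain dict behavior
with_default = c.get('Ruby', 0)      # an explicit default

print(without_default)   # None
print(with_default)      # 0
```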

Operations on Counter objects

Counter objects support various arithmetic and set operations. Some of these operations are:

Operation Description
c1 + c2 Adds two Counter objects, resulting in a new Counter with the sum of the counts of each element.
c1 - c2 Subtracts the counts in c2 from the counts in c1, resulting in a new Counter that keeps only the elements whose resulting count is positive.
c1 & c2 The intersection of two Counters keeps only the elements that appear in both Counters. The count of each element is the minimum of its counts in c1 and c2.
c1 | c2 The union of two Counters results in a new Counter with all elements of both c1 and c2. The count of each element is the maximum of its counts in c1 and c2.

Example

Addition

from collections import Counter

data1 = ['Java', 'Python', 'Java', 'C++', 'Python', 'PHP', 'C++', 'Java', 'PHP', 'Java', 'Python']
data2 = ['PHP', 'Python', 'Java', 'Python']

c1 = Counter(data1)
c2 = Counter(data2)

c3 = c1 + c2 
print(c3)
Output:
Counter({'Java': 5, 'Python': 5, 'PHP': 3, 'C++': 2})
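Subtraction with the - operator can be sketched on the same two lists. The result is a new Counter, and any element whose count would drop to zero or below is left out of it.

```python
from collections import Counter

data1 = ['Java', 'Python', 'Java', 'C++', 'Python', 'PHP', 'C++', 'Java', 'PHP', 'Java', 'Python']
data2 = ['PHP', 'Python', 'Java', 'Python']

c1 = Counter(data1)
c2 = Counter(data2)

difference = c1 - c2
print(difference)   # Counter({'Java': 3, 'C++': 2, 'Python': 1, 'PHP': 1})

# Every count in c2 is smaller than in c1, so nothing positive remains
reverse_difference = c2 - c1
print(reverse_difference)   # Counter()
```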
Example

Union

from collections import Counter

data1 = ['Java', 'Python', 'Java', 'C++', 'Python', 'PHP', 'C++', 'Java', 'PHP', 'Java', 'Python']
data2 = ['PHP', 'Python', 'Java', 'Python']

c1 = Counter(data1)
c2 = Counter(data2)

c3 = c1 | c2 
print(c3)
Output:
Counter({'Java': 4, 'Python': 3, 'C++': 2, 'PHP': 2})
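For comparison, the & operator on the same two Counters gives the intersection: only elements present in both are kept, each with the smaller of its two counts. A short sketch:

```python
from collections import Counter

data1 = ['Java', 'Python', 'Java', 'C++', 'Python', 'PHP', 'C++', 'Java', 'PHP', 'Java', 'Python']
data2 = ['PHP', 'Python', 'Java', 'Python']

c1 = Counter(data1)
c2 = Counter(data2)

# 'C++' is missing from c2, so it does not appear in the result
common = c1 & c2
print(common)   # Counter({'Python': 2, 'Java': 1, 'PHP': 1})
```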