The pickle module in the standard library provides tools for serializing Python objects into a series of bytes.

A serialized object can be stored, for example in a file, and later reconstructed into an equivalent Python object. This makes it possible to persist data across different Python processes.

To use the module, we first need to import it in our Python program:

Syntax:
import pickle

Basic Usage

The pickle.dumps() function is the basic tool for data serialization. It takes the object to be serialized as its argument and returns the equivalent sequence of bytes.

Syntax:
pickle.dumps(obj)
Example:

serialize a dictionary  

import pickle

data = {'Tokyo': 'Japan', 'Moscow': 'Russia', 'Manilla': 'Philippines'} # a dictionary

serialized_data = pickle.dumps(data) #serialize the dictionary
print(serialized_data)
Output:
b'\x80\x04\x95?\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x05Tokyo\x94\x8c\x05Japan\x94\x8c\x06Moscow\x94\x8c\x06Russia\x94\x8c\x07Manilla\x94\x8c\x0bPhilippines\x94u.'

To reconstruct/deserialize the serialized object into the equivalent Python object, we use the pickle.loads() function. The function takes the serialized data as the argument and then tries to reconstruct it into the equivalent object. It returns the reconstructed Python object.

Syntax:
pickle.loads(serialized_data)
Example:

Reconstruct a serialized object

import pickle

data = {'Tokyo': 'Japan', 'Moscow': 'Russia', 'Manilla': 'Philippines'} # a dictionary

serialized_data = pickle.dumps(data) #serialize the dictionary
print(serialized_data)

reconstructed_object = pickle.loads(serialized_data)
print(reconstructed_object)
Output:
b'\x80\x04\x95?\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x05Tokyo\x94\x8c\x05Japan\x94\x8c\x06Moscow\x94\x8c\x06Russia\x94\x8c\x07Manilla\x94\x8c\x0bPhilippines\x94u.'
{'Tokyo': 'Japan', 'Moscow': 'Russia', 'Manilla': 'Philippines'}

Note that the reconstructed object is equal to the original object (the two compare equal with ==) but is not the same object: it is a separate copy in memory.
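The difference between equality and identity can be checked directly. The following is a minimal sketch; the variable names are chosen only for illustration.

```python
import pickle

original = {'Tokyo': 'Japan', 'Moscow': 'Russia'}
restored = pickle.loads(pickle.dumps(original))

same_value = restored == original   # compares the contents
same_object = restored is original  # compares the identity
print(same_value)
print(same_object)

# Changing the reconstructed copy leaves the original untouched.
restored['Nairobi'] = 'Kenya'
print('Nairobi' in original)
```

Since the two dictionaries are independent, the key added to the copy does not appear in the original.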

The examples above use a dictionary, but most types of Python objects can be serialized and reconstructed, including instances of user-defined classes. Consider the following example, which uses a custom object.

Example:
import pickle

class Point:
	def __init__(self, x = 0, y = 0):
		self.x = x 
		self.y = y 
	def __str__(self):
		return f'Point({self.x}, {self.y})'

p = Point(-2, 3)

# serialize p
serialized_p = pickle.dumps(p)
print(serialized_p)

#reconstruct 
reconstructed_p = pickle.loads(serialized_p)
print(reconstructed_p)
print(reconstructed_p.x)
print(reconstructed_p.y)
Output:
b'\x80\x04\x95-\x00\x00\x00\x00\x00\x00\x00\x8c\x08__main__\x94\x8c\x05Point\x94\x93\x94)\x81\x94}\x94(\x8c\x01x\x94J\xfe\xff\xff\xff\x8c\x01y\x94K\x03ub.'
Point(-2, 3)
-2
3

Data persistence

A serialized object can be written into a file for permanent storage. The data can then be loaded and used even after the current program instance has been closed. 

You can use file methods such as write() and read() to write serialized data to a file and read it back. However, as we will see shortly, the module provides more convenient functions for doing the same.

Example:

write the serialized data to a file

import pickle

data = {'Tokyo': 'Japan', 'Moscow': 'Russia', 'Manilla': 'Philippines', 'Nairobi': 'Kenya'} # a dictionary

serialized_data = pickle.dumps(data) #serialize the dictionary
print(serialized_data)

#write serialized data to cities.dump
with open('cities.dump', 'wb') as file:
    file.write(serialized_data)
Output:
b'\x80\x04\x95Q\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x05Tokyo\x94\x8c\x05Japan\x94\x8c\x06Moscow\x94\x8c\x06Russia\x94\x8c\x07Manilla\x94\x8c\x0bPhilippines\x94\x8c\x07Nairobi\x94\x8c\x05Kenya\x94u.'

The 'wb' mode in the open() call opens the file for writing in binary mode. This is required because pickle.dumps() returns a bytes object, not a string.
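To see why the binary mode matters, the following sketch tries both modes. It assumes a throwaway file in a temporary directory; the file name demo.dump is hypothetical.

```python
import os
import pickle
import tempfile

payload = pickle.dumps({'Tokyo': 'Japan'})
path = os.path.join(tempfile.mkdtemp(), 'demo.dump')  # hypothetical file name

# Text mode ('w') expects str, so writing bytes raises a TypeError.
text_mode_error = None
try:
    with open(path, 'w') as file:
        file.write(payload)
except TypeError as error:
    text_mode_error = error
print(type(text_mode_error).__name__)

# Binary mode ('wb') accepts the bytes unchanged.
with open(path, 'wb') as file:
    written = file.write(payload)
print(written == len(payload))
```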

We can then read the file's contents and call pickle.loads() to reconstruct the object.

Example:

Read serialized object and reconstruct it

import pickle

#read data from cities.dump
with open('cities.dump', 'rb') as file:
    serialized_data = file.read()

    reconstructed_object = pickle.loads(serialized_data)
    print(reconstructed_object)
Output:
{'Tokyo': 'Japan', 'Moscow': 'Russia', 'Manilla': 'Philippines', 'Nairobi': 'Kenya'}

Use dump() and load() instead

As mentioned earlier, the pickle module offers functions that conveniently write serialized data to a stream and read it back. The two functions are dump() and load().

dump() serializes an object and writes it to a file in a single step. It has the following basic syntax:

Syntax:
pickle.dump(obj, file)

It serializes obj and writes the result to file, which must be opened for writing in binary mode.

Example:

using dump()

import pickle

class Point:
	def __init__(self, x = 0, y = 0):
		self.x = x 
		self.y = y 
	def __str__(self):
		return f'Point({self.x}, {self.y})'

p1 = Point(-2, 3)
p2 = Point(3, 4)
p3 = Point(5, 6)
points = [p1, p2, p3] # a list with points to serialize

with open('points.dump', 'wb') as file:
    pickle.dump(points, file)

The load() function reads serialized data from the given file and reconstructs the object. It has the following basic syntax:

Syntax:
pickle.load(file)

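It returns the reconstructed object. When the data contains instances of a custom class, that class must be available in the program that loads it, which is why Point is defined again in the example below.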

Example:
import pickle

class Point:
	def __init__(self, x = 0, y = 0):
		self.x = x 
		self.y = y 
	def __str__(self):
		return f'Point({self.x}, {self.y})'
	def __repr__(self):
		return f'Point({self.x}, {self.y})'

with open('points.dump', 'rb') as file:
    reconstructed_points = pickle.load(file)
    print(reconstructed_points)

p1, p2, p3 = reconstructed_points
print(p1)
print(p2)
print(p3)
Output:
[Point(-2, 3), Point(3, 4), Point(5, 6)]
Point(-2, 3)
Point(3, 4)
Point(5, 6)
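Because dump() and load() work with streams, they are not limited to files on disk. The following is a minimal sketch, assuming an in-memory io.BytesIO buffer stands in for a file. It also shows that several objects can be written one after another and read back in the same order.

```python
import io
import pickle

buffer = io.BytesIO()  # in-memory binary stream standing in for a file

# Each dump() call appends one serialized object to the stream.
pickle.dump(['Tokyo', 'Moscow'], buffer)
pickle.dump({'count': 2}, buffer)

buffer.seek(0)  # rewind to the start before reading

# Each load() call reads back exactly one object, in the order written.
first = pickle.load(buffer)
second = pickle.load(buffer)
print(first)
print(second)
```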

Unpicklable objects

Unpicklable objects are those that cannot be serialized using the pickle module.

Generally, objects that depend on the operating system in use cannot be serialized. Such objects include files, database connections, sockets, etc.

Trying to serialize such objects using pickle will result in errors.

Example:
import pickle

file = open('file.txt')

pickle.dumps(file)
Output:
TypeError: cannot pickle 'TextIOWrapper' instances
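A program that may encounter such objects can catch the error instead of crashing. The following is a minimal sketch, assuming a threading.Lock as the unpicklable object. It is used here in place of a file so that the example does not depend on file.txt existing.

```python
import pickle
import threading

lock = threading.Lock()  # tied to an operating system resource

error_name = None
try:
    pickle.dumps(lock)
except TypeError as error:
    error_name = type(error).__name__
    print(f'Could not serialize: {error}')
```

Other unpicklable objects may raise pickle.PicklingError rather than TypeError, so real code might need to catch both.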

If an object contains unpicklable attributes, we can define the __getstate__() and __setstate__() methods. __getstate__() returns only the state to be serialized, and __setstate__() restores the object from that state when it is reconstructed.
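The following is a minimal sketch of this approach. The Counter class and its attributes are hypothetical, chosen only to illustrate an object that holds one picklable attribute and one unpicklable attribute (a lock).

```python
import pickle
import threading

class Counter:
    """Hypothetical class with one picklable and one unpicklable attribute."""

    def __init__(self, count=0):
        self.count = count
        self.lock = threading.Lock()  # cannot be pickled

    def __getstate__(self):
        # Copy the instance dictionary and leave the lock out of it.
        state = self.__dict__.copy()
        del state['lock']
        return state

    def __setstate__(self, state):
        # Restore the saved attributes, then create a fresh lock.
        self.__dict__.update(state)
        self.lock = threading.Lock()

counter = Counter(5)
restored = pickle.loads(pickle.dumps(counter))
print(restored.count)
print(restored.lock is counter.lock)
```

The reconstructed object keeps the saved count but receives a new lock of its own, since the original lock was never written to the serialized data.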