The pickle module in the standard library provides tools for serializing arbitrary Python objects into a series of bytes.
A serialized object can be stored, for example in a file on disk, and later reconstructed into an equivalent Python object. This can be used to achieve data persistence across different Python processes.
To use the module, we first need to import it in our Python program:
import pickle
Basic Usage
The pickle.dumps() function is the basic tool for data serialization. It takes the object to be serialized as its argument and returns the equivalent sequence of bytes.
pickle.dumps(obj)
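As a minimal sketch of the call above, the snippet below serializes a small dictionary. The dictionary contents and the variable names are made up for illustration.

```python
import pickle

# Hypothetical sample data, used only for illustration.
data = {"name": "Alice", "scores": [90, 85], "active": True}

# dumps() returns the serialized form as a bytes object.
serialized = pickle.dumps(data)

print(type(serialized))  # <class 'bytes'>
```

The exact bytes depend on the pickle protocol in use, so they are not shown here.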
To deserialize the data back into the equivalent Python object, we use the pickle.loads() function. It takes the serialized bytes as its argument and returns the reconstructed Python object.
pickle.loads(serialized_data)
Note that the reconstructed object is equal to but not the same as the original object.
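The round trip can be sketched as follows, again with a made-up dictionary. The comparison at the end illustrates the difference between equality and identity.

```python
import pickle

# Hypothetical sample data, used only for illustration.
original = {"name": "Alice", "scores": [90, 85]}

# Serialize, then immediately reconstruct the object.
restored = pickle.loads(pickle.dumps(original))

print(restored == original)  # True: the contents are equal
print(restored is original)  # False: it is a separate object
```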
A dictionary is a typical object to pickle, but most types of Python objects, including instances of custom classes, can be serialized and reconstructed in the same way.
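As a rough sketch of pickling a custom object, the snippet below uses a hypothetical Point class that is defined only for this illustration.

```python
import pickle

class Point:
    """Hypothetical class, defined only for this example."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(3, 4)

# Serialize the instance, then reconstruct it.
q = pickle.loads(pickle.dumps(p))

print(type(q).__name__, q.x, q.y)  # Point 3 4
```

Note that pickle stores a reference to the class by name rather than the class code itself, so the class must be importable in the program that reconstructs the object.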
Data persistence
A serialized object can be written into a file for permanent storage. The data can then be loaded and used even after the current program instance has been closed.
You can use file methods such as write() and read() to write serialized data to a file and read it back. However, as we will see shortly, the module provides more convenient functions for the same task.
The wb mode in the open() function indicates that the file is opened for writing binary data rather than text.
We can then read the bytes back from the file, opened in rb mode, and call pickle.loads() to reconstruct the object.
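The two steps could look like the sketch below. The file name data.pkl and the temporary directory are assumptions made so that the example cleans up after itself; any writable path would work.

```python
import os
import pickle
import tempfile

# Hypothetical sample data, used only for illustration.
data = {"name": "Alice", "scores": [90, 85]}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data.pkl")  # hypothetical file name

    # Write the serialized bytes; "wb" opens the file for binary writing.
    with open(path, "wb") as f:
        f.write(pickle.dumps(data))

    # Read the bytes back; "rb" opens the file for binary reading.
    with open(path, "rb") as f:
        restored = pickle.loads(f.read())

print(restored == data)  # True
```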
Use dump() and load() instead
As mentioned earlier, the pickle module offers two functions for conveniently writing data to a stream and reading it back: dump() and load().
dump() serializes an object and writes it to a file in a single step. It has the following basic syntax:
pickle.dump(obj, file)
It serializes obj and writes it into file.
The load() function reads serialized data from the given file and reconstructs the object. It has the following basic syntax:
pickle.load(file)
It returns the reconstructed object.
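A minimal sketch of dump() and load() working together is shown below. As before, the file name and the temporary directory are assumptions made for illustration.

```python
import os
import pickle
import tempfile

# Hypothetical sample data, used only for illustration.
data = {"name": "Alice", "scores": [90, 85]}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data.pkl")  # hypothetical file name

    # dump() serializes the object and writes it to the open binary file.
    with open(path, "wb") as f:
        pickle.dump(data, f)

    # load() reads from the open binary file and rebuilds the object.
    with open(path, "rb") as f:
        restored = pickle.load(f)

print(restored == data)  # True
```

Compared with calling write() and read() directly, this avoids handling the intermediate bytes object ourselves.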
Unpicklable objects
Unpicklable objects are those that cannot be serialized using the pickle module.
Generally, objects that depend on operating system resources cannot be serialized. Such objects include open files, database connections, sockets, etc.
Trying to serialize such objects with pickle raises an exception, typically TypeError or pickle.PicklingError.
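To illustrate, the sketch below tries to pickle an open file object. It opens os.devnull only because that path is always available. The exact exception type and message can vary between Python versions, so the example catches both likely types.

```python
import os
import pickle

error = None

# os.devnull is used only as a file that is guaranteed to exist.
with open(os.devnull, "rb") as f:
    try:
        pickle.dumps(f)  # open file objects cannot be pickled
    except (TypeError, pickle.PicklingError) as exc:
        error = exc

print(type(error).__name__, error)
```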
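If an object contains unpicklable attributes, we can define the __getstate__() and __setstate__() methods on its class. __getstate__() returns only the state that should be serialized, and __setstate__() restores the object from that state during reconstruction.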
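One possible way to apply this is sketched below with a hypothetical Logger class that holds an open file handle. The class, its attribute names, and the choice to reopen the file on reconstruction are all assumptions made for illustration.

```python
import os
import pickle

class Logger:
    """Hypothetical class holding an open file, which cannot be pickled."""

    def __init__(self, path):
        self.path = path
        self.handle = open(path, "a")

    def __getstate__(self):
        # Copy the instance attributes and drop the unpicklable file handle.
        state = self.__dict__.copy()
        del state["handle"]
        return state

    def __setstate__(self, state):
        # Restore the saved attributes, then reopen the file.
        self.__dict__.update(state)
        self.handle = open(self.path, "a")

# os.devnull is used only as a path that is always available.
logger = Logger(os.devnull)
clone = pickle.loads(pickle.dumps(logger))

print(clone.path == logger.path)  # True
print(clone.handle.closed)        # False: the handle was reopened

logger.handle.close()
clone.handle.close()
```

Without the two methods, pickle.dumps(logger) would fail because of the handle attribute.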