The json module in the standard library provides utilities for working with JSON data.

JSON, short for JavaScript Object Notation, is a standard plain-text format that is widely used to exchange data between systems. It is lightweight, language-independent, and easy for both humans and machines to read.

JSON types are closely related to Python types. This makes it easy to convert between the two representations. For example, the syntax of JSON objects is much like that of dictionaries, while JSON arrays look just like Python lists.

A typical JSON object looks as follows:

[Figure: format of JSON data]

A value in a JSON object can be of any valid JSON type, including arrays and other objects. A key, on the other hand, must be a string in JSON itself; when encoding from Python, keys that are numbers, booleans, or None are also accepted and converted to strings.

Basic encoding and decoding 

Encoding refers to converting a Python object into a JSON string, while decoding refers to converting a JSON string back into a Python object.

JSON natively supports basic types such as numbers, strings, arrays, and objects (which are similar to dictionaries). A Python object is therefore encoded as the closest matching type: for example, a list or a tuple is encoded as an array, and both integers and floats are encoded as numbers.
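As a minimal sketch of this mapping, the snippet below encodes a handful of Python values and prints the JSON text each one becomes. The names samples and encoded are illustrative only and do not appear elsewhere in this article.

```python
import json

# Each Python value is listed with the JSON type it is expected to map to.
samples = [
    [1, 2, 3],   # list  -> array
    (1, 2, 3),   # tuple -> array
    {"a": 1},    # dict  -> object
    "text",      # str   -> string
    10,          # int   -> number
    2.5,         # float -> number
    True,        # bool  -> true
    None,        # None  -> null
]

# Encode every sample separately so the results can be compared one by one.
encoded = [json.dumps(value) for value in samples]

for value, text in zip(samples, encoded):
    print(f"{value!r:>12} -> {text}")
```

Note that the list and the tuple produce identical JSON text, which is why the distinction cannot be recovered when decoding.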

The json module provides two essential functions for encoding and decoding operations.

The json.dumps() function

The dumps() function takes a Python object such as a dictionary, list, or integer and converts it into a JSON string.

import json

data = {'one': 1, 'two': 2, 'three': 3}

json_string = json.dumps(data)
print(json_string)

In the above example, we serialized a dictionary whose keys are strings and whose values are integers into a JSON string.

The encoder requires that a dictionary key be a string, number, boolean, or None. Any other type used as a key results in an error.

tuples as keys cause an error

import json

data = {(1, 1.0): 'one', (2, 2.0): 'two'}  # tuples as keys

json_string = json.dumps(data)  # raises TypeError
print(json_string)
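The sketch below puts both behaviours side by side: supported non-string keys are converted to JSON strings, while a tuple key raises a TypeError that we catch. The variable names are hypothetical and chosen only for this illustration.

```python
import json

# Keys that are numbers, booleans or None are accepted and turned into strings.
mixed_keys = {2: 'int', 2.5: 'float', True: 'bool', None: 'none'}
coerced = json.dumps(mixed_keys)
print(coerced)

# A tuple key is not supported, so the encoder raises TypeError.
message = None
try:
    json.dumps({(1, 1.0): 'one'})
except TypeError as error:
    message = str(error)
    print('Encoding failed:', message)
```

Because the keys become strings in the JSON text, decoding it again gives string keys rather than the original numbers, booleans, or None.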

Encoding tuples and lists

When a value is either a tuple or a list, it is encoded into a JSON array. Arrays look just like Python lists.

encode tuples and lists 

import json

data = {
   'one': (1, 1.0, '1', '1.0'),
   'two': [2, 2.0, '2', '2.0'],
   'three': [3, 3.0, '3', '3.0'],
   'four': (4, 4.0, '4', '4.0')
}

json_string = json.dumps(data)
print(json_string)

As you can see, both tuples and lists are encoded as arrays, with the same syntax as lists in Python.

Nested objects

A dictionary is encoded as a JSON object, as we have already seen. JSON objects support nesting, which means that an object can appear inside another object. This relationship is established when we encode a nested dictionary.

import json

data = {
   'one': {'integer': 1, 'float': 1.0, 'string': ('1', '1.0')},
   'three': {'integer': 3, 'float': 3.0, 'string': ['3', '3.0']},
}

json_string = json.dumps(data)
print(json_string)

In the above example, we have a dictionary whose values are also dictionaries. The dictionary is encoded into a nested JSON object.
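Nested output can be hard to read on a single line. As a hedged aside, json.dumps() also accepts an indent argument (not used elsewhere in this article) that spreads the nesting over several lines. The sketch below uses a smaller dictionary of the same shape.

```python
import json

# A small nested dictionary, similar in shape to the example above.
data = {'one': {'integer': 1, 'float': 1.0}}

# indent=2 asks the encoder to put each level on its own indented lines.
pretty = json.dumps(data, indent=2)
print(pretty)

# The indentation is only cosmetic: decoding gives back the same dictionary.
round_trip = json.loads(pretty)
```

The indented string decodes to the same data as the compact one, so this is purely a readability choice.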

An array of objects

When a list or a tuple is made up of dictionaries, it is encoded into a JSON array with the dictionaries as objects in the array.


import json

data = (
   {'name': 'one', 'integer': 1, 'float': 1.0, 'string': ('1', '1.0')},
   {'name': 'three', 'integer': 3, 'float': 3.0, 'string': ['3', '3.0']},
)  # a tuple of dictionaries

json_string = json.dumps(data)
print(json_string)

In the above example, we used a tuple of dictionaries to create an array of objects in JSON. The same result is achieved with a list of dictionaries.

The json.loads() function

As we have seen, the json.dumps() function converts a Python object into a JSON string. json.loads(), on the other hand, converts a JSON string into the equivalent Python object.


import json

json_string = '{"one": 1, "two": 2, "three": 3}'

obj = json.loads(json_string)
print(obj)
print(type(obj))

In the above example, we gave a JSON string to the loads() function and it correctly translated it into the equivalent dictionary.

In the following example, we convert a JSON array of objects into a Python list of dictionaries.

import json

json_string = '[{"name": "one", "integer": 1, "float": 1.0}, {"name": "three", "integer": 3, "float": 3.0}]'

obj = json.loads(json_string)
print(obj)
print(type(obj))

Note that a Python object converted into a JSON string and then back into a Python object may not be exactly the same as the original. This is because JSON does not distinguish between some Python types; tuples and lists, for example, are both encoded as arrays. An object that was originally a tuple will therefore be decoded as a list.

import json

data = (
   (0, 1),
   (2, 3),
   (4, 5)
)  # a tuple of tuples

json_string = json.dumps(data)  # convert data to JSON
obj = json.loads(json_string)  # convert back to Python
print(obj)

As you can see above, the tuple has been reconstructed as a list.
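The following sketch checks this round trip explicitly. It assumes a small dictionary, named original here purely for illustration, that contains both a tuple value and an integer key, since both are altered by encoding and decoding.

```python
import json

# A tuple value and an integer key: neither survives the round trip unchanged.
original = {'point': (0, 1), 1: 'one'}

restored = json.loads(json.dumps(original))
print(restored)

# The decoded data is not equal to the original ...
same = restored == original
print(same)

# ... but the tuple can be rebuilt by hand when it is needed.
point = tuple(restored['point'])
print(point == original['point'])
```

If the exact types matter, the conversion back has to be done explicitly after decoding, as the last lines do for the tuple.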

Working with custom and unsupported types

So far we have been using built-in types that JSON supports natively. However, some other objects, including sets and user-defined objects, are not supported by default. We therefore need to convert them to supported types before encoding them to JSON.

a set raises an error

import json

data = {1, 2, 3, 4, 5}  # a set

json.dumps(data)  # raises TypeError

As you can see above, trying to encode a set leads to an exception because sets are not JSON serializable.

The json.dumps() function takes an argument called default, which lets us specify a function that is called for any object the encoder cannot serialize; the function should return a supported type. For example, in the case of sets, we can pass the built-in list function as default, so that sets are converted into lists, which are serializable.

import json

data = {1, 2, 3, 4, 5} # a set

json_string = json.dumps(data, default=list)

print(json_string)

In the above example, we passed the built-in list function as the default argument to json.dumps(). This means that the encoder will try to convert any unsupported object into a list, and an error is raised only if that conversion fails.
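Passing list as default converts every unsupported iterable, which may be broader than intended. A more selective approach, sketched below, is a small function that handles sets only and raises TypeError for everything else. The function name encode_extra is hypothetical, and the sketch assumes the set elements can be sorted.

```python
import json

def encode_extra(obj):
    """Hypothetical helper: turn sets into sorted lists, reject anything else."""
    if isinstance(obj, (set, frozenset)):
        # Sorting gives a stable order; assumes the elements are comparable.
        return sorted(obj)
    raise TypeError(f'Object of type {type(obj).__name__} is not JSON serializable')

data = {'tags': {3, 1, 2}, 'count': 3}

json_string = json.dumps(data, default=encode_extra)
print(json_string)
```

With this function, an unsupported value such as a complex number still raises TypeError instead of being silently converted.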

With custom objects

When working with custom classes and objects, we can implement a converter function that returns the attributes that identify the objects. This makes it possible to recreate the Python object from its JSON form later.

We can define the function to convert an object to a dictionary as shown below:

import json

# a function to convert an object to a dict
def obj_to_dict(obj):
   d = {
      '__class__': obj.__class__.__name__,
      '__module__': obj.__module__,
   }
   d.update(obj.__dict__)
   return d

# an example class
class Point:
   def __init__(self, x, y):
      self.x = x
      self.y = y
   def __str__(self):
      return f"Point({self.x}, {self.y})"
   def __repr__(self):
      return f"<Point({self.x}, {self.y})>"

p = Point(2, 3)
print(obj_to_dict(p))

In the above example:

  • We defined the obj_to_dict() function.
  • The function takes a Python object as an argument.
  • It creates a dictionary whose items are the attributes that identify the object.
  • The __class__ item stores the name of the class the object belongs to, i.e. obj.__class__.__name__.
  • The __module__ item stores the name of the module in which the class is defined.
  • We then update the dictionary with the object's namespace dictionary, i.e. __dict__.
  • __dict__ stores the variables that are associated with an object, together with their values.

We can now pass the obj_to_dict() function as the default argument of json.dumps() when converting Point and other custom objects into JSON.

import json

p = Point(2, 3)

json_string = json.dumps(p, default=obj_to_dict)
print(json_string)

{"__class__": "Point", "__module__": "__main__", "x": 2, "y": 3} 
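An alternative to the default argument, sketched below, is to subclass json.JSONEncoder, override its default() method, and pass the subclass through the cls argument of json.dumps(). The class name PointEncoder is hypothetical, and this simplified version stores only the class name and the two coordinates.

```python
import json

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class PointEncoder(json.JSONEncoder):
    """Hypothetical encoder subclass; the name is illustrative."""

    def default(self, obj):
        # Called only for objects the encoder cannot serialize on its own.
        if isinstance(obj, Point):
            return {'__class__': 'Point', 'x': obj.x, 'y': obj.y}
        # Let the base class raise TypeError for any other type.
        return super().default(obj)

json_string = json.dumps(Point(2, 3), cls=PointEncoder)
print(json_string)
```

Both approaches use the same hook; a subclass is mostly convenient when the same conversion rules are reused in many places.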

Converting from JSON to a custom object

The json.loads() function takes an argument called object_hook, which accepts a function. If given, the function is called with each decoded JSON object (a dictionary), and its return value is used in place of that dictionary. We can use this argument to convert JSON objects back to their original custom objects in Python.

Let us first define the function to convert from a dictionary to an object.

import importlib

class Point:
   def __init__(self, x, y):
      self.x = x
      self.y = y
   def __str__(self):
      return f"Point({self.x}, {self.y})"
   def __repr__(self):
      return f"<Point({self.x}, {self.y})>"

# a function to convert a dict back to an object
def dict_to_obj(d):
   if '__class__' in d:
      class_name = d.pop('__class__')
      module_name = d.pop('__module__')

      # import_module also handles dotted module names
      module = importlib.import_module(module_name)
      _class = getattr(module, class_name)

      return _class(**d)
   return d  # dictionaries without the marker are returned unchanged

d = {"__class__": "Point", "__module__": __name__, "x": 2, "y": 3}
print(dict_to_obj(d))

Point(2, 3)

The dict_to_obj() function above takes a dictionary and reconstructs the original object from its items. We can now pass this function to json.loads() as the object_hook argument.

# definitions are omitted; refer to the previous sections

import json

p = Point(4, 5)
print('Original: ', p)

json_string = json.dumps(p, default=obj_to_dict)
print('Json: ', json_string)

reconstructed = json.loads(json_string, object_hook=dict_to_obj)
print('Reconstructed: ', reconstructed)

Original:  Point(4, 5)
Json:  {"__class__": "Point", "__module__": "__main__", "x": 4, "y": 5}
Reconstructed:  Point(4, 5)  
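Because the hook is called for every JSON object in the input, it also works when custom objects are nested inside ordinary dictionaries and lists. The sketch below illustrates this with one deliberate change: instead of importing a module by name, it looks classes up in an explicit registry. KNOWN_CLASSES, to_dict, and from_dict are hypothetical names used only for this illustration.

```python
import json

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f'Point({self.x}, {self.y})'

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

# Hypothetical registry: only classes listed here may be rebuilt from JSON.
KNOWN_CLASSES = {'Point': Point}

def to_dict(obj):
    # Store the class name next to the object's own attributes.
    d = {'__class__': type(obj).__name__}
    d.update(vars(obj))
    return d

def from_dict(d):
    name = d.get('__class__')
    cls = KNOWN_CLASSES.get(name) if isinstance(name, str) else None
    if cls is None:
        return d  # an ordinary JSON object: keep it as a dictionary
    kwargs = {key: value for key, value in d.items() if key != '__class__'}
    return cls(**kwargs)

shape = {'name': 'segment', 'ends': [Point(0, 0), Point(4, 5)]}

json_string = json.dumps(shape, default=to_dict)
restored = json.loads(json_string, object_hook=from_dict)
print(restored)
```

The outer dictionary has no __class__ item, so the hook returns it unchanged, while the two inner objects are turned back into Point instances.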

Working with streams/files

When we are working with large sets of data, we may need to write JSON to, and read it from, streams and files.

The json module provides two helper functions for this purpose: json.dump() and json.load(). These functions allow us to write directly to, and read directly from, files and file-like objects.

Writing to a stream

The json.dump() function takes an object and a stream/file as arguments. It converts the object to JSON and writes it to the stream.

The following is the basic syntax:

json.dump(obj, file)

In the above syntax, we have only shown the required arguments. Note that dump() is just like dumps() in that it takes all the arguments that dumps() takes, in addition to file.

Here, file can be an opened file object or any other file-like object whose write() method accepts strings, such as io.StringIO. A binary stream such as io.BytesIO does not qualify, because dump() writes text rather than bytes.

import json
import io

student1 = {'name': 'John', 'age': 19, 'School': 'Reeds'}
student2 = {'name': 'Keziah', 'age': 20, 'School': 'Harvard'}
student3 = {'name': 'Lucy', 'age': 19, 'School': 'Edinburgh'}
student4 = {'name': 'Mike', 'age': 21, 'School': 'Reeds'}

students = [student1, student2, student3, student4]

with io.StringIO() as file:
   json.dump(students, file)
   print(file.getvalue())

In the above example, we used a StringIO object, but you can use any text file-like object, such as one opened with the open() function.
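As a minimal sketch of the same operation with a real file, the snippet below writes to a file opened with open(). It assumes a temporary directory so that no existing file is touched; the file name students.json is hypothetical.

```python
import json
import os
import tempfile

students = [
    {'name': 'John', 'age': 19},
    {'name': 'Keziah', 'age': 20},
]

# A temporary directory keeps the example from touching real files.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'students.json')  # hypothetical file name

    # Text mode is required because dump() writes strings, not bytes.
    with open(path, 'w', encoding='utf-8') as file:
        json.dump(students, file)

    # Read the raw text back to see exactly what was written.
    with open(path, encoding='utf-8') as file:
        written = file.read()

print(written)
```

The text stored in the file is the same string that json.dumps(students) would return.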

Reading JSON data from a file

The load() function takes an opened file object as a required argument. It reads the data and converts it back into a Python object.

import json, io

json_string = '''[{"name": "John", "age": 19, "School": "Reeds"}, {"name": "Keziah", "age": 20, "School": "Harvard"}, {"name": "Lucy", "age": 19, "School": "Edinburgh"}, {"name": "Mike", "age": 21, "School": "Reeds"}]'''

with io.StringIO(json_string) as file:
   students = json.load(file)
   print(students)
   print(type(students))
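Data read from a file may be incomplete or malformed, in which case load() raises json.JSONDecodeError. The sketch below wraps the call in a small helper that returns None instead; the helper name load_students and that fallback choice are assumptions made for this illustration.

```python
import io
import json

def load_students(stream):
    """Hypothetical helper: return the decoded data, or None if the JSON is malformed."""
    try:
        return json.load(stream)
    except json.JSONDecodeError as error:
        print('Invalid JSON:', error)
        return None

# A well-formed document decodes normally.
good = load_students(io.StringIO('[{"name": "John", "age": 19}]'))

# The closing bracket is missing here, so decoding fails.
bad = load_students(io.StringIO('[{"name": "John", "age": 19}'))

print(good)
print(bad)
```

Whether to return a fallback value or let the exception propagate depends on the application; the helper only shows where the error surfaces.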