JSON (JavaScript Object Notation) is a standard text format designed for data interchange that is both human-readable and machine-readable. It provides a lightweight way to represent structured data.

The JSON format is language-independent and can be read and written by most programming languages. By convention, a file with the extension ".json" is assumed to be a JSON file.

Data in JSON format is most often represented as objects, which are enclosed in curly braces {}. Each object consists of zero or more key-value pairs, where the key is a string enclosed in double quotes and the value can be of various types, including strings, numbers, booleans, null, arrays, or even nested objects.

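The syntax of a JSON object is similar to that of a Python dictionary, which makes it easy to convert between the two. The main differences are that JSON keys must be double-quoted strings and that JSON writes true, false, and null where Python writes True, False, and None.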

Example of a JSON object:

{
  "first_name": "Ford",
  "last_name": "Prefect",
  "age": 200,
  "country": "Betelgeuse V",
  "friends": ["Marvin", "Arthur Dent", "Zaphod Beeblebrox"]
}
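
As a quick illustration of how such an object looks from Python, the sketch below parses the example above with json.loads() (a function covered later in this article) and reads a few fields. The variable names json_text and person are only illustrative.

```python
import json

# The JSON object from the example above, held in a Python string.
json_text = '''
{
  "first_name": "Ford",
  "last_name": "Prefect",
  "age": 200,
  "country": "Betelgeuse V",
  "friends": ["Marvin", "Arthur Dent", "Zaphod Beeblebrox"]
}
'''

person = json.loads(json_text)   # JSON object -> Python dict

print(type(person).__name__)     # dict
print(person["first_name"])      # Ford
print(person["age"] + 1)         # 201 (JSON numbers become int or float)
print(person["friends"][1])      # Arthur Dent (JSON arrays become lists)
```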

In Python, the json module in the standard library provides functions and classes for reading, writing, and manipulating JSON data, both as strings and in JSON files.

When you convert from Python to JSON, Python objects are converted into the JSON (JavaScript) equivalent:

Python    JSON
------    ------
dict      Object
list      Array
tuple     Array
str       String
int       Number
float     Number
True      true
False     false
None      null
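
The mapping in the table can be checked directly. The following is a minimal sketch that encodes one sample value of each Python type with json.dumps() (introduced in the next section); the names samples, encoded, and restored are made up for this illustration.

```python
import json

# One value of each Python type listed in the table above.
samples = {
    "dict": {"a": 1},
    "list": [1, 2],
    "tuple": (1, 2),
    "str": "text",
    "int": 7,
    "float": 1.5,
    "True": True,
    "False": False,
    "None": None,
}

# Encode every sample value to its JSON text.
encoded = {name: json.dumps(value) for name, value in samples.items()}
for name, text in encoded.items():
    print(f"{name:<6} -> {text}")

# Decoding does not restore tuples: JSON arrays always come back as lists.
restored = json.loads(json.dumps((1, 2)))
print(restored)   # [1, 2]
```

Because both list and tuple map to a JSON array, the distinction between them is lost once the data has been written as JSON.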

Converting from Python objects to JSON.

The json module in Python provides the dumps() function, which serializes a Python object into a JSON-formatted string.

Syntax of the json.dumps() function:

json.dumps(object)

It takes a Python object, such as a list or a dictionary, and returns its representation as a JSON string.

Example:
import json

print(json.dumps([1, 2, 3, 4]))

print(json.dumps({1: 'One', 2: 'Two', 3: 'Three', 4: 'Four'}))
Output:
[1, 2, 3, 4]
{"1": "One", "2": "Two", "3": "Three", "4": "Four"}
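
Notice in the output above that the integer keys 1 to 4 were written as the strings "1" to "4": keys of a JSON object are always strings. The sketch below shows what this means for a round trip; the conversion back to integer keys is just one possible approach and assumes every key is a numeric string.

```python
import json

original = {1: "One", 2: "Two"}

json_string = json.dumps(original)
print(json_string)                  # {"1": "One", "2": "Two"}

round_tripped = json.loads(json_string)
print(round_tripped)                # {'1': 'One', '2': 'Two'}
print(round_tripped == original)    # False: the keys are now strings

# One way to get integer keys back, assuming every key is a numeric string.
int_keys = {int(key): value for key, value in round_tripped.items()}
print(int_keys == original)         # True
```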

The returned string can then be written to a .json file.

Example:
import json

data = { 'name': 'John Doe', 'age': 42, 'profession': 'Software Developer' }
json_string = json.dumps(data)
with open('data.json', 'w') as file:
    file.write(json_string)
    file.write('\n')

You can open the file in an editor to view the written data.

Alternatively, we can use the json.dump() function to serialize an object and write it to a file in a single step. The json.dump() function takes a second argument, i.e. the file to write the object to.

Syntax:

json.dump(obj, file)

The obj argument is the object to serialize, while the file argument is the file object to write to. The function also takes other, optional arguments; see the documentation of json.dump() for the full list.
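
Two of those optional arguments that are often useful are indent, which pretty-prints the output, and sort_keys, which writes the keys in sorted order. Both are accepted by json.dumps() and json.dump(). The sketch below uses an in-memory io.StringIO object in place of a real file so that it leaves nothing on disk; the variable names are illustrative.

```python
import io
import json

data = {"name": "John Doe", "age": 42}

# indent=2 puts each key on its own line; sort_keys=True orders the keys.
pretty = json.dumps(data, indent=2, sort_keys=True)
print(pretty)

# json.dump() accepts the same keyword arguments.
# io.StringIO stands in for an open text file here.
buffer = io.StringIO()
json.dump(data, buffer, indent=2, sort_keys=True)
print(buffer.getvalue() == pretty)   # True
```

Indented output is easier to read and to compare between versions, at the cost of a slightly larger file.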

The object is first encoded as a JSON string and then written to the file-like object.

Example:
import json

data = {
    'country': 'India',
    'capital': 'New Delhi',
    'president': 'Droupadi Murmu',
    'continent': 'Asia',
    'currency': 'Rupee',
}
with open('data.json', 'w') as file:
    json.dump(data, file)
    file.write('\n')

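The 'w' file opening mode overwrites any existing data in the file. We can use the append mode ('a') instead to add to the end of the file without overwriting what is already there. Note that a file holding several appended objects, one per line, is no longer a single JSON document, so it has to be read back line by line.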

Example:
import json

data = {
    'country': 'Rwanda',
    'capital': 'Kigali',
    'president': 'Paul Kagame',
    'continent': 'Africa',
    'currency': 'Franc',
}
with open('data.json', 'a') as file:
    json.dump(data, file)
    file.write('\n')
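
To see what appending does to the file as a whole, the sketch below appends two records to a throwaway file and reads them back. The file name countries.json and the temporary directory are assumptions made so that the example does not touch a real data.json.

```python
import json
import os
import tempfile

records = [
    {"country": "India", "capital": "New Delhi"},
    {"country": "Rwanda", "capital": "Kigali"},
]

# A throwaway file path so the sketch does not touch a real data.json.
path = os.path.join(tempfile.mkdtemp(), "countries.json")

# Append each record as one JSON object per line.
for record in records:
    with open(path, "a") as file:
        json.dump(record, file)
        file.write("\n")

# Parsing the whole file at once fails, because it holds two objects.
whole_file_failed = False
with open(path, "r") as file:
    try:
        json.load(file)
    except json.JSONDecodeError:
        whole_file_failed = True
print(whole_file_failed)   # True

# Reading line by line recovers every record.
with open(path, "r") as file:
    loaded = [json.loads(line) for line in file]
print(loaded == records)   # True
```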

It is also possible to serialize nested objects as shown below.

import json

data = {
    'Europe': {
        'France': {'capital': 'Paris', 'currency': 'Euro'},
        'Russia': {'capital': 'Moscow', 'currency': 'Rubble'},
    },
    'Asia': {
        'China': {'capital': 'Beijing', 'currency': 'Yuan'},
        'India': {'capital': 'New Delhi', 'currency': 'Rupee'},
    },
    'Africa': {
        'Kenya': {'capital': 'Nairobi', 'currency': 'Shilling'},
        'Rwanda': {'capital': 'Kigali', 'currency': 'Franc'},
    },
}

print(json.dumps(data))

with open('data.json', 'w') as file:
    json.dump(data, file)
    file.write('\n')
Output:

 {"Europe": {"France": {"capital": "Paris", "currency": "Euro"}, "Russia": {"capital": "Moscow", "currency": "Rubble"}}, "Asia": {"China": {"capital": "Beijing", "currency": "Yuan"}, "India": {"capital": "New Delhi", "currency": "Rupee"}}, "Africa": {"Kenya": {"capital": "Nairobi", "currency": "Shilling"}, "Rwanda": {"capital": "Kigali", "currency": "Franc"}}}

Reading from JSON to Python objects.

The json.loads() function is used to convert a JSON string into the equivalent Python object.

Syntax:

json.loads(json_string)

The json_string argument is the JSON-formatted string that you want to parse and convert into a Python object. It should be a valid JSON string.

Example:
import json

json_string = '{"name": "John", "age": 30, "city": "New York", "professions": ["Software Engineer", "Data analyst"]}'

python_object = json.loads(json_string)

print(python_object)
print(type(python_object))
Output:
{'name': 'John', 'age': 30, 'city': 'New York', 'professions': ['Software Engineer', 'Data analyst']}
<class 'dict'>

To read from a JSON file, we open the file using the open() function and then use the json.loads() function in conjunction with one of the reading methods of the file object, such as read(), readline(), or readlines().

import json

with open('data.json', 'r') as file:
    json_string = file.read()
    python_object = json.loads(json_string)
    print(python_object)
    print(type(python_object))
Output:

{'Europe': {'France': {'capital': 'Paris', 'currency': 'Euro'}, 'Russia': {'capital': 'Moscow', 'currency': 'Rubble'}}, 'Asia': {'China': {'capital': 'Beijing', 'currency': 'Yuan'}, 'India': {'capital': 'New Delhi', 'currency': 'Rupee'}}, 'Africa': {'Kenya': {'capital': 'Nairobi', 'currency': 'Shilling'}, 'Rwanda': {'capital': 'Kigali', 'currency': 'Franc'}}}
<class 'dict'>

Alternatively, we can use the json.load() function to read and convert the data in a single step. Unlike json.loads(), which takes a JSON-formatted string as an argument, json.load() takes a file object.

Syntax:

json.load(file_object)

Example:
import json

with open('data.json', 'r') as file:
    python_object = json.load(file)
    print(python_object)
    print(type(python_object))
Output:

{'Europe': {'France': {'capital': 'Paris', 'currency': 'Euro'}, 'Russia': {'capital': 'Moscow', 'currency': 'Rubble'}}, 'Asia': {'China': {'capital': 'Beijing', 'currency': 'Yuan'}, 'India': {'capital': 'New Delhi', 'currency': 'Rupee'}}, 'Africa': {'Kenya': {'capital': 'Nairobi', 'currency': 'Shilling'}, 'Rwanda': {'capital': 'Kigali', 'currency': 'Franc'}}}
<class 'dict'> 
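
The relationship between the two functions can be sketched without touching the disk. In the example below, io.StringIO stands in for an open text file (json.load() only needs an object with a read() method); the sample string and variable names are illustrative.

```python
import io
import json

json_text = '{"capital": "Nairobi", "currency": "Shilling"}'

# io.StringIO behaves like an open text file, so no file on disk is needed.
file_like = io.StringIO(json_text)

from_load = json.load(file_like)      # reads from the file object itself
from_loads = json.loads(json_text)    # parses an in-memory string

print(from_load)
print(from_load == from_loads)        # True
```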

The json.load() function expects the file to contain a single JSON document (for example, one object or one array) rather than several objects one after another. If the file contains multiple JSON objects, one per line, you need to read the file line by line and parse each line individually using the json.loads() function.

import json

python_objects = []

with open('file.json', 'r') as file:
    for line in file:
        python_object = json.loads(line)
        python_objects.append(python_object)

Both json.loads() and json.load() raise a json.decoder.JSONDecodeError exception if the JSON string or the file's content is not correctly formatted.

Example:
import json

json_string = '{language: python' # incorrectly formatted
json.loads(json_string)
Output:
Traceback (most recent call last):
  File "<string>", line 4, in <module>
  File "/app/.heroku/python/lib/python3.11/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.heroku/python/lib/python3.11/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.heroku/python/lib/python3.11/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
               ^^^^^^^^^^^^^^^^^^^^^^
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)

By incorrectly formatted, we mean that the loading functions are unable to turn the JSON string into an equivalent Python object.
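
Some inputs look plausible, especially to someone used to Python literals, but are still rejected. The sketch below tries a few such strings; the labels and sample strings are made up for illustration, and the exact error messages are left to the output because their wording can differ between Python versions.

```python
import json

# Strings that look plausible but, except for the last one, are not valid JSON.
candidates = {
    "single quotes": "{'language': 'python'}",
    "unquoted key": '{language: "python"}',
    "trailing comma": '{"language": "python",}',
    "Python literal": '{"active": True}',
    "valid JSON": '{"language": "python", "active": true}',
}

results = {}
for label, text in candidates.items():
    try:
        json.loads(text)
        results[label] = "ok"
    except json.JSONDecodeError as error:
        # error.msg holds the short description of what went wrong.
        results[label] = f"error: {error.msg}"

for label, outcome in results.items():
    print(f"{label:<15} {outcome}")
```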

It is good practice to use a try/except block when loading JSON data that might be incorrectly formatted. Catching the exception lets the code skip or otherwise handle the malformed data and continue executing, instead of terminating abruptly with a traceback.

import json

python_objects = []

with open('file.json', 'r') as file:
    for line in file:
        try:
            python_object = json.loads(line)
        except json.decoder.JSONDecodeError:
            # Skip lines that are not valid JSON.
            continue
        python_objects.append(python_object)
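
Silently skipping bad lines can hide problems, so it is often worth recording what was skipped. The sketch below is a variant of the loop above that keeps the line number and the error details; it uses a hypothetical in-memory list of lines instead of file.json, and relies on the msg and colno attributes of JSONDecodeError.

```python
import json

# Hypothetical file contents: one JSON object per line, the second one malformed.
lines = [
    '{"country": "Kenya"}',
    '{country: "Rwanda"}',
    '{"country": "India"}',
]

python_objects = []
skipped = []

for line_number, line in enumerate(lines, start=1):
    try:
        python_objects.append(json.loads(line))
    except json.JSONDecodeError as error:
        # error.msg describes the problem; error.colno is the column in the line.
        skipped.append((line_number, error.msg, error.colno))

print(python_objects)
for line_number, message, column in skipped:
    print(f"line {line_number}, column {column}: {message}")
```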

Uses of JSON files.

  1. Data storage and persistence: JSON files can be used as a lightweight data storage mechanism. They provide a convenient way to store and retrieve structured data. JSON files are commonly used for storing user preferences, small databases, or caching data.

  2. Configuration files: JSON files are frequently used to store application configuration settings. These files can store various parameters and settings that can be easily read and parsed by programs.

  3. API data exchange: JSON is a popular format for exchanging data between applications over APIs (Application Programming Interfaces). APIs often send and receive data in JSON format, making it easy to serialize and deserialize complex data structures.

  4. Web development: JSON is widely used in web development for data transfer between the server and the client. Many web APIs return data in JSON format, allowing client-side code to consume and display the data.

  5. Configuration for data analysis: JSON files can be used to store configuration settings and parameters for data analysis tasks. They can specify input data sources, processing steps, and output formats, allowing for easy reproducibility and sharing of analysis workflows.
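
As an illustration of the configuration-file use case, the following is a minimal sketch of loading settings from a JSON file on top of built-in defaults. The setting names, the load_config() helper, and the settings.json file name are all hypothetical and not part of the json module.

```python
import json
import os
import tempfile

# Hypothetical default settings for an application.
DEFAULTS = {"theme": "light", "font_size": 12, "autosave": True}

def load_config(path):
    """Return DEFAULTS overridden by whatever the JSON file at path contains."""
    config = dict(DEFAULTS)
    try:
        with open(path, "r") as file:
            config.update(json.load(file))
    except FileNotFoundError:
        pass  # no config file yet: fall back to the defaults
    return config

# Write a small config file to a temporary location, then load it.
path = os.path.join(tempfile.mkdtemp(), "settings.json")
with open(path, "w") as file:
    json.dump({"theme": "dark"}, file, indent=2)

config = load_config(path)
print(config)   # {'theme': 'dark', 'font_size': 12, 'autosave': True}
```

Only the settings that differ from the defaults need to appear in the file, which keeps it short and easy to edit by hand.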