Introduction to concurrent programming in Python

Traditionally, a program executes its tasks consecutively: one task has to be completed before the next one starts. This approach is not always desirable. For example, some tasks may take a long time to complete, which delays every task queued behind them. We may also need part of a program to run in the background without blocking other tasks. These are just some of the cases where concurrent programming comes in handy.

The main goal of concurrent programming is to boost system efficiency and performance by maximizing the utilization of available system resources.

Through concurrent programming, a program can manage multiple tasks over the same period while maintaining safety, efficiency and data integrity. This does not mean that the tasks start, or even run, at exactly the same instant; rather, they run in overlapping periods of time. Put simply, one task does not need to be completed before another one is started.

While concurrent programming refers to the general management of multiple tasks at the same time, the actual execution of multiple tasks simultaneously is called parallelism.

There are various ways to achieve concurrent programming in Python. These include:

Multi-threading
Multi-processing
Coroutine-based concurrency (asynchronous programming)

Threads and Multi-threading in Python

A thread is the smallest sequence of instructions that can be scheduled to run by the operating system. Multi-threading is when more than one thread is in progress within the same process. In the standard CPython interpreter, the global interpreter lock (GIL) allows only one thread to execute Python bytecode at a time, so the interpreter alternates quickly between the threads. The switching is so fast that the threads appear to be running simultaneously, but they are not.

Multi-threading in Python is primarily implemented using the threading module in the standard library.

The following example shows a simple multi-threaded Python program.

import threading
import time


# Functions to simulate time-consuming tasks
def print_numbers():
    for i in range(1, 6):
        print(f"Number {i}")
        time.sleep(0.1)  # Simulate a delay in the task

def print_letters():
    for letter in 'Python':
        print(f"Letter {letter}")
        time.sleep(0.1)  # Simulate a delay in the task

if __name__ == "__main__":
    # Create one thread for each function/task
    thread1 = threading.Thread(target=print_numbers)
    thread2 = threading.Thread(target=print_letters)

    # Start the threads
    thread1.start()
    thread2.start()

    # The main thread waits for both threads to finish
    thread1.join()
    thread2.join()

    print("Done")

Number 1
Letter P
Number 2
Letter y
Number 3
Letter t
Number 4
Letter h
Number 5
Letter o
Letter n
Done

As the output above shows, the two functions are executed alternately, which makes them appear to run in parallel. The exact interleaving may vary from run to run, because it depends on how the threads are scheduled. This makes it possible to execute time-consuming tasks without delaying the execution of other tasks.
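To make the benefit concrete, the following sketch times the same two blocking tasks, first run one after the other and then run in two threads. The slow_task function and the 0.2 second delay are assumptions chosen for illustration; time.sleep stands in for a real blocking operation such as a network request.

```python
import threading
import time


def slow_task(delay):
    """Hypothetical stand-in for a blocking operation such as a network call."""
    time.sleep(delay)


def run_sequentially(delay):
    """Run the task twice, one call after the other, and return the elapsed time."""
    start = time.perf_counter()
    slow_task(delay)
    slow_task(delay)
    return time.perf_counter() - start


def run_in_threads(delay):
    """Run the task in two threads and return the elapsed time."""
    start = time.perf_counter()
    threads = [threading.Thread(target=slow_task, args=(delay,)) for _ in range(2)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"sequential: {run_sequentially(0.2):.2f}s")
    print(f"threaded:   {run_in_threads(0.2):.2f}s")
```

On most machines the threaded version should take roughly half as long, because the two waits overlap instead of adding up.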

Advantages of threading

Threads offer an excellent approach for executing blocking tasks, or ones that need to run in the background, such as input and output (I/O) operations.
Threads in the same process share memory, so communication between them is easier.
Threads use fewer resources than other concurrency approaches such as multi-processing.
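
Because threads share memory, they can exchange data simply by reading and writing the same objects. The sketch below is a minimal illustration, assuming a hypothetical shared counter. It uses threading.Lock so that only one thread updates the counter at a time, since unsynchronized updates to shared data can interfere with each other.

```python
import threading


def count_with_threads(num_threads=4, increments=10_000):
    """Increment one shared counter from several threads and return its final value."""
    total = {"value": 0}      # shared state, visible to every thread
    lock = threading.Lock()   # guards every update to the shared state

    def worker():
        for _ in range(increments):
            with lock:        # only one thread at a time may run this block
                total["value"] += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return total["value"]


if __name__ == "__main__":
    print(count_with_threads())  # 4 threads x 10,000 increments each
```

The function name and the default values are illustrative only; the point is that all threads operate on the same dictionary without copying any data.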

Processes and Multi-processing

As we have seen, threads in Python accomplish multi-tasking by taking turns, effectively using a single core at a time. Most modern computers have more than one CPU core, so threads may not leverage the full potential of the available computing power.

Processes serve a similar purpose to threads, except that each process has its own Python interpreter and memory space and is not bound to a single CPU/core. We can choose multi-processing over multi-threading if we want to use all the available CPUs and cores.

In Python, multi-processing is primarily achieved through the multiprocessing module in the standard library. Consider the following example:

# Import the multiprocessing module
import multiprocessing
import time

# Stand-ins for tasks that require a lot of computation power
def task1():
    for i in range(6):
        print(i)
        time.sleep(0.1)


def task2():
    for i in "PYTHON":
        print(i)
        time.sleep(0.1)

if __name__ == "__main__":
    # Create two processes, one for each task
    p1 = multiprocessing.Process(target=task1)
    p2 = multiprocessing.Process(target=task2)

    # Start the processes
    p1.start()
    p2.start()

    # The main process waits until both processes are complete
    p1.join()
    p2.join()

    print("Done!")

0
P
1
Y
2
T
3
H
4
O
5
N
Done!

In the above example, the two tasks run in separate processes, so the operating system can schedule them on different cores and execute them in parallel.

Advantages of processes

Processes can make better use of multi-core processors.
The entire program is not affected if a single process crashes.
Processes are better at handling CPU-intensive tasks.

Multi-processing usually leads to improved performance for such tasks. However, because processes do not share memory, they need some form of inter-process communication to exchange data, and in some cases its overhead can cause delays that cancel out the gains.
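
As one possible form of inter-process communication, the standard library provides multiprocessing.Queue. The sketch below assumes a hypothetical square_worker task: a child process computes squares and sends them back to the parent through the queue. It is a minimal illustration, not the only way to pass data between processes.

```python
import multiprocessing


def square_worker(numbers, results):
    """Compute squares in a child process and send them back through the queue."""
    for n in numbers:
        results.put((n, n * n))


def collect_squares(numbers):
    """Run square_worker in a separate process and gather its results."""
    results = multiprocessing.Queue()
    worker = multiprocessing.Process(target=square_worker, args=(numbers, results))
    worker.start()
    # Read exactly one result per input before joining, so the child is
    # never left waiting to flush queued data while we wait for it to exit.
    squares = dict(results.get() for _ in numbers)
    worker.join()
    return squares


if __name__ == "__main__":
    print(collect_squares([1, 2, 3]))
```

Every item placed on the queue is serialized in the child and deserialized in the parent, which is the kind of overhead that can outweigh the benefit when the work per item is small.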

Coroutine-based concurrency

Coroutines are functions whose execution can be suspended and then later resumed. This should sound familiar if you already have a basic knowledge of generator functions.
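
As a reminder of how suspension and resumption look in a generator function, here is a small sketch. The countdown function is a hypothetical example, not part of any library.

```python
def countdown(start):
    """Yield start, start - 1, ..., 1, pausing after each value."""
    while start > 0:
        yield start  # execution is suspended here until the next value is requested
        start -= 1


gen = countdown(3)
first = next(gen)      # runs the body until the first yield, then pauses
second = next(gen)     # resumes right after the yield and runs to the next one
remaining = list(gen)  # drains whatever the generator has left

print(first, second, remaining)
```

Between the calls to next, the generator keeps its local state (here, the value of start) but does no work, which is the same idea coroutines build on.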

Through coroutines, we can achieve concurrency in what is referred to as asynchronous programming. It is implemented in a single thread: the running coroutine hands control back to the event loop whenever it awaits an operation that is not yet complete, and the event loop then resumes another coroutine. This makes it possible for multiple tasks to be run alternately.

Python offers built-in support for asynchronous programming through the async and await keywords. The asyncio module in the standard library provides the necessary tools for implementing and managing asynchronous tasks.

import asyncio

# A task to print even numbers
async def task1():
    for i in range(10):
        if i % 2 == 0:
            print(i)
            await asyncio.sleep(0.1)  # Cause a small delay

# A task to print odd numbers
async def task2():
    for i in range(10):
        if i % 2 == 1:
            print(i)
            await asyncio.sleep(0.1)  # Cause a small delay

     
async def main():
    await asyncio.gather(task1(), task2())

if __name__ == "__main__":
    asyncio.run(main())
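
Coroutines can also return values. The following sketch, which assumes a hypothetical fetch coroutine standing in for an I/O-bound operation, shows asyncio.gather collecting the return values in the order the coroutines were passed, and measures how long the two overlapping waits take.

```python
import asyncio
import time


async def fetch(name, delay):
    """Hypothetical stand-in for an I/O-bound operation."""
    await asyncio.sleep(delay)  # control returns to the event loop while waiting
    return f"{name} finished"


async def main():
    start = time.perf_counter()
    # Both coroutines wait at the same time, so the delays overlap
    results = await asyncio.gather(fetch("a", 0.2), fetch("b", 0.2))
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = asyncio.run(main())
    print(results)
    print(f"elapsed: {elapsed:.2f}s")
```

The elapsed time should be close to 0.2 seconds rather than 0.4, since neither coroutine blocks the thread while it waits.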
