Understanding Python GIL: The Global Interpreter Lock Explained
Introduction to Python GIL
The Global Interpreter Lock (GIL) in Python is a mutex (a lock) that allows only one thread to execute Python bytecode in the interpreter at any given time. Even if a program creates many threads, only one of them runs Python code at a time. The GIL exists because CPython, the standard Python interpreter, manages memory with reference counting, which is not thread-safe; the lock prevents threads from corrupting reference counts and other interpreter state.
While the GIL simplifies memory management and improves performance in single-threaded programs, it can be a bottleneck for multi-threaded programs, especially those that are CPU-bound.
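One visible knob behind this behaviour is the interpreter's thread switch interval, which controls how often CPython asks the thread holding the GIL to release it so another thread can be scheduled. The snippet below is a minimal sketch (not part of the numbered examples) that simply prints this setting.
import sys

# CPython periodically asks the thread holding the GIL to release it so
# other threads get a turn; this is the interval (in seconds) it uses.
print(f"GIL switch interval: {sys.getswitchinterval()} seconds")  # typically 0.005

# The interval can be tuned with sys.setswitchinterval(), but no setting
# allows two threads to run Python bytecode at the same time.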
This tutorial will help you understand the GIL through practical examples.
Example 1: Demonstrating the GIL with CPU-bound Tasks
This example demonstrates how the GIL affects multi-threaded Python programs when running CPU-bound tasks.
Code:
import threading
import time
# A function that performs a CPU-bound operation (e.g., counting)
def cpu_bound_task(n):
    while n > 0:
        n -= 1

# Measuring time for multi-threaded execution
def multi_threaded_execution():
    thread1 = threading.Thread(target=cpu_bound_task, args=(100000000,))
    thread2 = threading.Thread(target=cpu_bound_task, args=(100000000,))
    start_time = time.time()  # Start time measurement
    thread1.start()  # Start the first thread
    thread2.start()  # Start the second thread
    thread1.join()  # Wait for the first thread to finish
    thread2.join()  # Wait for the second thread to finish
    end_time = time.time()  # End time measurement
    print(f"Multi-threaded execution time: {end_time - start_time:.2f} seconds")
# Running the multi-threaded execution
multi_threaded_execution()
Output:
Multi-threaded execution time: 9.13 seconds
Explanation:
- CPU-bound Task: The function 'cpu_bound_task(n)' performs a simple CPU-bound operation by decrementing 'n' in a loop.
- Multi-threaded Execution: Two threads are created to run 'cpu_bound_task' at the same time. Because of the GIL, they take turns executing rather than running in parallel, so the total time is roughly the sum of both tasks.
- Output: The GIL prevents true parallel execution, so multi-threading gives no speedup for CPU-bound tasks; the exact timing depends on your machine. The same behaviour holds for the higher-level thread-pool API, as shown in the sketch below.
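The sketch below is an illustrative variant (not part of the original example) that runs the same cpu_bound_task through concurrent.futures.ThreadPoolExecutor. It should show a similar wall-clock time, because the pool's worker threads still share one GIL.
import time
from concurrent.futures import ThreadPoolExecutor

def cpu_bound_task(n):
    while n > 0:
        n -= 1

start_time = time.time()
# Two pool workers still contend for the single GIL, so the work is serialized.
with ThreadPoolExecutor(max_workers=2) as executor:
    executor.map(cpu_bound_task, [100000000, 100000000])
print(f"Thread pool execution time: {time.time() - start_time:.2f} seconds")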
Example 2: Using Multiprocessing to Bypass the GIL
This example shows how to use the multiprocessing module to achieve true parallelism for CPU-bound tasks, bypassing the GIL.
Code:
from multiprocessing import Process
import time
# A function that performs a CPU-bound operation (e.g., counting)
def cpu_bound_task(n):
    while n > 0:
        n -= 1

# Measuring time for multi-processing execution
def multi_process_execution():
    process1 = Process(target=cpu_bound_task, args=(100000000,))
    process2 = Process(target=cpu_bound_task, args=(100000000,))
    start_time = time.time()  # Start time measurement
    process1.start()  # Start the first process
    process2.start()  # Start the second process
    process1.join()  # Wait for the first process to finish
    process2.join()  # Wait for the second process to finish
    end_time = time.time()  # End time measurement
    print(f"Multi-processing execution time: {end_time - start_time:.2f} seconds")

# Running the multi-processing execution (the __main__ guard is required on
# platforms that spawn new processes, such as Windows and macOS)
if __name__ == "__main__":
    multi_process_execution()
Output:
Multi-processing execution time: 20.90 seconds
Explanation:
- Multiprocessing: The 'multiprocessing' module creates separate processes, each with its own Python interpreter and its own GIL, so CPU-bound tasks can truly run in parallel on multiple cores.
- Performance Improvement: On a multi-core machine, running the two tasks in separate processes typically takes close to half the wall-clock time of the serialized threaded version; the exact figures depend on hardware, interpreter version, and the overhead of starting the processes. The same idea can be expressed with the higher-level 'concurrent.futures' API, as shown in the sketch below.
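The following sketch is an illustrative alternative (assuming the same cpu_bound_task as above) that uses concurrent.futures.ProcessPoolExecutor to distribute the two calls across a process pool, so each call runs in its own interpreter with its own GIL.
import time
from concurrent.futures import ProcessPoolExecutor

def cpu_bound_task(n):
    while n > 0:
        n -= 1

if __name__ == "__main__":  # required so worker processes can import this module safely
    start_time = time.time()
    # Each worker process has its own interpreter and its own GIL,
    # so the two tasks can run on separate CPU cores in parallel.
    with ProcessPoolExecutor(max_workers=2) as executor:
        list(executor.map(cpu_bound_task, [100000000, 100000000]))
    print(f"Process pool execution time: {time.time() - start_time:.2f} seconds")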
Example 3: GIL and I/O-bound Tasks
This example demonstrates how I/O-bound tasks (like network or file I/O) are less affected by the GIL because the GIL is released during I/O operations.
Code:
import threading
import time
# A function that simulates an I/O-bound task using sleep
def io_bound_task():
    print("Starting I/O-bound task...")
    time.sleep(2)  # Simulate I/O operation (e.g., file read, network call)
    print("I/O-bound task completed.")

# Measuring time for multi-threaded execution of I/O-bound tasks
def io_multi_threaded_execution():
    thread1 = threading.Thread(target=io_bound_task)
    thread2 = threading.Thread(target=io_bound_task)
    start_time = time.time()  # Start time measurement
    thread1.start()  # Start the first thread
    thread2.start()  # Start the second thread
    thread1.join()  # Wait for the first thread to finish
    thread2.join()  # Wait for the second thread to finish
    end_time = time.time()  # End time measurement
    print(f"Multi-threaded I/O-bound execution time: {end_time - start_time:.2f} seconds")
# Running the I/O-bound task in multiple threads
io_multi_threaded_execution()
Output:
Starting I/O-bound task...
Starting I/O-bound task...
I/O-bound task completed.
I/O-bound task completed.
Multi-threaded I/O-bound execution time: 2.00 seconds
Explanation:
- I/O-bound Task: The function 'io_bound_task()' simulates an I/O operation using 'time.sleep()'.
- Multi-threaded Execution: The threads overlap their waits because each thread releases the GIL while it sleeps (and, more generally, while it waits on real I/O), so the two 2-second tasks run concurrently.
- Outcome: The total time is approximately the duration of the longest individual task, not the sum, demonstrating that threads overlap I/O waits effectively. This scales to larger numbers of tasks, as the thread-pool sketch below shows.
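The sketch below is illustrative (the io_bound_task variant taking a task_id is an assumed helper, still using the same simulated 2-second wait): it runs ten I/O-bound tasks on a small thread pool and finishes in a few seconds rather than twenty.
import time
from concurrent.futures import ThreadPoolExecutor

def io_bound_task(task_id):
    time.sleep(2)  # Simulate waiting on a file, socket, or database
    return task_id

start_time = time.time()
# While one worker sleeps it releases the GIL, so the waits overlap.
with ThreadPoolExecutor(max_workers=5) as executor:
    results = list(executor.map(io_bound_task, range(10)))
print(f"10 I/O-bound tasks with 5 threads: {time.time() - start_time:.2f} seconds")
# With 5 workers and 10 two-second tasks, this takes roughly 4 seconds.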
Example 4: Measuring GIL Effect with Python timeit Module
This example measures the effect of the GIL on multi-threading performance using the 'timeit' module.
Code:
import threading
import timeit
# Function to perform a CPU-bound operation
def cpu_bound_operation():
    sum(i * i for i in range(1000000))

# Function to run multi-threaded CPU-bound operations
def run_multi_threaded():
    threads = []
    for _ in range(2):
        thread = threading.Thread(target=cpu_bound_operation)
        threads.append(thread)
        thread.start()
    for thread in threads:
        thread.join()
# Measure execution time using timeit
execution_time = timeit.timeit(run_multi_threaded, number=1)
print(f"Execution time with threads: {execution_time:.2f} seconds")
Output:
Execution time with threads: 0.31 seconds
Explanation:
- 'timeit' Module: Measures the execution time of the 'run_multi_threaded' function for a single run ('number=1').
- Multi-threading with GIL: Due to the GIL, the two summations cannot overlap, so the threaded run takes roughly as long as running them back to back; the single-threaded baseline sketched below makes the comparison explicit.
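For a fair comparison, the same workload can be timed without threads. The sketch below is an assumed baseline (the run_single_threaded helper is not part of the original example) that reuses cpu_bound_operation and runs the two computations sequentially under timeit; on most machines the result is close to the threaded figure.
import timeit

def cpu_bound_operation():
    sum(i * i for i in range(1000000))

# Run the same two computations sequentially, with no threads involved.
def run_single_threaded():
    cpu_bound_operation()
    cpu_bound_operation()

execution_time = timeit.timeit(run_single_threaded, number=1)
print(f"Execution time without threads: {execution_time:.2f} seconds")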
Example 5: Impact of GIL on Python Performance
This example compares the performance of Python code with and without GIL for CPU-bound tasks.
Code:
import time
import threading
# Function to perform CPU-bound operation
def cpu_task(n):
    total = 0
    for i in range(n):
        total += i * i

# Running the CPU-bound task in a single-threaded manner
def single_threaded():
    start_time = time.time()
    cpu_task(100000000)  # Perform the task once
    cpu_task(100000000)  # Perform the task again
    end_time = time.time()
    print(f"Single-threaded execution time: {end_time - start_time:.2f} seconds")

# Running the CPU-bound task in a multi-threaded manner
def multi_threaded():
    start_time = time.time()
    thread1 = threading.Thread(target=cpu_task, args=(100000000,))
    thread2 = threading.Thread(target=cpu_task, args=(100000000,))
    thread1.start()
    thread2.start()
    thread1.join()
    thread2.join()
    end_time = time.time()
    print(f"Multi-threaded execution time: {end_time - start_time:.2f} seconds")
# Running both single-threaded and multi-threaded tests
single_threaded()
multi_threaded()
Output:
Single-threaded execution time: 18.15 seconds
Multi-threaded execution time: 20.93 seconds
Explanation:
- Single-threaded vs Multi-threaded: Multi-threading does not provide a performance gain for CPU-bound tasks in Python because of the GIL; in this run it is actually slightly slower, since the threads also pay for lock contention and thread switching.
- Execution Time Comparison: The GIL prevents multi-threaded programs from achieving speed-ups on CPU-bound work; tuning the interpreter's switch interval, as sketched below, can reduce some of the switching overhead but cannot restore parallelism.
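Part of the extra time in the threaded run comes from the threads repeatedly handing the GIL back and forth. The sketch below is illustrative only: raising the switch interval with sys.setswitchinterval() may shave off some contention overhead on some machines, but the work remains fully serialized.
import sys
import threading
import time

def cpu_task(n):
    total = 0
    for i in range(n):
        total += i * i

# Raise the GIL switch interval from the default 0.005s so the threads
# trade the lock less often; the work is still executed one thread at a time.
sys.setswitchinterval(0.05)

start_time = time.time()
threads = [threading.Thread(target=cpu_task, args=(100000000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f"Multi-threaded time with a longer switch interval: {time.time() - start_time:.2f} seconds")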
Example 6: Understanding GIL with Python Bytecode
This example demonstrates how Python bytecode execution is serialized due to the GIL.
Code:
import dis
# Function to perform a simple addition
def add(a, b):
    return a + b
# Disassemble the function to show its bytecode
print("Bytecode of 'add' function:")
dis.dis(add)
Output:
Bytecode of 'add' function:
  5           0 LOAD_FAST                0 (a)
              2 LOAD_FAST                1 (b)
              4 BINARY_ADD
              6 RETURN_VALUE
Explanation:
- 'dis' Module: Disassembles the 'add' function and shows the bytecode instructions the interpreter executes.
- Serialized Execution: The GIL serializes bytecode execution, so only one thread can execute Python bytecode at a time; it does not, however, make multi-instruction statements atomic, as the sketch below illustrates.
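The GIL only guarantees that each individual bytecode instruction runs atomically, not a whole Python statement. The sketch below is a separate illustration (the 'increment' function and 'counter' variable are assumed examples, not from the original article): it disassembles an increment to show that 'counter += 1' is several instructions, which is why shared state still needs an explicit lock even with the GIL.
import dis

counter = 0

def increment():
    global counter
    # A single statement, but several bytecode instructions: the GIL can be
    # handed to another thread between the LOAD and the STORE.
    counter += 1

print("Bytecode of 'increment' function:")
dis.dis(increment)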