A Yogi's Guide to Debug Python Programs

There are many resources on how to write code, but not many on how to debug it. In this article, I highlight my approach to debugging both synchronous and asynchronous Python programs.

Approach to debugging

Using an IDE

Running a program in debug mode in an IDE such as PyCharm or VS Code is the easiest way to debug if you like IDE-assisted features. The debugger shows the state of objects and the attributes available on them, which makes debugging easier. You can set a breakpoint on any line, start the debugger, and step through the code until the bug is found and fixed.

The print statements or logging module

When I started programming, I mostly used good old print statements and/or the logging module to see the state of the program at each line. I used to insert print or logging statements, often with funny messages, at random places to track down problems.

The best debugging tool is still careful thought, coupled with judiciously placed print statements.
- Brian W. Kernighan
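
As a middle ground between throwaway print calls and a full debugger, the standard logging module can record the same state with levels, timestamps, and line numbers. The sketch below is only illustrative: the apply_discount function and its values are hypothetical and exist just to show where the log calls go.

```python
import logging

# Configure once, near the program's entry point. DEBUG shows everything;
# raise the level to WARNING to silence the debug lines without deleting them.
logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s %(levelname)s %(funcName)s:%(lineno)d %(message)s",
)
logger = logging.getLogger(__name__)


def apply_discount(price, percent):
    """Hypothetical function used only to illustrate logging state."""
    logger.debug("called with price=%r percent=%r", price, percent)
    discounted = price - price * percent / 100
    logger.debug("discounted=%r", discounted)
    return discounted


total = apply_discount(200, 15)
```

Unlike print, these lines can stay in the code: changing the level in one place turns them off.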

Rubber Duck

The gist of this debugging approach is to verbally explain the problem that you are facing to someone else, whether that is a rubber duck or a colleague (if you are lucky). While explaining the problem to others, our own understanding also gets better, which helps in connecting the dots required to solve it.

REPL

The REPL (Read-Eval-Print Loop) is the interactive interpreter console, where you type Python statements directly and see the result immediately. This approach saves time because you can evaluate individual statements rather than executing a whole Python file. The REPL is most useful when working with standard modules or checking how functions on common data types behave, which makes it a convenient way to understand the underlying operations.

Using dir to look up the attributes available on modules, objects, and data types is still my favorite thing. The REPL session below shows dir applied to the string module and to a set.

>>> import string
>>> dir(string)
['Formatter', 'Template', '_ChainMap', '__all__', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', '_re', '_sentinel_dict', '_string', 'ascii_letters', 'ascii_lowercase', 'ascii_uppercase', 'capwords', 'digits', 'hexdigits', 'octdigits', 'printable', 'punctuation', 'whitespace']
>>> a={1,2,3}
>>> dir(a)
['__and__', '__class__', '__class_getitem__', '__contains__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__iand__', '__init__', '__init_subclass__', '__ior__', '__isub__', '__iter__', '__ixor__', '__le__', '__len__', '__lt__', '__ne__', '__new__', '__or__', '__rand__', '__reduce__', '__reduce_ex__', '__repr__', '__ror__', '__rsub__', '__rxor__', '__setattr__', '__sizeof__', '__str__', '__sub__', '__subclasshook__', '__xor__', 'add', 'clear', 'copy', 'difference', 'difference_update', 'discard', 'intersection', 'intersection_update', 'isdisjoint', 'issubset', 'issuperset', 'pop', 'remove', 'symmetric_difference', 'symmetric_difference_update', 'union', 'update']
>>> b={3,4,5}
>>> a.intersection(b)
{3}
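
The raw dir output is dominated by dunder names, so it can help to filter it. The snippet below is a small sketch of that idea; public_names is a hypothetical helper of my own, not something from the standard library, while inspect.getdoc is the standard way to fetch a docstring.

```python
import inspect


def public_names(obj):
    """Return the attribute names of obj that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith("_")]


a = {1, 2, 3}

# Only the public methods, such as 'add', 'intersection', and 'union'.
print(public_names(a))

# The docstring of a single method, similar to what help() would show.
print(inspect.getdoc(a.intersection))
```

The same helper works on modules and on any other object you are inspecting at the prompt.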

The Python debugger or pdb

This is what I have used most in my software engineering career to debug Python programs. The pdb module (or the built-in breakpoint() function, available since Python 3.7) temporarily stops the execution of a program and lets you interact with its state. You can insert a breakpoint on any line and step to the next statement to find and fix problems. Combining the pdb prompt with dir is a match made in heaven when it comes to debugging. The Python debugger (pdb) has a set of commands, such as n (next), c (continue), and l (list), that you can use at the debugging prompt; the pdb documentation lists them all.

❯ python test_requests.py
> /tmp/test_requests.py(6)<module>()
-> print(response.text)
(Pdb) l
  1      import requests
  2
  3      response = requests.get('https://example.org/')
  4
  5      import pdb; pdb.set_trace()
  6  ->    print(response.text)
[EOF]
(Pdb) dir(response)
['__attrs__', '__bool__', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__enter__', '__eq__', '__exit__', '__format__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__nonzero__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_content', '_content_consumed', '_next', 'apparent_encoding', 'close', 'connection', 'content', 'cookies', 'elapsed', 'encoding', 'headers', 'history', 'is_permanent_redirect', 'is_redirect', 'iter_content', 'iter_lines', 'json', 'links', 'next', 'ok', 'raise_for_status', 'raw', 'reason', 'request', 'status_code', 'text', 'url']
(Pdb) response.headers
{'Content-Encoding': 'gzip', 'Accept-Ranges': 'bytes', 'Age': '411952', 'Cache-Control': 'max-age=604800', 'Content-Type': 'text/html; charset=UTF-8', 'Date': 'Sun, 10 Sep 2023 19:57:02 GMT', 'Etag': '"3147526947+gzip"', 'Expires': 'Sun, 17 Sep 2023 19:57:02 GMT', 'Last-Modified': 'Thu, 17 Oct 2019 07:18:26 GMT', 'Server': 'ECS (nyb/1D07)', 'Vary': 'Accept-Encoding', 'X-Cache': 'HIT', 'Content-Length': '648'}
(Pdb)

The pdbpp and ipdb packages, available on PyPI, enhance the debugging experience further.
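
On Python 3.7 and later, the built-in breakpoint() can replace the import pdb; pdb.set_trace() line. The sketch below assumes a hypothetical average function and a debug parameter that I added purely for illustration, so that the script runs straight through unless you ask it to stop.

```python
def average(values, debug=False):
    """Hypothetical function: return the mean of a non-empty list."""
    total = sum(values)
    if debug:
        # Pauses here and opens the (Pdb) prompt; try `p total`, `n`, or `c`.
        breakpoint()
    return total / len(values)


# Runs without stopping. Call average([2, 4, 6], debug=True) to pause inside.
result = average([2, 4, 6])
print(result)
```

Setting the environment variable PYTHONBREAKPOINT=0 turns every breakpoint() call into a no-op, which is handy if one is left in the code by accident.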

Traceback module

I have used the standard traceback module to figure out the sequence of function calls leading to an exception. It aids debugging by displaying the call stack, line numbers, and the source of the error. It is especially useful when an exception is caught and handled without exposing many details of the error.

import requests
import traceback

def print_response():
    try:
        response = requests.get('https://thisdoesnot.exist/')
    except Exception:
        # Print the full stack trace even though the exception is handled here.
        traceback.print_exc()
        print("Request failed")
        return
    print(response.text)

def main():
    print("inside main")
    print_response()

main()
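
When the stack trace should go to a log file or an error report rather than to stderr, traceback.format_exc() returns the same text as a string. Here is a minimal sketch without any network call; parse_port and load_config are hypothetical functions made up for the example.

```python
import traceback


def parse_port(value):
    """Hypothetical helper that fails on non-numeric input."""
    return int(value)


def load_config():
    """Hypothetical caller, present only to add a frame to the call stack."""
    return parse_port("not-a-number")


try:
    load_config()
except ValueError:
    # format_exc() returns the text that print_exc() would write to stderr.
    tb_text = traceback.format_exc()

print(tb_text)
```

The captured text names both load_config and parse_port, so the path to the failure stays visible even though the exception was handled.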

AI-Assisted debugging

LLM tools like ChatGPT, GitHub Copilot, Codium AI, etc. are quite good at explaining and even generating code. They can be leveraged while debugging, as they can sometimes provide valuable insights into the bug or issue you are facing.

Debugging Asynchronous Python Programs

Debugging synchronous programs is hard, but debugging asynchronous programs is harder.
As mentioned in the Python documentation, we can enable asyncio debug mode to make debugging asynchronous programs easier.

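Ways to enable asyncio debug mode:

  • Setting the PYTHONASYNCIODEBUG environment variable to 1

  • Using the Python Development Mode (python -X dev)

  • Passing debug=True to asyncio.run()

  • Calling loop.set_debug()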

These are the benefits of using asyncio debug mode:

  • Finds coroutines that were never awaited and logs them

  • Logs the execution time of the I/O selector if an I/O operation takes too long

  • Logs callbacks that take more than 100 ms by default. We can change this threshold with loop.slow_callback_duration, which sets the minimum duration in seconds for a callback to be considered slow
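
As a rough illustration of debug mode in action, the sketch below passes debug=True to asyncio.run() and lowers loop.slow_callback_duration. The blocking_step coroutine is a hypothetical example of a common mistake (calling time.sleep inside a coroutine), and the 0.05 second threshold is an arbitrary value chosen for the demo.

```python
import asyncio
import time


async def blocking_step():
    # Deliberate mistake: time.sleep() blocks the whole event loop.
    time.sleep(0.2)


async def main():
    loop = asyncio.get_running_loop()
    # Lower the threshold (default 0.1 s) so shorter stalls are reported too.
    loop.slow_callback_duration = 0.05
    await blocking_step()
    return loop.get_debug()


# debug=True turns on asyncio debug mode for this run only.
debug_enabled = asyncio.run(main(), debug=True)
print(debug_enabled)
```

With debug mode on, the asyncio logger should emit a warning that the task took roughly 0.2 seconds to execute, which points directly at the blocking call.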

In addition to the debug mode above, there are tools like aiomonitor, which inspects the asyncio loop and provides debugging capabilities. It exposes a telnet server that provides REPL capabilities for inspecting the state of an asyncio application. It can also be used to debug async programs inside a Docker container or on a remote server.

Whatever the bug, debugging always requires mental clarity, just like a yogi's. Don't stress, and happy debugging!! 🐍

References:

  1. Developing with asyncio - Python documentation

  2. Python Development Mode - Python documentation

  3. aiomonitor’s documentation

Feel free to comment, and if you learned something from this article, please share it with your friends.

You may also follow me on LinkedIn and Twitter.

Support Shiva Gaire by becoming a sponsor. Any amount is appreciated!