Python: Profile a section of code with cProfile

Artist’s rendition of your program’s call tree.

When trying to improve a slow function or module, it’s always a good idea to profile it. Here’s a snippet for quickly profiling a section of code with Python’s cProfile module, in two flavours. It’s adapted from the cProfile documentation’s Profile example. I have used versions of this snippet over the years to narrow in on performance issues.

Vanilla version

This version is good for profiling module-level code. Paste these two blocks around the target code:

import cProfile, pstats

profiler = cProfile.Profile()
profiler.enable()

...

profiler.disable()
pstats.Stats(profiler).sort_stats(pstats.SortKey.CUMULATIVE).print_stats(100)

The first block creates and starts a profiler. The second block stops it and prints the top 100 functions, sorted by cumulative call time.

For example, to profile some class definitions:

from django.db import models

import cProfile, io, pstats

profiler = cProfile.Profile()
profiler.enable()


class DanceKind(models.IntegerChoices):
    FANDANGO = 1
    TANGO = 2


class Dance(models.Model):
    kind = models.IntegerField(choices=DanceKind.choices)


profiler.disable()
out = io.StringIO()
pstats.Stats(profiler).sort_stats(pstats.SortKey.CUMULATIVE).print_stats(100)

When the module imports, you’ll see output from the print_stats() call:

$ python manage.py shell
      18305 function calls (17540 primitive calls) in 0.044 seconds

Ordered by: cumulative time
List reduced from 552 to 100 due to restriction <100>

ncalls ... cumtime  percall filename:lineno(function)
  97/2 ...   0.044    0.022 {built-in method builtins.__build_class__}
     1 ...   0.043    0.043 /.../django/db/models/base.py:95(__new__)
     6 ...   0.043    0.007 /.../django/db/models/base.py:369(add_to_class)
     1 ...   0.043    0.043 /.../django/db/models/options.py:175(contribute_to_class)
     1 ...   0.043    0.043 /.../django/utils/connection.py:14(__getattr__)
     1 ...   0.043    0.043 /.../django/utils/connection.py:56(__getitem__)
     1 ...   0.043    0.043 /.../django/db/utils.py:191(create_connection)
     1 ...   0.043    0.043 /.../django/db/utils.py:103(load_backend)
     1 ...   0.043    0.043 /.../importlib/__init__.py:108(import_module)
   2/1 ...   0.043    0.043 <frozen importlib._bootstrap>:1192(_gcd_import)
...

The table shows called functions identified by their filename, line number, and name. They are sorted by cumulative time (“cumtime”), the total amount of time spent in that function (including its callees), longest first.

Note that if the section between the two blocks raises an exception, the profiler will be left running. This is not normally a concern since module-level exceptions normally stop the program.

Context manager and decorator version

This version is good for profiling a function or part of a function. By using @contextlib.contextmanager, it can be used both as a context manager and a decorator.

First, paste this definition within your target module:

from contextlib import contextmanager


@contextmanager
def profile_and_print():
    import cProfile, pstats

    profiler = cProfile.Profile()
    profiler.enable()

    try:
        yield
    finally:
        profiler.disable()

    pstats.Stats(profiler).sort_stats(pstats.SortKey.CUMULATIVE).print_stats(100)

By using try / finally, the profiler is always disabled if an exception occurs. The profile is only printed on success, though you can change this by indenting the final line.

To use it as a function decorator:

from django.shortcuts import render

from example.models import Dance

...


@profile_and_print()
def index(request):
    context = {
        "top_dancers": get_top_dancers(),
        "total_dances": Dance.objects.count(),
    }
    return render(request, "index.html", context)

And as a context manager:

def index(request):
    context = {
        "top_dancers": get_top_dancers(),
        "total_dances": Dance.objects.count(),
    }
    with profile_and_print():
        return render(request, "index.html", context)

In either case, running the profiled block of code will again print out the profile for that section:

      4092 function calls (4012 primitive calls) in 0.008 seconds

Ordered by: cumulative time
List reduced from 376 to 100 due to restriction <100>

ncalls ... cumtime  percall filename:lineno(function)
     1 ...   0.008    0.008 /.../django/shortcuts.py:17(render)
     1 ...   0.007    0.007 /.../django/template/loader.py:52(render_to_string)
     1 ...   0.006    0.006 /.../django/template/loader.py:5(get_template)
...

Possible adjustments

Here are a few ways you might want to modify this snippet.

Sorting

To sort by other metrics, swap SortKey.CUMULATIVE for another value from this table. The most useful are:

  • SortKey.TIME for most internal time, which is time spent within the function not including calls to other functions. The slowest functions are often easy targets for optimization.
  • SortKey.CALLS for number of calls. There may be ways to eliminate repeat function calls, for example with caching.

Display

By default, the Stats class prints to standard out. To print to standard error, pass stream=sys.stderr when creating it:

pstats.Stats(profiler, stream=sys.stderr).sort_stats(
    pstats.SortKey.CUMULATIVE,
).print_stats(100)

To save to a file, swap print_stats() for dump_stats():

pstats.Stats(profiler).sort_stats(
    pstats.SortKey.CUMULATIVE,
).dump_stats("profile.stats")

You can use this file with a visualizer like tuna.

Fin

May your profiling give fulfilling results,

—Adam


Read my book Boost Your Git DX to Git better.


Subscribe via RSS, Twitter, Mastodon, or email:

One summary email a week, no spam, I pinky promise.

Related posts:

Tags: