Azure Database Support Blog

Lesson Learned #510: Using cProfile to Analyze Python Call Performance in Support Scenarios

Jose_Manuel_Jurado
Oct 18, 2024

Last week, while working on a support case, I helped a customer who was facing performance issues in their Python application.

After some investigation, I decided to suggest cProfile to identify which function calls were taking the most time and to use that as a starting point for troubleshooting.

Profiling the Python code was essential to pinpoint the bottleneck. cProfile is a built-in Python module that records how often each function is called and how long it runs, which makes it straightforward to identify performance issues.
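For anyone trying the module for the first time, the quickest way to use cProfile is its one-shot run() helper. Here is a minimal, self-contained example; slow_function is just a dummy workload I made up for illustration:

import cProfile
import time

def slow_function():
    # Stand-in workload: sleep briefly to simulate a slow call
    time.sleep(0.5)

# Profile the call and print the stats sorted by cumulative time
cProfile.run('slow_function()', sort='cumulative')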

 

But first, I ran some examples with my own test code to see what the results would look like.

In this code snippet:

  • RunCommandTimeout is a test function of mine that triggers command timeouts against a SQL database, with a configurable retry mechanism.
  • cProfile.Profile() starts profiling the code within the context manager.
  • After execution, pstats.Stats(profiler) helps us visualize the most time-consuming calls, sorting them by cumulative time.
import cProfile
import pstats

# Profile the test workload
with cProfile.Profile() as profiler:
    RunCommandTimeout(initial_timeout=1, loop_count=5, retry=True, retry_count=3, retry_increment=4)

# Build the stats outside the profiled block and print the ten most expensive calls
results = pstats.Stats(profiler)
results.strip_dirs().sort_stats('cumulative').print_stats(10)
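If you want to analyze the results later, or on another machine, the same Stats object can be dumped to a file and reloaded with pstats; the file name below is just an example:

results.dump_stats('profile_output.prof')   # save the raw profile data

# ... later, or in a separate script:
import pstats
stats = pstats.Stats('profile_output.prof')
stats.strip_dirs().sort_stats('cumulative').print_stats(10)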
         

 

With this, I was able to track the exact function calls and the time spent in each one, for example:

  • Function calls and their frequency (ncalls)
  • Time spent inside each function itself, excluding sub-calls (tottime)
  • Cumulative time taken by functions, including sub-calls (cumtime), which is useful for spotting bottlenecks

Here is the top of the output for my test run:

   Ordered by: cumulative time
   List reduced from 181 to 10 due to restriction <10>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.357    0.357   84.706   84.706 GiveNotesPerformance.py:240(RunCommandTimeout)
        7   54.108    7.730   54.108    7.730 {method 'execute' of 'pyodbc.Cursor' objects}
        3   27.001    9.000   27.001    9.000 {built-in method time.sleep}
        4    0.001    0.000    3.025    0.756 GiveNotesPerformance.py:63(ConnectToTheDB)
        4    2.945    0.736    2.951    0.738 {built-in method pyodbc.connect}
        1    0.209    0.209    0.209    0.209 {method 'close' of 'pyodbc.Connection' objects}
       29    0.035    0.001    0.035    0.001 {method 'flush' of '_io.TextIOWrapper' objects}
       12    0.000    0.000    0.032    0.003 GiveNotesPerformance.py:526(set_text_color)
        4    0.001    0.000    0.030    0.007 GiveNotesPerformance.py:39(get_credentials_from_file)
        4    0.027    0.007    0.027    0.007 {built-in method io.open}

 

Analyzing the profiling results, sorted by cumulative time, I found the following:

  • The execute method of pyodbc.Cursor was called 7 times with a total time of 54 seconds, so it is responsible for the majority of the time spent in this execution. Each call takes an average of 7.73 seconds, suggesting that the database queries are the primary bottleneck.
  • time.sleep was called to introduce delays, likely as part of the retry mechanism built into RunCommandTimeout. Each sleep call lasted an average of 9 seconds.
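The actual RunCommandTimeout implementation isn't shown here, but a retry loop that produces this kind of profile, where cursor.execute dominates and time.sleep shows up between attempts, might look roughly like the sketch below. The connection string, query, and all names and parameters are illustrative assumptions, not the real code:

import time
import pyodbc

def run_command_with_retry(conn_str, query, initial_timeout=1, retry_count=3, retry_increment=4):
    # Hypothetical sketch: run a query, widening the timeout and sleeping between retries
    timeout = initial_timeout
    for attempt in range(retry_count + 1):
        try:
            conn = pyodbc.connect(conn_str, timeout=timeout)  # login timeout, in seconds
            conn.timeout = timeout                            # per-query timeout, in seconds
            cursor = conn.cursor()
            cursor.execute(query)                             # the call that dominated cumtime above
            rows = cursor.fetchall()
            conn.close()
            return rows
        except pyodbc.Error:
            if attempt == retry_count:
                raise                                         # out of retries, surface the error
            time.sleep(timeout)                               # back off; shows up as time.sleep in the profile
            timeout += retry_increment                        # give the next attempt more room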

 

Enjoy!
