Python DevCenter
oreilly.comSafari Books Online.Conferences.

advertisement


Profiling and Optimizing Python

by Jeremy Jones
12/15/2005

Most of the time, code that we write doesn't have to perform as fast as if we wrote it in C. Most of the time, the first pass at writing it is "fast enough" and we don't have to optimize--but there are times when a piece of code just has to meet a certain standard of performance. For those "it's gotta run like a scalded dog" moments, the Python profiler may lend some assistance.

The Python standard library provides three libraries for performing code profiling: profile, hotshot, and timeit. The type of profiling performed in these modules involves tracing method and function calls and recording how long they took to run. profile provides detailed profiling information on all functions and methods in a code base, which are called at a specific runtime. hotshot is a replacement for the profile module. timeit provides less detailed profiling information for smaller pieces of code, where "less detailed" comprises overall execution time.

Profiling Code

One of the frequently repeated mantras I hear from Python veterans is, "Premature optimization is the root of all evil" (a quote from Donald Knuth, which apparently originated from Tony Hoare). The Pythonic approach to writing code that performs acceptably is:

  • Write the code.
  • Test it to see whether performance is OK right off the bat.
  • If so, you're done.
  • If not, optionally try Psyco.
  • Profile it and identify the bottlenecks.
  • Modify the code until the performance is as fast as you need, rewriting bottlenecks in C (or Pyrex) if necessary.

The example code I use in these examples comes from ediplex. ediplex is an open source project centered around parsing and handling EDI data. For people who don't know, EDI stands for Electronic Data Interchange and is a generic term used as a general umbrella for the means used to trade electronic business documents between two business partners. I profiled my EDI parser with hotshot by putting this code in the if __name__ == '__main__': block of the main entry point module, aggregate_parser.py:

import hotshot
prof = hotshot.Profile("hotshot_edi_stats")
prof.runcall(main)
prof.close()

calling aggregate_parser.py from the command line and passing the appropriate parameters. Interestingly, running this with hotshot takes about 25 seconds, while running it without a profiler takes a little over 3 seconds. Here is a run on the command line without the profiler, tracking the time with the Unix time utility:

jmjones@qiwi 10:10PM edi % time python \
    ~/svn/ediplex/src/ediplex/parser/aggregate_parser.py \
    edi_1000_interchanges.txt
python ~/svn/ediplex/src/ediplex/parser/aggregate_parser.py
    3.09s user 0.08s system 93% cpu 3.380 total

Here is a run on the command line with the profiler:

jmjones@qiwi 10:54PM edi % time python \
    ~/svn/ediplex/src/ediplex/parser/aggregate_parser.py \
    ~/svn/home/source/edi/edi_1000_interchanges.txt
python ~/svn/ediplex/src/ediplex/parser/aggregate_parser.py
    4.11s user 20.99s system 98% cpu 25.381 total

This run with the hotshot profiler created a log file (hotshot_edi_stats) with profiling information of each called method. Here is an IPython session showing how to get the profiling information:

In [1]: from hotshot import stats

In [2]: s = stats.load("hotshot_edi_stats")

In [3]: s.sort_stats("time").print_stats()
         286016 function calls in 3.359 CPU seconds

   Ordered by: internal time


   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   134000    1.455    0.000    2.104    0.000 /home/jmjones/svn/ediplex/src/ediplex/parser/x12_parser.py:97(body_seg)
        1    0.942    0.942    3.358    3.358 /home/jmjones/svn/ediplex/src/ediplex/parser/util/state_machine.py:22(run)
   136000    0.658    0.000    0.658    0.000 /home/jmjones/svn/ediplex/src/ediplex/parser/example/edi_handler.py:30(segment)
     1000    0.203    0.000    0.250    0.000 /home/jmjones/svn/ediplex/src/ediplex/parser/x12_parser.py:29(header_seg)
     1000    0.034    0.000    0.057    0.000 /home/jmjones/svn/ediplex/src/ediplex/parser/x12_parser.py:135(end_seg)
     7000    0.032    0.000    0.032    0.000 /usr/lib/python2.4/string.py:248(strip)
     2000    0.009    0.000    0.009    0.000 /usr/lib/python2.4/string.py:281(split)
     1001    0.006    0.000    0.006    0.000 /home/jmjones/svn/ediplex/src/ediplex/parser/aggregate_parser.py:28(searching_header)
     1000    0.005    0.000    0.005    0.000 /home/jmjones/svn/ediplex/src/ediplex/parser/example/edi_handler.py:24(__init__)
     1000    0.005    0.000    0.005    0.000 /home/jmjones/svn/ediplex/src/ediplex/parser/example/edi_handler.py:26(start_interchange)
     1000    0.005    0.000    0.005    0.000 /home/jmjones/svn/ediplex/src/ediplex/parser/example/edi_handler.py:28(end_interchange)
     1000    0.005    0.000    0.005    0.000 /home/jmjones/svn/ediplex/src/ediplex/parser/aggregate_parser.py:62(append_interchange)
        1    0.000    0.000    3.359    3.359 /home/jmjones/svn/ediplex/src/ediplex/parser/aggregate_parser.py:51(run)
        1    0.000    0.000    3.359    3.359 /home/jmjones/svn/ediplex/src/ediplex/parser/aggregate_parser.py:74(main)
        6    0.000    0.000    0.000    0.000 /home/jmjones/svn/ediplex/src/ediplex/parser/util/state_machine.py:14(add_state)
        1    0.000    0.000    0.000    0.000 /home/jmjones/svn/ediplex/src/ediplex/parser/x12_parser.py:25(add_transitions)
        1    0.000    0.000    0.000    0.000 /home/jmjones/svn/ediplex/src/ediplex/parser/x12_parser.py:10(__init__)
        1    0.000    0.000    0.000    0.000 /home/jmjones/svn/ediplex/src/ediplex/parser/aggregate_parser.py:17(__init__)
        1    0.000    0.000    0.000    0.000 /home/jmjones/svn/ediplex/src/ediplex/parser/util/state_machine.py:9(__init__)
        1    0.000    0.000    0.000    0.000 /home/jmjones/svn/ediplex/src/ediplex/parser/util/state_machine.py:19(set_start)
        1    0.000    0.000    0.000    0.000 /home/jmjones/svn/ediplex/src/ediplex/parser/aggregate_parser.py:40(eof)
        0    0.000             0.000          profile:0(profiler)


Out[3]: <pstats.Stats instance at 0xb798f52c>

The stats.load() call returns a pstats.Stats object. The only methods that I could find documented in the standard library for this object were print_stats(), sort_stats(), and strip_dirs(). Here, I have used only sort_stats(), which sorts the statistics based on the column specified in the argument list, and print_stats(), which prints out the statistics table displayed above. The strip_dirs() method would have displayed only the module rather than the absolute path to the module. Call me nutty, but I like seeing the absolute path to my modules. With a 17-inch display on my laptop, I don't worry about the extra output width from absolute filenames.

Pages: 1, 2, 3

Next Pagearrow





Sponsored by: