Previous Chapter Next Section Table of Contents Glossary Index

Chapter 12. Profiling

12.1. Using the Linux oprofile system-level profiler

oprofile is a system-level profiler that's available for most modern Linux distributions.

Use of oprofile and its companion programs isn't really documented here; what is described is a way of generating symbolic information that enables profiling summaries generated by the opreport program to identify lisp functions meaningfully.

12.1.1. Generating a lisp image for use with oprofile

Modern Linux uses the 'ELF" (Executable and Linking Format) object file format; the oprofile tools can associate symbolic names with addresses in a memory-mapped file if that file appears to be an ELF object file and if it contains ELF symbol information that describes those memory regions. So, the general idea is to make a lisp heap image that looks enough like an ELF shared library to fool the oprofile tools (we don't actually load heap images via ELF dynamic linking technology, but we can make it look like we did.)

12.1.2. Prerequisites

  • oprofile itself, which is almost certainly available via your distribution's package management system if not already preinstalled.

  • libelf, which provides utilities for reading and writing ELF files (and is likewise likely preinstalled or readily installable.) Somewhat confusingly, there are two libelf implementations in widespread use on Linux, and different distributions refer to them by different names (they may be available as part of an 'elfutils' package.) The oprofile insterface was designed to work with a libelf implementation whose version number is currently around 147; the other (incompatible) libelf implementation has a version number around 0.8. It may be necessary to install the corresponding development package (-dev or -devel, usuallly) in order to actually be able to use the libelf shared library.

12.1.3. Generating ELF symbols for Lisp functions

In order to create a lisp heap image which can be used for oprofile- based profiling, we need to:

  1. load any code that we want to profile

  2. generate a file that contains ELF symbol information describing the names and addresses of all lisp functions.

    This step involves doing (from within Clozure CL)

    ? (require "ELF")
    "ELF"
    ("ELF")
    
    ? (ccl::write-elf-symbols-to-file "home:elf-symbols")
    	    

    The argument to CCL::WRITE-ELF-SYMBOLS-TO-FILE can be any writable pathname. The function will do whatever's necessary to nail lisp functions down in memory (so that they aren't moved by GC), then write an ELF object file to the indicated pathname. This typically takes a few seconds.

  3. Generate a lisp heap image in which the ELF symbols generated in the previous step are prepended.

    The function CCL:SAVE-APPLICATION provides a :PREPEND-KERNEL argument, which is ordinarily used to save a standalone application in which the kernel and heap image occupy a single file. :PREPEND-KERNEL doesn't really care what it's prepending to the image, and we can just as easily ask it to prepend the ELF symbol file generated in the previous step.

    ? (save-application "somewhere/image-for-profiling"
        :prepend-kernel "home:elf-symbols")
    	    

    If you then run

    shell> ccl64 somewhare/image-for-profiling
    	    

    any lisp code sampled by oprofile in that image will be identified "symbolically" by opreport.

12.1.4. Example

;;; Define some lisp functions that we want to profile and save
;;; a profiling-enabled image.  In this case, we just want to 
;;; define the FACTORIAL funcion, to keep things simple.
? (defun fact (n) (if (zerop n) 1 (* n (fact (1- n)))))
FACT
? (require "ELF")
"ELF"
("ELF")
? (ccl::write-elf-symbols-to-file "home:elf-symbols")
"home:elf-symbols"
? (save-application "home:profiled-ccl" :prepend-kernel "home:elf-symbols")

;;; Setup oprofile with (mostly) default arguments.  This example was
;;; run on a Fedora 8 system where an uncompressed 'vmlinux' kernel
;;; image isn't readily available.

;;; Note that use of 'opcontrol' generally requires root access, e.g.,
;;; 'sudo' or equivalent:

[~] gb@rinpoche> sudo opcontrol --no-vmlinux --setup

;;; Start the profiler

[~] gb@rinpoche> sudo opcontrol --start
Using 2.6+ OProfile kernel interface.
Using log file /var/lib/oprofile/samples/oprofiled.log
Daemon started.
Profiler running.

;;; Start CCL with the "profiled-ccl" image created above.
;;; Invoke "(FACT 10000)"

[~] gb@rinpoche> ccl64 profiled-ccl 
Welcome to Clozure Common Lisp Version 1.2-r9198M-trunk  (LinuxX8664)!
? (null (fact 10000))
NIL
? (quit)

;;; We could stop the profiler (opcontrol --stop) here; instead,
;;; we simply flush profiling data to disk, where 'opreport' can
;;; find it.

[~] gb@rinpoche> sudo opcontrol --dump

;;; Ask opreport to show us where we were spending time in the
;;; 'profiled-ccl' image.

[~] gb@rinpoche> opreport -l profiled-ccl | head
CPU: Core 2, speed 1596 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000
samples  %        symbol name
6417     65.2466  <Compiled-function.(:INTERNAL.MULTIPLY-UNSIGNED-BIGNUM-AND-1-DIGIT-FIXNUM.MULTIPLY-BIGNUM-AND-FIXNUM).(Non-Global)..0x30004002453F>
3211     32.6487  <Compiled-function.%MULTIPLY-AND-ADD4.0x300040000AAF>
17        0.1729  <Compiled-function.%%ONE-ARG-DCODE.0x3000401740AF>
11        0.1118  <Compiled-function.%UNLOCK-RECURSIVE-LOCK-OBJECT.0x30004007F7DF>
10        0.1017  <Compiled-function.AUTO-FLUSH-INTERACTIVE-STREAMS.0x3000404ED6AF>
7         0.0712  <Compiled-function.%NANOSLEEP.0x30004040385F>
7         0.0712  <Compiled-function.%ZERO-TRAILING-SIGN-DIGITS.0x300040030F3F>
	

12.1.5. Issues

CCL::WRITE-ELF-SYMBOLS-TO-FILE currently only works on x86-64; it certainly -could- be made to work on ppc32/ppc64 as well.

So far, no one has been able to make oprofile/opreport options that're supposed to generate call-stack info generate meaningful call-stack info.

As of a few months ago, there was an attempt to provide symbol info for oprofile/opreport "on the fly", e.g., for use in JIT compilation or other incremental compilation scenarios. That's obviously more nearly The Right Thing, but it might be awhile before that experimental code makes it into widespread use.


Previous Chapter Next Section Table of Contents Glossary Index