Black Box with a View, Part 2
Pages: 1, 2, 3, 4, 5, 6

Eliminating the Function Call

Example 1 used macros rather than functions to implement both LED_Init and LED_Toggle. Eliminating function calls can significantly improve the performance of embedded software.

After the compiler processes a program, each function call in the source code will require the execution of at least two machine language instructions in the object code. One is the subroutine call, and the other the subroutine return. On some microcontrollers, these instructions can be particularly expensive, taking multiple clock cycles to complete.

The body of a simple function, however, may be just one or two instructions long. Moreover, these instructions likely require fewer clock cycles than the subroutine call and return. In such cases, eliminating the function call can more than double the performance, with little or no penalty in terms of increased program size.

C compilers have good support for using macros to eliminate function calls. The macro, however, is a potentially dangerous construct. This is clear immediately from the output of the program in Example 4.

#include <stdio.h>

/* Macro to multiply a number by two.  Implemented by adding the
   number to itself. */
#define mac_times_two(a) a+a

/*
  This is a much better way to write the above macro.

#define mac_times_two(a) ((a)+(a))

  Even better, because "a" is no longer evaluated twice.

#define mac_times_two(a) ((a)*2)
*/


/* Function to multiply a number by two.  Implemented by adding the
   number to itself. */
int func_times_two(int a) {
  return a+a;
}

int main(void) {

  /*Set up two identical counter variables. */
  int count_a = 0;
  int count_b = 0;

  /* The output of the next two statements is very surprising. The
     macro and function should yield the same result, but they do
     not. */
  printf("%d %d\n",mac_times_two(2)/2, func_times_two(2)/2);

  /* You should NEVER call a macro with an argument such as "++i" --
     it may be evaluated twice, giving you an unexpected error. */
  printf("%d %d\n",mac_times_two(++count_a),func_times_two(++count_b));

  return 0;
}

Example 4. The dangerous macro

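If you want to try the example yourself, compile it with your favorite ANSI C89 or later compiler, and run it from the command line. The output is surprising (the first number on the second line may differ between compilers, because the expanded macro modifies count_a twice in one expression, which C leaves undefined):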

3 2
4 2

In a modern twist, much safer inline functions can replace the function-style macros of Example 1. The inline keyword in a function declaration asks the compiler to expand the function's entire body at the point of call.

This feature is part of the 1999 ANSI C (C99) standard. Even compilers based on the earlier 1989 (C89) standard, however, often support inline functions. For example, the GNU C compiler defines the __inline__ keyword, which it can recognize even when adhering to the C89 standard.

Example 5 shows a rewritten version of the LED driver code from Example 1. The new version replaces macros with inline functions, using the syntax from the C99 standard.

/* File "LED_driver.h" -- the ANSI C99 version.

   This file is very similar to the LED driver covered earlier, but it
   uses "inline" functions instead of macros.  The "inline" keyword is
   part of the 1999 ANSI C standard. */

#ifndef LED_DRIVER_H
#define LED_DRIVER_H

struct _LED {
  volatile unsigned char* reg;
  unsigned short bit;
};

typedef struct _LED LED_Ref[1];

/* Note that "ledp" is a constant pointer -- you cannot make it point
   to another memory location.  Declaring "ledp" constant makes the
   code safer.  You can still use "ledp" to change the data that it
   points to -- only the pointer itself is constant, NOT the data. */
inline void LED_Init(struct _LED* const ledp,
                     volatile unsigned char* reg,
                     unsigned short bit) {
  ledp->reg=reg;
  ledp->bit=bit;
}

inline void LED_Toggle(struct _LED* const ledp) {
  *ledp->reg ^= ledp->bit;
}

#endif /* LED_DRIVER_H */

Example 5. Replacing macros with inline functions

Unfortunately, the compiler may ignore the inline keyword. Consequently, you must examine the compiled object code to make sure that the inlining actually took place. First, compile the Hello World program, using the new driver from Example 5. You may want to rename the files so that you can experiment with both versions of Hello World on your system.

$ msp430-gcc -g -std=gnu99 -pedantic -W -Wall -Os -mmcu=msp430x149 \
    -o layered.elf hello-world-layered.c

Note the use of gnu99 in place of gnu89. The -g option is another important change--it causes the compiler to include debugging information, such as symbols and line numbers from the source code, in the resulting object file. This makes the object code much easier to examine.

Now, obtain a "dump" of the object code:

$ msp430-objdump -sS layered.elf

You may want to redirect the output to a file, because it is very long. The object code discussed here appears at the very end of the dump.

You should see lines from your source code interspersed with disassembled machine instructions that the compiler generated to implement those lines. Example 6 shows a small portion of the disassembly on the author's system. You can see that the compiler has, indeed, expanded LED_Init inline. (Of course, the results on your system may be different.)

int main(void) {
    1140:       31 40 fa 09     mov     #2554,  r1      ;#0x09fa

  LED_Ref led; /* Create the LED object. */

  P2DIR=0xFF;  /* Configure port 2; all pins are outputs in this case. */
    1144:       f2 43 2a 00     mov.b   #-1,    &0x002a ;r3 As==11
   code safer.  You can still use "ledp" to change the data that it
   points to -- only the pointer itself is constant, NOT the data. */
inline void LED_Init(struct _LED* const ledp,
                     volatile unsigned char* reg,
                     unsigned short bit) {
    1148:       0e 41           mov     r1,     r14     ;
    114a:       2e 53           incd    r14             ;

    114c:       b1 40 29 00     mov     #41,    2(r1)   ;#0x0029
    1150:       02 00 
    1152:       ae 43 02 00     mov     #2,     2(r14)  ;r3 As==10

  LED_Init(led,&P2OUT,2);  /* Initialize the LED object. */

  for(;;) {
    /* The following two lines implement a very crude delay loop.
       The actual length of the delay can vary significantly.
       This approach may not work with all compilers. */

Example 6. Portion of an object dump, showing function inlining

The MSP430x1xx Family User's Guide (PDF) contains detailed information on the MSP430 instruction set (see Chapter 3). Unfortunately, this article cannot cover assembly language in detail--it is a complicated topic that requires an entire book.

The object dump also shows that the non-inline versions of LED_Init and LED_Toggle are present in the object code. In most cases, it is possible to eliminate these non-inline expansions to save space. See the discussion of inline functions in the GCC manual for details. Note that these links are to the GCC 3.2.3 documentation, because MSPGCC is based on GCC 3.2.3 at the time of this writing.

To conclude this section, here is a list of guidelines for replacing an ordinary function with a macro or an inline function:

  • If the function is short and simple, consider using a macro or an inline function.
  • Do not call functions from an ISR--use a macro instead. (The next section explains what ISRs are.) If you use an inline function, do a code dump after every compile, to make sure that the compiler does, in fact, expand the function inline.
  • Leave rarely called functions as normal functions.
  • Leave complex functions as normal functions--often, they cannot be expressed as macros at all. You can still use inline functions in such cases, but this may increase the code size unnecessarily.

Of course, according to these guidelines, LED_Init should be a normal function. This article leaves it as is to show the relevant coding techniques. In particular, the Example 1 version of LED_Init is a good illustration of a fairly complex macro.
