Monday, December 9, 2013

Force inline functions in C++ (GCC)


Inline functions

In C++, a call to an inline function can be replaced by the complete body of the function, similar to a macro. This step, performed by the compiler, is called inline expansion. The major advantage of an inlined function is that there is no function call overhead, so small, frequently called functions can run faster.

The compiler considers every function whose definition is in a header for inline expansion. For member functions that are not defined in header files, we can ask the compiler to consider them for inlining by using the inline keyword in the function declaration:

inline int calculate(int x, int y);
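As a minimal sketch of the cases above, the hypothetical header below defines one member function inside the class body and another outside it with the inline keyword. The Accumulator class, its add and total members, the accumulate_demo helper, and the body of calculate are assumptions made for illustration; only the calculate signature comes from the declaration above.

```cpp
// accumulate.h (hypothetical header, for illustration only)
#include <cassert>

// Free function from the declaration above; the body is an assumption.
// 'inline' also allows the definition to appear in several translation units.
inline int calculate(int x, int y) {
    return x * y + 1;
}

class Accumulator {
public:
    // Defined inside the class body: implicitly inline.
    void add(int value) { total_ += value; }

    // Declared here, defined below with the inline keyword.
    int total() const;

private:
    int total_ = 0;
};

// Defined outside the class body, so inline is requested explicitly.
inline int Accumulator::total() const {
    return total_;
}

// Small self-check showing how the pieces are called.
inline int accumulate_demo() {
    Accumulator acc;
    acc.add(calculate(3, 4));  // 3 * 4 + 1 = 13
    acc.add(2);
    assert(acc.total() == 15);
    return acc.total();
}
```

Whether any of these calls is actually expanded in place is still the compiler's decision; the keyword only makes the functions candidates.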

The methods above only request that the compiler consider a function for inline expansion; the compiler can choose whether to inline it or not. So when do we need to force the compiler to always inline? The compiler decides heuristically which functions are worth inlining, and the size of a function plays a major role. If a function is large enough that the compiler might decide not to inline it, yet it is invoked frequently, it is a good candidate for forced inlining.

Force Inline

In GCC, you can use the always_inline attribute, written __attribute__((always_inline)), to force a function to be inlined:

inline __attribute__((always_inline)) int calculate(int x, int y);
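A minimal sketch of how the attribute might be applied to a full definition follows. The FORCE_INLINE macro name, the body of calculate, and the sum_products caller are assumptions for illustration. The sketch pairs the attribute with the inline keyword, since GCC documents always_inline in terms of functions declared inline and may warn when the keyword is missing.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical convenience macro; falls back to plain inline elsewhere.
#if defined(__GNUC__)
#define FORCE_INLINE inline __attribute__((always_inline))
#else
#define FORCE_INLINE inline
#endif

// Body is an assumption; only the signature comes from the text above.
FORCE_INLINE int calculate(int x, int y) {
    return x * y + 1;
}

// Hypothetical hot loop: each call to calculate is a candidate for
// forced expansion in place rather than an actual call.
inline long sum_products(const int* values, std::size_t count, int factor) {
    long sum = 0;
    for (std::size_t i = 0; i < count; ++i) {
        sum += calculate(values[i], factor);
    }
    return sum;
}

// Small self-check showing how the pieces are called.
inline long sum_products_demo() {
    const int values[] = {1, 2, 3};
    long result = sum_products(values, 3, 10);  // 11 + 21 + 31 = 63
    assert(result == 63);
    return result;
}
```

Keeping the attribute behind a macro makes it easy to remove the forcing later if measurements show it does not help.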

GCC performance considerations for inlining

Forcing inlining is not always a good idea.
The compiler usually makes good choices, and GCC provides a set of optimization options related to inlining:
  • -fno-inline
      • Do not expand any function inline except the ones that are forced inline.
        This is the default when no optimizations are performed.
  • -finline-small-functions
      • Integrate functions into their callers when their body is smaller than the expected function call code (so the overall size of the program gets smaller).
        The compiler heuristically decides which functions are simple enough to be worth integrating in this way.
        This inlining applies to all functions, even those not declared inline.
        Enabled at level -O2.
  • -findirect-inlining
      • Also inline indirect calls that are discovered to be known at compile time thanks to previous inlining.
        This option has an effect only when inlining itself is turned on by the -finline-functions or -finline-small-functions options.
        Enabled at level -O2.
  • -finline-functions
      • Consider all functions for inlining, even if they are not declared inline.
        The compiler heuristically decides which functions are worth integrating in this way.
        If all calls to a given function are integrated, and the function is declared static, then the function is normally not output as assembler code in its own right.
        Enabled at level -O3.
  • -finline-functions-called-once
      • Consider all static functions called once for inlining into their caller, even if they are not marked inline.
        If a call to a given function is integrated, then the function is not output as assembler code in its own right.
        Enabled at levels -O1, -O2, -O3 and -Os.
  • -fearly-inlining
      • Inline functions marked with always_inline (force inline) and functions whose body seems smaller than the function call overhead early, before doing -fprofile-generate instrumentation and the real inlining pass.
        Doing so makes profiling significantly cheaper and usually makes inlining faster on programs that have large chains of nested wrapper functions.
        Enabled by default.
  • -finline-limit=n
      • By default, GCC limits the size of functions that can be inlined.
        This flag allows coarse control of this limit. n is the size of functions that can be inlined, in number of pseudo instructions.
        A pseudo instruction represents, in this particular context, an abstract measurement of a function's size.
        In no way does it represent a count of assembly instructions, and as such its exact meaning might change from one release to another.
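
One way to see the effect of these options is to compile the same file with different flags and compare the generated assembly. The sketch below rests on assumptions: the file name inline_demo.cpp and the functions scale, run and run_demo are hypothetical, the limit value 10 is arbitrary, and the exact output depends on the GCC version and target.

```cpp
// inline_demo.cpp (hypothetical file name)
//
// Possible experiments, using only options discussed above:
//   g++ -O2 -S inline_demo.cpp -o default.s
//   g++ -O2 -fno-inline -S inline_demo.cpp -o no_inline.s
//   g++ -O2 -finline-limit=10 -S inline_demo.cpp -o small_limit.s
// Comparing the .s files for a call to scale() shows what was inlined.
#include <cassert>

// Static and called from one place: a candidate for
// -finline-functions-called-once and -finline-small-functions.
static int scale(int value) {
    return value * 3 + 1;
}

// Not static, so the function is emitted even without a caller in this file.
int run(int n) {
    int total = 0;
    for (int i = 0; i < n; ++i) {
        total += scale(i);
    }
    return total;
}

// Small self-check showing the expected result of the loop.
int run_demo() {
    int result = run(4);  // 1 + 4 + 7 + 10 = 22
    assert(result == 22);
    return result;
}
```

The result of run is the same under every flag combination; only the generated code, and possibly its speed and size, should differ.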
