One of the things that's been bugging me about CeePlusPlus lately is inlining. Now, there's three ways to inline functions in C++.
So if, for some reason, you've decided that inlining is actually necessary for performance goals, how the heck do you make it AsFastAsCee? Do it in C, and throw away supposed advantages? Use the inline keyword, make sure your compiler currently respects it, and hope no one changes flags later (-O2 to -Os, in gcc, for example)? Or just use always_inline, which the compiler will respect and at least will cause a warning if someone changes the compiler such that it's not respected... but then you're no longer coding standard C++.
Thoughts?
-- AdamBerger
Newer compilers can also inline functions for us based on the complexity of the function and the type of optimization that is requested. For example, VC++.NET has a switch for inlining "any suitable" function. If whole program optimization is turned on as well, the compiler may choose to inline a function defined in a .cpp file that is part of a static library. This capability means that now we can define all of our functions in the .cpp file (reducing compile time dependencies). In addition, the compiler makes different decisions about what functions to inline if the optimizer is targeting fast code or small code.
First, probably review both the actual performance measurements, the characteristics of the compiler, and lastly the code structure. For the cases I have ever bothered to look at, "inline" did exactly what it advertises doing, it put the code inline. I can only imagine inline not doing this for a fairly large code segment. If the code segment is large, the overhead added by a stack push and pop is probably not that great. In that case, then perhaps the application is running too close to the limits of the machine and a major code restructure to flatten the class hierarchy is needed. [Apologies if these are things you've already looked at.] --WayneMack
Well, the compiler generally will inline stuff with the "inline" keyword, as long as it's not too complex (or unsuitable for inlining; i.e. a recursive function). I've not personally had problems getting stuff inlined that I wanted to be inlined. Do you have a particular example in mind?
I'll second this. Compilers will generally refuse to inline recursive functions, which isn't too surprising. Use inline if you know what you are doing. If performance is that critical, you better make sure that no one tampers with optimization settings anyway.
Some more details. First off, part of the point is that the problem is with the spec... but it's a real world problem too. In particular, with gcc, the -On options (-O, -O1, -O2, -O3) all enable inlining, but -Os does not. This means that flipping a single switch (and a switch that's often flipped!) will make gcc ignore or not ignore inlining. Regarding declaring a method inside a class, my understanding is that this is semantically identical to declaring it inline, and is also just a suggestion. GCC definitely treats it that way. --AdamBerger
Would you care to tell us a source for that information? My gcc manual (3.3.5) says that inlining is turned on as soon as you specify -O, at -O3 additionally heuristic inlining of small functions is enabled.
Also there is no problem with the spec. The standard is about correctness, compiler verdors care for performance.
It might be useful to point out that -Os (optimize for space) is often incompatible with inlining. If you have a function that's called only once or twice, often times inlining is more space efficient than not; you can eliminate creating and tearing down the ActivationRecords, and an inlined function is subject to intra-procedural optimization.
However, for functions which are called in lots of places (and for almost any function declared "extern" or has it's address taken, as the compiler is in general required to keep a non-inlined copy around), not inlining is usually more space-efficient than inlining. For some very short functions (trivial constructors and the like), it may be the case that the function body when inlined is smaller than the code necessary for a function call--in that case, inlining may be the best option with -Os.
Discussion about constructors etc. moved back to original page (CeePlusPlusSlowerThanCee), since it was unrelated to inlining.