Cee Preprocessor Statements

Most of the C and C++ preprocessor statements I have seen smell. -- JasperPaulsen

Here are some of the accepted ways to use CeePreprocessorStatements:

  1. IncludeGuards. IncludeGuards smell, because they exist solely to protect duplicate code (in header files) from being triplicated. This is a LanguageSmell, not a CodeSmell. That is, if you use C++, you are better off using include guards than not using them, because the language design forces you to.

  2. RedundantIncludeGuards. Enough said.

  3. In the implementation of assert macros, to prevent a performance hit at run-time. This is an excellent practice, except that it means that any otherwise useful code performed by the assert statement becomes an undesirable side-effect.

  4. Macros. In-line functions work just as well. Functions don't bomb the program if they are not wrapped in enough parentheses. Functions can be stepped into and debugged; most debuggers cannot step into macros. Most importantly, function calls always evaluate their parameters only once. (Try using 'x++' in a 'min(,)' macro. Surprise!)

  5. To create enumerated lists (e.g. of Windows dialog controls), without creating an enum. (Yuck!)

  6. To comment out blocks of code from the compiler. (We can't use actual comments because in C++ comments don't nest. We can't use "if(false)" because the commented-out code may not compile, and because the removed block may be bigger than a single function.)

  7. Stringify code. For example, stringifying a failed assert expression.

  8. Conditional compilation. Used in a variety of ways. For example:
    1. Enable Multiple Character Types -- compile differently based on a static character type of Single-Byte, Multi-Byte, or Double-Byte.
    2. Enable Multiple Platform Builds -- compile differently based on platform. For example, typedef a Unicode character type as uint16 #ifdef _WIN32 but as uint32 #ifdef _UNIX.
    3. Enable Multiple Build Grades -- change the verbosity or rigor of debug code based on the BUILD_GRADE. For example, a build may have the following grades: DEBUG, INTEG1, INTEG2, and RELEASE.

  9. Defining booleans in C. (Why not use "enum bool { false, true };" instead?) [Because true in C is any non-zero value it makes more sense to #define FALSE (0!=0) / #define TRUE !FALSE]

  10. Getting the file name and line into a debug statement (__FILE__ and __LINE__). This is especially useful when the target system's debugger is incompetent. However, using this creates some interesting UnitTesting problems (using (__LINE__ + 2) for the expected error statement, for instance).

Here are some less accepted, but still very common, uses of CeePreprocessorStatements:

  1. To make it easier to use the debug version of the program than to use the release version of the program. For example, to create a backdoor around the password protection system. If the developers think that the system is too much of a hassle to use regularly, shouldn't they make it easier to use for everybody? Plus, if the developers don't use a feature, it is much less likely to be properly tested. -- generally this is done in order to get to the piece of functionality that needs testing without wading through all of the startup steps on the way there. However, if you're unit testing properly, you shouldn't need to do this.

  2. Many kinds of configuration management where a variation in the program can and should be fixed at compile time.



Discussion:

I implemented rudimentary reflection in C++ using macros. The preprocessor is my friend. I especially enjoy the # and ## operators. On the other hand, having read the Apache source code and the MFC headers, the #ifdefs are brutal. Trade offs, trade offs. Might as well outlaw gotos while you're at it. -- SunirShah

Actually, I'm with Stroustrup on this. The C preprocessor is nasty. There is very little to enjoy about the # and ## operators. In general, the C preprocessor does little but expose the deficiencies in the CeePlusPlus language. -- RobertDiFalco

Well, it's certainly not the best thing ever. Scheme and Lisp have better macro support. Imports are better than includes. But it's workable and powerful and certainly usable to get things done during the day. The same apologetics are used for Java, I know, but at least most of C++ was thought out. (And Java is in desperate need of a macro system, too.) Besides, the preprocessor is just an inheritance from C. Two competing forces: elegance vs. backwards compatibility. -- SunirShah

That would just be horrible if Java added a preprocessor! Ugh! How is Java in desperate need of one? For stringifying assert expressions? Big deal! The problem with the C preprocessor is that most people over-use it. They end up using it to create mini-languages or object-models that are difficult to maintain and usually over-complex. The C preprocessor should be something one uses when there is simply no other choice. -- RobertDiFalco

Often a problem with Java in constrained environments such as embedded devices is that the some of the class library or even some of the language (e.g. reflection) is not available on the platform. Sometimes, the best choice to minimize footprint is to conditionally compile. -- SunirShah

java.lang.Reflection is actually part of the libraries, not the language.

I'm not sure what you are saying, but you can conditionally exclude byte-code in Java using static finals. --RobertDiFalco

True. Though the compiler would still have to evaluate the code block even if it was not emitted. One problem might be making use of java.lang.* classes that don't exist on a impoverished platform. I have had to write code that didn't use all of java.lang.* because it wouldn't compile in certain environments. Unfortunately, since I don't have a cross-compiler handy, I cannot verify or demonstrate my experience. -- SunirShah


IncludeGuards smell, because they exist solely to protect duplicate code (in header files) from being triplicated. - Can you explain what is wrong with wanting to do this, or explain a better way of doing it? Move to the IncludeGuard page if you like. -- DaveHarris

The smell shows a design flaw in C and C++: Linking code together efficiently requires duplicate code (in header files). Given this design flaw, IncludeGuards are the best work-around I know of. -- JasperPaulsen

Exactly, and Stroustrup often talks about this flaw. He's often lamented his disappointment in the need for the CeePlusPlus precompiler. -- RobertDiFalco

Try doing your include guards like this:

 #ifndef ___MY_H
 #define ___MY_H
 #else
 #error Multiple includes of MY.H are not allowed
 #endif

Talk about enforcing discipline! -- EdwardKiser

This is not a good idea. See LargeScaleCppSoftwareDesign for a related discussion. --DirckBlaskey

I see include guards as a way of reducing coupling. If a .h file must see the definition in another, then there are 2 possibilities: either you require the user of the .h to include the dependencies, or you need to have the .h file include another. There are good arguments for each but, in general, I favour not forcing the user of a class to know its dependencies. This limits the scope of changes because it reducing the coupling. --DaveWhipp

All C and C++ standards committees have to do is agree on a proper module system for these languages. It is not impossible to do. Only their never-ending white-knuckled grip on holding on to the past prevents them from overcoming the "C as a better assembly language" mentality. It is not a crime to borrow ideas from NiklausWirth! Introducing a new module Foo {...} keyword isn't going to destroy the planet. Moreover, introducing a new import Bar.Baz as Bletch; construct will not affect the galactic alignment. We have been waiting for these features for decades!!!! Please, please, please, please, please, please, please, please, please move on! This is the 21st century for Christ's sake!


If someone attaches a smell to a general language feature (e.g. the C preprocessor) then we are in an extreme form of LanguagePissingMatch ("this whole language smells"). This should not be done. Languages have weaknesses, it's our job to avoid them. A smell is something that can be avoided because there is a better way to do it. There is no way in C to avoid the preprocessor, so the C preprocessor can't smell. Only a certain use of the preprocessor can smell, if there is a better alternative. -- HelmutLeitner

I see the C preprocessor feature as similar to a "goto" statement: it is a low-level language feature that can be used to fake high-level stuff like a module system. In general, one should use the high-level feature if available. If not available, one should stick to a highly stylized idiom that simulates the high-level feature. And complain loudly to the C standardisation committee. -- StephanHouben


The preprocessor was a very good thing for the "C" language. It is largely useless in the C++ language, and should generally be avoided.

Macros used as Inline Functions:
Use inline functions. Inline functions evaluate their parameters only once. Inline functions are easier to debug. Inline functions have a more rational syntax -- the definition does not have to "all be on one line." Inline functions are type safe and can be overloaded. Inline functions conform to name spaces and scoping rules. Use inline functions.

Macros used as Constants:
Use enum or 'static const'. These are type safe.

Function Pointers:
Use 'class' and 'virtual'. (OK, it's not a preprocessor feature; I just threw it in here.)

The only remaining acceptable uses of the preprocessor in C++ are...


Well, it's certainly not the best thing ever. Scheme and Lisp have better macro support.

You should not compare the C preprocessor macros with LispMacros. The former are simple textual substitutions [Note however that the C preprocessor operates on a sequence of tokens, not on a sequence of characters.] the latter are full fledged programs which can arbitrarily re-write the expressions they are working on. [I'm actually not sure if SchemeMacros allow this, but LispMacros do.]


Okay.. So the preprocessor is evil. I'd like to change these MIN & MAX macros to C++ inline functions. I'm not especially worried about MIN(a++,b++) etc, because everyone knows about the double increment of the lesser variable, but there are MIN(vec1.len(),vec2.len()) or other function calls which could be bad (perhaps optimized away though). So I'm thinking 'What better way is there to do it than to use a template function?'

I'm going from #define MAX(a,b) (((a)>(b))?(a):(b)) into template <class T> inline T MIN(T a, T b) { return a<b?a:b; } and rebuild the product. But there's problems ahoy! MIN(type1, type2) is not defined. For example double and float, or double and char, double and int, etc etc... Overloading is a hassle because there's so many types.

How can you overcome this problem with this incredibly simple macro?

Well, you should use std::min or std::max from <algorithm>, rather than rolling your own. Of course, you can't call these methods with arguments of different types, but you can explicitly select a specialization. Example stolen from Josuttis:

int I ;
long L ;

// std::max(I,L) // This is an ERROR std::max<long>(I,L) // This is allowed.

AndreiAlexandrescu wrote an article on max as well. http://www.cuj.com/experts/1904/alexandr.htm?topic=experts


Moved from TrivialDoWhileLoop:

If one writes a macro, the discussion above shows how to let other people safely use the macro in their code.

Macros should not be used unnecessarily. Some disadvantages of macros:


Macros can do some things functions can't do. Some people would rather not do those things. Among them:


Speed


Exception Safety

Consider

  X* f() { return new X; }
  Y* g() { return new Y; }

try { do_something(f(), g()); }

This code can be exception safe. However, if you replace f() and g() with equivalant macros, then it cannot. This is because the C++ standard explicitly does not require naked "new" operations to completely create one object before starting on the next. So, in the macro-based implementation, the compiler is free to allocate memory for both X and Y before calling either of the constructors. If the first of the constructors (we don't know which will be first) throws an exception, then the memory for the other might not be reclaimed. When the "new" operations are encapsulated in functions, this optimization is not permitted. (Which means that even inline functions might be slower than macros).


Debugging


I fully understand people's distaste for the C preprocessor, but nonetheless I can't fathom living without it, either. The above stuff all applies even when using C++, for that matter; they didn't come up with enough features to completely displace it.


Language idioms are things that all experienced programmers in that language learn.

[In my experience, this is not a common C language idiom.]

[My primary audience (after the compiler, of course) contains many junior programmers. Some are less bright than others, but they all deserve code that is as easy to understand as I can make it.]


IMHO the C preprocessor is the WORST macro processor in wide use.

Probably. Just for the record, for years it wasn't considered part of the C language per se, and was advocated, designed, and implemented by different people than the rest of C: Alan Snyder, Mike Lesk, and John Reiser. See Ritchie's history of C http://cm.bell-labs.com/cm/cs/who/dmr/chist.html

Incidentally, skipping past the design of the C preprocessor for a moment, Reiser's implementation of it is an amazing case study in how to make software run really, really, really fast. In one environment I used in the 1980s, it was faster than "cat file.c > /tmp/foo"! ("cat" uses getchar() and the stdio lib was inefficient)


The GNU coding standards recommend if {...} else {...} instead of #ifdef ... #else ... #endif http://www.gnu.org/prep/standards/standards.html#Conditional-Compilation

Which is perhaps fine for FSF posix-only "free" code, but doesn't work well when one of the branches uses undefined identifiers, which occurrs quite often when you're doing portability.


I use not only the C preprocessor, I program in the preprepreprocessor!

Here is one example, which is used to automatically unroll loops. Preprocessor commands can also be included in the unrolled loops.

 @m Repeat@T
 ``yg_Repeat
 @)
 @m _Repeat@T
 J@
 A-g_Repeat
 @)
 @m Repeat@W
 q/@tRepeat
 Bq``: zyqnB" times:\ @>
 qA"@#
 @)

Here is another example. \addkind is a TeX macro, defined earlier in the program, which sorts and prints a summary of the entries.
 @mkind@T
 ``k"@<Data of kinds@>=
 "#define kind (kind_data[_kindnum])
 "#undef _kindnum
 :#z:dz:ez:fz:iz:nz:ez: z:_z:kz:iz:nz:dz:nz:uz:mz: z
 :0z:xzu;qA"strcpy(kind.name+1,
 :"zqu;qBz");kind.character=
 :0z:xzqu;qA";kind.color=
 :0z:xzqu;qA";kind.menu=
 qu;qA";kind.selection_key=
 qu;q:'zBz";kind.cycle=
 qu;qB";
 qgkind_
 @)
 @mkind_@T
 xJ@
 z"_kind_flag;
 gkind_
 @)
 @mkind@W
 "\addkind%
 B";;%
 @)
 @mname@-
 ``YJ"@
 q/@d
 : zBqA
 @)


CeeRefactorStringsToFunctions

CategoryCee CategoryCpp CeeIdioms


EditText of this page (last edited August 27, 2010) or FindPage with title or text search