Modern C++ - The preprocessor
4. Preprocessor
Preprocessor directives
Preprocessing is the first steps of the C++ translation process.
- Trigraph replacement (trigraph characters are replaced).
- Line splicing (lines ending with escaped newlines are spliced).
- Tokenization (and replacing comments with whitespace).
- Macro expansion and directive handling.
Digraphs and trigraphs
The required character set of the programming language and the usual input devices (e.g. keyboard) may differ.
Not only in C++, C and Pascal also supports digraphs:
(. --> [
.) --> ]
(* --> {
*) --> }
C++ language trigraphs:
??= --> #
??/ --> \
??' --> ^
??( --> [
??) --> ]
??! --> |
??< --> {
??> --> }
??- --> ~
1 printf("How are you??-I am fine") --> printf("How are you~I am fine")
2 printf("How are you???-I am fine") --> printf("How are you?~I am fine")
3
4 // Will the next line be executed??????/
5 a++; --> // Will the next line be executed???? a++;
6
7 /??/
8 * A comment *??/
9 / --> /* A comment */
Digraphs handled in tokenization (step 4.)
<: --> [
:> --> ]
<% --> {
%> --> }
%: --> #
Include directive handling
#include <filename>
Search filename in the standard compiler include path
#include "filename"
Search filename in the current source directory
$ g++ -I/usr/local/include/add/path1 -I/usr/local/include/add/path2 ...
Extend the include path from command line
The preprocessor replaces the line with the text of the filename.
1 #include <iostream>
2 int main(void)
3 {
4 std::cout << "Hello, world!" << std::endl;
5 return 0;
6 }
Macro definition
There are object-like and function-like macros.
Object-like macro:
#define <identifier> <token-list>
Examples:
1 #define BUFSIZE 1024
2 #define PI 3.14159
3 #define USAGE_MSG "Usage: command -flags args..."
4 #define LONG_MACRO struct MyType \
5 { \
6 int data; \
7 };
Usage:
1 char buffer[BUFSIZE];
2 fgets(buffer, BUFSIZE, stdin);
Function-like macro
#define <identifier>(<param-list>) <token-list>
Examples:
1 #define FAHR2CELS(x) ((5./9.)*(x-32))
2 #define MAX(a,b) ((a) > (b) ? (a) : (b))
Usage:
1 c = FAHR2CELS(f);
2 x = MAX(x,-x);
3 x = MAX(++y,z);
Undefine macros
1 #undef BUFSIZE
Conditional compilation
In some cases (e.g. configuration parameters, platforms, etc.) we need to separate different compilation cases. We can do this with conditional compilation.
In the condition expression, all non-zero values mean true, zero value means false.
Sample: Configuration on debug level:
1 #if DEBUG_LEVEL > 2
2 fprint("program was here %s %d\n", __FILE__, __LINE__);
3 #endif
Sample: configuration on OS platform:
1 #ifdef __unix__ /* __unix__ is usually defined by compilers for Unix */
2 # include <unistd.h>
3 #elif defined _WIN32 /* _Win32 is usually defined for 32/64 bit Windows */
4 # include <windows.h>
5 #endif
We can use complex (but preprocessor-time computable) expressions for the condition.
1 #if !(defined( __unix__ ) || defined (_WIN32) )
2 // ...
3 #else
4 // ...
5 #endif
Error
In some cases we want to force stop the compilation.
1 #if RUBY_VERSION == 190
2 # error 1.9.0 not supported
3 #endif
Line
Line directive is used mostly for debug purposes, it sets the line number for the compiler (and for the LINE macro).
#line 567
Header guards
Heared guard is the most frequently used preprocessor pattern in C++. Its purpose is to avoid multiple inclusion:
1 #ifndef MY_HEADER_H
2 #define MY_HEADER_H
3 /*
4 * content of header file
5 *
6 */
7 #endif /* MY_HEADER_H */
Since C++11 we have an alternative solution:
1 #pragma ones
2 /*
3 * content of header file
4 *
5 */
Standard defined macros
1 __FILE__
2 __LINE__
3 __DATE__
4 __TIME__
5
6 __STDC__
7 __STDC_VERSION__
8 __cplusplus
To check whether we have a standard C++ compiler which runs the preprocessor. Mostly used for library headers which can be included into either C or C++ code.
1 #ifdef __cplusplus
2 extern C {
3 #endif
4 // ...
5 #ifdef __cplusplus
6 }
7 #endif
Pragma
Pragmas have compiler-defined effect.
1 #pragma warning(disable:4786)
Token stringification
For very tricky macro-s we need to create a string from an originally non-string value (like a token or a substritution result of an earlier macro).
1 #define str(s) #s
2 #define BUFSIZE 1024
3 // ...
4 str(\n) --> "\n"
5 str(BUFSIZE) --> 1024
Token concatenation
When two string literal ar separated with the ## characters, the separator whitespaces are discarded and the string literals are concatenated.
This is used both for creating long string literals and for hacking template-like macros.
1 struct my_int_20_array
2 {
3 int v[20];
4 };
5 struct my_int_30_array
6 {
7 int v[30];
8 };
9 struct my_double_40_array
10 {
11 double v[40];
12 };
13
14 #define DECLARE_ARRAY(NAME, TYPE, SIZE) \
15 typedef struct TYPE##_##SIZE##_array \
16 { \
17 TYPE v[SIZE]; \
18 \
19 } NAME##_t;
20
21 DECLARE_ARRAY(yours,float,10);
22 yours_t x, y;