Defining Macros in C
In the previous lesson, we studied the role of the preprocessor in writing a program in the C language. We saw that the preprocessor is a program that runs before compilation and takes care of replacing preprocessing directives with their content.
So far, in this guide to the C language, we have used the #define preprocessing directive to create macros that define constants. These are called simple macros (the C standard calls them object-like macros) because they do not accept parameters.
In this lesson, we will study in detail the syntax and behavior of simple macros using the #define directive.
Later, in upcoming lessons, we will also study function-like macros called parametric macros.
Simple Macros
A macro gets its name from the contraction of macro-block or macro-instruction. In other words, it is a set of instructions, functions, or expressions that are grouped and labeled with a symbolic name. This symbolic name is then used wherever we want to reuse that block.
Conceptually, a macro may resemble a function. After all, the goal of a macro is to make it easier to reuse repeated code blocks. However, the mechanism of macros is very different from that of functions. While functions are called at run time, macros are replaced by the preprocessor before compilation.
Specifically, the macro mechanism is known as text substitution. In technical jargon, we say that the macro is expanded. Let’s look at this in more detail.
The syntax to define a simple macro is as follows:
#define macro_name macro_body
Where:
- macro_name is the symbolic name of the macro;
- macro_body is the text that will replace the name.
The important thing to note is that we used the term text to replace. In fact, the text or body of the macro can be any sequence of characters. This means that a macro can be used to replace anything, not just statements or expressions.
In this sense, a macro differs from a function: using a macro means replacing the symbolic name with its body. For this reason, macros have enormous potential and are widely used in the C language.
So far, we have used macros in our programs essentially to create constants. In their original manual, Brian Kernighan and Dennis Ritchie, the creator of the C language, recommend using macros to define what they call manifest constants, that is, constants that never change during program execution.
A typical use case is to define numerical constants like
#define PI 3.14159265358979323846
#define E 2.71828182845904523536
#define METERS_IN_A_MILE 1609.344
#define SECONDS_IN_AN_HOUR 3600
This way, whenever we need to use one of these constants in our program, we can use the macro’s symbolic name. For example:
float circle_area(float radius) {
    return PI * radius * radius;
}
When the preprocessor examines this piece of code, it replaces the symbolic macro name with its body. So the code above, once the macro is expanded, becomes:
float circle_area(float radius) {
    return 3.14159265358979323846 * radius * radius;
}
The term expansion comes from the fact that replacing the macro’s name with its body usually makes the original source code longer.
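As a minimal sketch of how such constants fit into real code, the following fragment defines two of the macros above and uses them in small functions. The helper miles_to_meters is a hypothetical name introduced only for this illustration, and double is used instead of float so that more of each constant's digits are preserved.

```c
/* Manifest constants: the preprocessor pastes these literals into the code. */
#define PI 3.14159265358979323846
#define METERS_IN_A_MILE 1609.344

/* After expansion the return statement reads:
   return 3.14159265358979323846 * radius * radius; */
double circle_area(double radius) {
    return PI * radius * radius;
}

/* Hypothetical helper, not part of the lesson: converts miles to meters. */
double miles_to_meters(double miles) {
    return miles * METERS_IN_A_MILE;
}
```

With compilers such as GCC or Clang, the -E option stops after preprocessing and prints the expanded source, which is a convenient way to see the substitution.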
But this is just one possible use of macros. In fact, since the body of the macro can be arbitrary text, we can use them for more complex substitutions. For example, going back to the function above, suppose we frequently need to use the area of a circle with radius 2 in our program. Rather than repeatedly calling the circle_area function with the argument 2.0, we can define a macro that directly replaces the function call. For example:
#define CIRCLE_AREA_RADIUS_2 circle_area(2.0)
This way, in our program, we can use the symbolic macro name CIRCLE_AREA_RADIUS_2. The preprocessor will replace the symbolic name with its body. For example:
printf("The area of a circle with radius 2 is: %f\n", CIRCLE_AREA_RADIUS_2);
becomes:
printf("The area of a circle with radius 2 is: %f\n", circle_area(2.0));
Note that the replacement happens at the text level. The preprocessor does not replace the macro with the result of the function call. So the circle_area function is still invoked every time we use the macro's symbolic name.
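One way to observe that the function really runs on every use of the macro is to count the calls. The sketch below assumes a hypothetical counter, call_count, and a hypothetical function, use_macro_twice; neither is part of the lesson's code.

```c
#define PI 3.14159265358979323846
#define CIRCLE_AREA_RADIUS_2 circle_area(2.0)

/* Hypothetical counter, added only to observe how often the function runs. */
static int call_count = 0;

double circle_area(double radius) {
    call_count++;
    return PI * radius * radius;
}

/* Uses the macro twice and reports how many calls were made. */
int use_macro_twice(void) {
    double first, second;

    call_count = 0;
    first = CIRCLE_AREA_RADIUS_2;   /* expands to circle_area(2.0) */
    second = CIRCLE_AREA_RADIUS_2;  /* expands to circle_area(2.0) again */

    (void)first;    /* silence unused-variable warnings */
    (void)second;
    return call_count;
}
```

Here use_macro_twice returns 2: each occurrence of the macro name becomes a separate function call, and nothing is cached.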
To recap:
Macros in C Language
In the C language, a macro is a symbolic name given to a block of text known as the macro body.
The C preprocessor replaces, or in technical terms expands, the symbolic name of the macro with its body whenever it encounters an occurrence in the source code.
The syntax to define a simple macro, i.e., a macro without parameters, is as follows:
#define macro_name macro_body
Notes on Macros
At this point, two important observations need to be made.
The first is that a macro does not end with a semicolon (;). As we’ve mentioned, a macro is not a statement. It is a block of text that will be substituted for a symbolic name. This is something to be cautious about, as it can lead to unexpected results. Let’s revisit the earlier example:
#define CIRCLE_AREA_RADIUS_2 circle_area(2.0);
We’ve added a semicolon at the end of the macro. In some cases this might not cause issues. For example, if we use the macro to assign a value to a variable:
float area = CIRCLE_AREA_RADIUS_2;
In that case, the preprocessor will transform the code like this:
float area = circle_area(2.0);;
Inside a function, the extra semicolon is not a problem because the compiler interprets it as an empty statement. However, if we use the macro inside a larger expression or statement, it leads to compilation errors. For example:
printf("The area of a circle with radius 2 is: %f\n", CIRCLE_AREA_RADIUS_2);
In this case, the preprocessor will transform the code like this:
/* ERROR */
printf("The area of a circle with radius 2 is: %f\n", circle_area(2.0););
Unfortunately, in this case, the compiler will report a syntax error. The semicolon (;) is not valid in this context.
Macro definitions do not end with a semicolon
A macro definition does not end with a semicolon (;).
A semicolon at the end of a macro definition becomes part of the macro body and is copied into every expansion, which can cause compilation errors. It is best avoided.
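For comparison, here is a sketch of the same macro defined without the trailing semicolon and used in both contexts discussed above. The two function names are hypothetical, and snprintf (available since C99) stands in for printf so that the result can be inspected.

```c
#include <stdio.h>

#define PI 3.14159265358979323846
#define CIRCLE_AREA_RADIUS_2 circle_area(2.0)   /* no trailing semicolon */

double circle_area(double radius) {
    return PI * radius * radius;
}

/* The macro used as an initializer. */
double area_as_initializer(void) {
    double area = CIRCLE_AREA_RADIUS_2;
    return area;
}

/* The macro used as a function argument: this line would not compile
   if the definition ended with a semicolon.
   Returns the number of characters in the formatted text. */
int area_as_argument(void) {
    char buffer[32];
    return snprintf(buffer, sizeof buffer, "%.2f", CIRCLE_AREA_RADIUS_2);
}
```

Because the semicolon belongs to the statement that uses the macro and not to the macro itself, the same definition works in both places.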
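Since a macro definition is terminated by the end of the line rather than by a semicolon, what should we do if we want to write a macro across multiple lines? The solution is to end a line with the \ character to indicate that the macro continues on the next line. For example:
#define PI_DIVIDED_BY_2 \
    3.14159265358979323846 / 2
The macro ends at the end of the first line that does not end with the \ character. So, in our example, the macro ends at the end of the second line. Each backslash, together with the line break that follows it, is removed before the macro is processed, so the body is treated as if it were written on a single line.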
Multiline Macros
Normally, in the C language, a macro ends at the end of the line on which it is defined. However, if we want to define a macro across multiple lines, we can end a line with the \ character to indicate that the macro continues on the next line.
The syntax is as follows:
#define macro_name macro_body \
macro_body \
macro_body \
final_macro_body
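A minimal sketch of a multiline definition in use follows. The macro SECONDS_IN_A_WEEK and the function weeks_to_seconds are hypothetical names chosen for this example, and the parentheses around the body are an extra precaution, not something the syntax requires.

```c
/* One macro definition spread over three physical lines.
   Each backslash must be the very last character on its line. */
#define SECONDS_IN_A_WEEK \
    (7 * 24 *             \
     60 * 60)

/* After expansion: return weeks * (7 * 24 * 60 * 60); */
long weeks_to_seconds(long weeks) {
    return weeks * SECONDS_IN_A_WEEK;
}
```

A space or tab after a backslash stops it from being the last character on the line, which is an easy mistake to make because trailing whitespace is invisible in most editors.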
The second observation is hidden in the first. We mentioned that the underlying mechanism of macros is text substitution. Therefore, the body of the macro is taken as-is and substituted for the symbolic name. This means that no syntactic check is performed on the macro body. This is a fundamental point and can lead to errors that are difficult to detect.
For example, from the preprocessor's point of view, a macro like this is perfectly legal:
#define STRANGE_MACRO !*%$
And in fact, the preprocessor will not report any error. Furthermore, as long as it’s not used in our code, even the compiler will not raise any errors. That’s because the preprocessor has not performed any substitution. Things change as soon as we try to use it. For example:
int main(void) {
    int a = STRANGE_MACRO;
    return 0;
}
The preprocessor will expand the macro and the code becomes:
int main(void) {
    int a = !*%$;
    return 0;
}
At this point, the compiler will report a syntax error because !*%$ is not a valid expression. The problem is that the error is reported by the compiler at the point where the macro is used, not where it was defined. This makes it harder to trace the error back to the macro definition itself.
In other words, since the preprocessor does not perform any syntactic check on the macro body, we can only detect syntax errors when the macro is actually used in code.
The lack of syntactic checks on the macro body may be disconcerting. However, as we will see, it is an aspect that can be exploited to create very powerful macros.
To summarize:
Text Substitution and Syntax Checking
Macros are replaced by the preprocessor with their body. This mechanism is known as text substitution.
The preprocessor performs no syntactic checking on the macro body. Therefore, any syntax errors will only be reported when the macro is used in the code.
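A practical consequence of pure text substitution is that the macro body is pasted in without any regard for operator precedence. The sketch below uses two hypothetical macros, SUM_UNSAFE and SUM_SAFE, to show how the same body can produce different results depending on the surrounding expression. Wrapping an expression body in parentheses is a common precaution.

```c
#define SUM_UNSAFE 2 + 3      /* body pasted as-is */
#define SUM_SAFE   (2 + 3)    /* parentheses keep the body together */

/* Expands to: return 2 + 3 * 4;  which evaluates to 14. */
int unsafe_times_four(void) {
    return SUM_UNSAFE * 4;
}

/* Expands to: return (2 + 3) * 4;  which evaluates to 20. */
int safe_times_four(void) {
    return SUM_SAFE * 4;
}
```

Both versions compile without any warning, which is what makes this kind of mistake hard to spot: the code is syntactically valid, just not what the author intended.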
Uses of Macros
Now that we are more familiar with macro definitions, we can move on to the core question: what are macros used for?
We’ve already seen that macros can be used to define constants, the so-called manifest constants. So let’s ask ourselves: what are the actual benefits of using them? Let’s take a closer look:
- They make programs more readable.
This is an undeniable advantage. For example, when using numeric constants, using a symbolic name instead of the actual value often makes the meaning of a statement much clearer. Compare the following two statements:
float area = 3.14159265358979323846 * radius * radius;
float area = PI * radius * radius;
The second statement is certainly more readable at first glance. So, using a constant is definitely better than using a magic number that seems to have come out of nowhere.
- They make programs easier to modify.
This benefit is a direct consequence of the previous one. Suppose we need to use a numeric constant in several parts of our program. For example, assume we need to use the same numeric coefficient in multiple calculations:
float x1 = 4.523 * y1;
/* .... */
float x2 = 4.523 * y2;
/* .... */
float x3 = 4.523 * y3;
We might later need to change that constant from 4.523 to something else. For example:
float x1 = 5.123 * y1;
/* .... */
float x2 = 5.123 * y2;
/* .... */
float x3 = 5.123 * y3;
The problem here is that we would need to manually modify multiple points in our program. This can be a problem in large programs and is error-prone: a simple oversight by the developer could lead to a mistake.
We can solve this problem by using a macro. For example:
#define COEFFICIENT 4.523
float x1 = COEFFICIENT * y1;
/* .... */
float x2 = COEFFICIENT * y2;
/* .... */
float x3 = COEFFICIENT * y3;
Now, if we need to change the value of the coefficient, we only need to change the macro in one place:
#define COEFFICIENT 5.123
float x1 = COEFFICIENT * y1;
/* .... */
float x2 = COEFFICIENT * y2;
/* .... */
float x3 = COEFFICIENT * y3;
Furthermore, by using macros in this way, we avoid inconsistencies. Suppose that in our program we need to perform calculations using π. The number π is a transcendental number with infinitely many decimal digits. For our calculations, we must limit it to a finite number of digits by truncating the number.
If we don’t use a macro, we would have to repeat the π value in multiple places, possibly introducing different truncation errors. For example:
float area = 3.14159 * radius * radius;
/* .... */
float circumference = 2 * 3.141592 * radius;
In the first case, we used only 5 decimal places, while in the second, we used 6. The result is that our calculations could become inconsistent. This doesn’t necessarily mean the results are incorrect, just that the numerical precision differs.
It’s much simpler to use a macro:
#define PI 3.141592
float area = PI * radius * radius;
/* .... */
float circumference = 2 * PI * radius;
This way, if we need to change the number of decimal digits, we only have to update the macro’s value.
However, macros can be used for more complex purposes. In this lesson, we’ll look at a few of them, leaving the more advanced uses for upcoming lessons.
Modifying the Syntax of the C Language
One of the most powerful uses of macros, and for beginners one of the most surprising, is the ability to play with the syntax of the C language. In other words, by leveraging the text substitution mechanism, it is possible to change how C code looks on the page. Of course, there are limits, but impressive results can be achieved. Because such changes can also make code harder for others to read, this technique should be used with caution.
The modifications we can apply to the language can be so simple that they are considered aesthetic, or so complex that they are considered semantic.
One of the simplest cases is replacing basic syntactic elements of the language. For example, we can use a macro to replace the { character with the keyword BEGIN and the } character with the keyword END, just like in the Pascal language:
#define BEGIN {
#define END }
This way, we can write our code like this:
int main(void) BEGIN
    int a = 0;
    int b = 1;
    int c = a + b;
END
The modifications can be more complex as well. For instance, as we saw in the lesson on the for loop, it’s possible to create an infinite loop like this:
for (;;) {
    /* ... */
}
We can use a macro to make this code more readable. For example:
#define LOOP for (;;)
#define BEGIN {
#define END }
Now we can write our code like this:
LOOP BEGIN
    /* ... */
END
And we can go even further. For instance, we can create a macro to exit the infinite loop:
#define EXIT_LOOP break
This allows us to write:
LOOP BEGIN
    /* ... */
    if (condition) {
        EXIT_LOOP;
    }
END
In short, it’s almost as if we are creating a new programming language. However, as mentioned earlier, this capability must be used carefully. Doing so can make our code unreadable to those unfamiliar with this kind of syntax. Additionally, it could lead to hard-to-find errors.
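Putting the four macros together, a complete function written in this style might look like the sketch below. The function count_until is a hypothetical example that counts up to a limit and then leaves the loop.

```c
#define LOOP      for (;;)
#define BEGIN     {
#define END       }
#define EXIT_LOOP break

/* Hypothetical example: counts from 0 up to limit using the custom syntax. */
int count_until(int limit)
BEGIN
    int i = 0;

    LOOP BEGIN
        if (i >= limit) BEGIN
            EXIT_LOOP;      /* expands to: break; */
        END
        i++;
    END

    return i;
END
```

After preprocessing, the compiler sees an ordinary function with braces, a for (;;) loop, and a break statement. Editors, formatters, and other tools that read the unexpanded source may not understand the new keywords, which is one more reason to use this technique sparingly.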
Opaque Types
Another use—often overused—of macros directly stems from the capability we just discussed. It involves renaming a data type. In other words, assigning a new name to an existing data type.
For example, as we saw in earlier lessons, the original C language does not have a dedicated boolean type, unlike many other programming languages (the C99 standard later added the _Bool type and the <stdbool.h> header). In C, the value 0 is considered false and any other value is considered true.
However, we can simulate the presence of a boolean type by using an alias for the int type. We can do this using a macro. For example:
#define BOOL int
#define TRUE 1
#define FALSE 0
Here we have used three macros. The first macro, BOOL, is an alias for the int type. The other two, TRUE and FALSE, are aliases for the values 1 and 0.
Now we can use these macros in our code like this:
BOOL condition = TRUE;
/* ... */
if (a < 5) {
    condition = FALSE;
}
/* ... */
if (condition) {
    /* ... */
}
It is not uncommon to find C programs written this way. For example, if we examine programs written for the Windows operating system, we may find code like this:
DWORD ThreadProc(LPVOID lpParameter) {
    /* ... */
}
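In this case, DWORD is an alias for the type unsigned long, and LPVOID is an alias for void *. In the Windows headers these particular aliases are created with typedef rather than #define, but the idea of giving an existing type a new name is the same. On Windows, this style of programming, based on type aliases, is strongly encouraged.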
In general, when a type name hides the underlying representation of the data from the programmer who uses it, we refer to it as an opaque type. These are often used by frameworks and libraries when the internal structure of a data type is to be hidden. This is not for secrecy, but for abstraction purposes.
For example, in C, the FILE type is used as an opaque type, as we will see in the lessons on input/output. The standard library hands our program a pointer to a FILE, and we work with it without ever needing to know what a FILE contains internally.
Opaque Type
In the C language, an opaque type is a data type whose actual structure is hidden from the end user—that is, the programmer using it. Opaque types are often used to achieve code abstraction.
A simple way to give a new name to an existing data type is to define a macro as an alias for it:
#define opaque_type_name existing_data_type
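A macro alias is still plain text substitution, which matters when the aliased type is a pointer. The sketch below contrasts a macro alias with typedef, a C keyword that is not covered in this lesson. The names INT_PTR_MACRO and int_ptr_alias are hypothetical and chosen only for this illustration.

```c
#define INT_PTR_MACRO int *     /* alias made by text substitution */
typedef int *int_ptr_alias;     /* alias known to the compiler as a type */

int macro_alias_demo(void) {
    int value = 7;
    /* Expands to: int * first, second;  so only first is a pointer. */
    INT_PTR_MACRO first, second;

    first = &value;
    second = *first + 1;    /* second is a plain int */
    return second;
}

int typedef_alias_demo(void) {
    int value = 7;
    int_ptr_alias first, second;    /* both are pointers to int */

    first = &value;
    second = first;
    return *second + 1;
}
```

Both functions return 8, but for different reasons: in the macro version second is an int, while in the typedef version it is a pointer. For simple types such as int the two approaches behave the same way, which is why the BOOL macro above works as expected.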
Macro Naming Style
We’ve seen that macros can be used for a wide range of purposes. However, as mentioned earlier, macros must be used with care. When misused, they can lead to unexpected behavior and hard-to-find bugs.
For this reason, even though it’s not mandatory, many programmers prefer to name macros in a way that clearly indicates that they are macros and not function or variable names. The most common and widespread approach is to write macro names entirely in uppercase.
In our examples, we have consistently followed this naming style:
#define PI 3.14159265358979323846
#define BOOL int
#define TRUE 1
#define FALSE 0
This way, whenever we see a name written in uppercase, we instantly know it refers to a macro.
In reality, the C language does not enforce any specific rules for macro naming. We can use lowercase, uppercase, or a combination of both. Nevertheless, we believe that adopting this style is good practice.
Macro Names
It is good practice to use uppercase names for macros. This makes it easier to recognize them within the code.
#define MACRO_NAME macro_body
In Conclusion
After introducing the purpose and behavior of the preprocessor, in this lesson we dove into its practical use by studying macros in the C language.
Macros essentially consist of a name, or label, and a body that can be any block of text. Macros differ from functions in that they are based on the mechanism of textual substitution. In other words, the preprocessor replaces the macro name with its body. A macro is not invoked but is instead expanded.
That said, in this lesson we’ve explored the main uses of simple macros. A simple macro is one that does not accept parameters. In this lesson, we saw that we can use them to:
- Define constants;
- Group repeated blocks of code;
- Modify the syntax of the C language;
- Create aliases for data types, a basic building block of so-called opaque types.
These uses already allow us to write efficient and readable programs, improving our productivity.
However, a macro can also accept parameters, just like a function. In that case, we speak of function-like macros. These give the programmer considerable power and flexibility. We’ll study function-like macros in the next lesson.