Operators for Macros in C

C macros, whether simple or parametric, work by textual substitution, also called expansion: the preprocessor replaces each macro call with the macro body and performs no computation on it.

However, the C language provides two operators that have meaning exclusively within a macro body. In other words, they are handled by the preprocessor and never reach the compiler.

These operators are:

  • The string literal conversion operator: #
  • The concatenation operator: ##

In this lesson we will see how they work and how they can be used to create more complex macros.

String Literal Conversion Operator: #

The first macro-specific operator is the string literal conversion operator, commonly known as the stringizing operator.

It is written as a single # (hash sign) placed in front of a macro parameter, and it converts the argument passed for that parameter into a string literal.

To understand how it works, let's see a concrete application. Suppose we want to introduce diagnostic or debug prints in our program, whose purpose is to print the name and content of some variables, let's say, of type int. The result could be an excerpt of code like this:

int a, b, c;

a = 10;
b = 20;
c = 30;

/* ... other code ... */

/* Check the value of a at this point */
printf("The value of a is: %d\n", a);

/* Check the value of b at this point */
printf("The value of b is: %d\n", b);

/* ... other code ... */

/* Check the value of c at this point */
printf("The value of c is: %d\n", c);

/* Check the value of a again */
printf("The value of a is: %d\n", a);

/* ... and so on ... */

As the code above shows, the pattern is quite repetitive. Furthermore, if we renamed a variable, we would have to update every diagnostic print by hand. This work can be automated with macros. In particular, we can use the string conversion operator to our advantage in this way:

#define PRINT_INTEGER(a) printf("The value of " #a " is: %d\n", a)

Having defined a macro in this way, we can modify the code above like this:

int a, b, c;

a = 10;
b = 20;
c = 30;

/* ... other code ... */

PRINT_INTEGER(a);
PRINT_INTEGER(b);

/* ... other code ... */

PRINT_INTEGER(c);
PRINT_INTEGER(a);

/* ... and so on ... */

Let's take the macro call:

PRINT_INTEGER(a)

When the preprocessor finds this call, it replaces it with:

printf("The value of " "a" " is: %d\n", a)

This is because #a is replaced with "a", which is a string literal. In this way, the name of variable a is printed along with its value.

The statement above is valid because, as we have seen in the chapter dedicated to strings, whenever the compiler encounters adjacent string literals, it automatically joins them. Therefore:

printf("The value of " "a" " is: %d\n", a)

is equivalent to:

printf("The value of a is: %d\n", a)

Summarizing:

Definition

String Literal Conversion Operator for Macros: #

The # operator, called string literal conversion operator or stringizing operator, converts the argument passed for a macro parameter into a string literal.

This operator is valid only within the body of a parametric macro, where it must be immediately followed by one of the macro parameters.

The syntax is:

#define MACRO(parameter) macro_body #parameter macro_body

Concatenation Operator: ##

The second specific operator for macros is the concatenation operator. As the name suggests, this operator takes two tokens as input and glues them together to form a single token.

If either of the two tokens is a macro parameter, the parameter is first replaced by the corresponding argument, and only then are the two tokens joined.

Let's take an example. Suppose we want to define a macro that creates variables sharing a common name and differing only in a numeric suffix. For example, we want to create the variables numeric_identifier_1, numeric_identifier_2, numeric_identifier_3, etc. Proceeding manually, we would have to write:

int numeric_identifier_1;
int numeric_identifier_2;
int numeric_identifier_3;
/* ... and so on ... */

We can create a macro that simplifies the writing of this code in this way:

#define CREATE_ID(number) int numeric_identifier_##number

We can then use it in this way:

CREATE_ID(1);
CREATE_ID(2);
CREATE_ID(3);
/* ... and so on ... */

After the substitution performed by the preprocessor, the resulting code will be:

int numeric_identifier_1;
int numeric_identifier_2;
int numeric_identifier_3;
/* ... and so on ... */

This is exactly the code we would have written manually.

This is because, if we take the line CREATE_ID(1) as an example, the preprocessor first substitutes number with 1, and then joins numeric_identifier_ with 1, obtaining numeric_identifier_1.

Summarizing:

Definition

Concatenation Operator for Macros: ##

The ## operator, called concatenation operator or token-pasting operator, joins two tokens into a single token. The result must itself be a valid token.

This operator is valid only within the body of a macro.

The syntax is:

#define MACRO(parameter) macro_body##parameter

The concatenation occurs only after the substitution of parameters.

In practice, this operator is used fairly rarely, because its fields of application are limited.

One possible use is to create macros that generate code automatically. Suppose we want to write versions of the same function for different types. We could write a macro that, given a type, generates the function code for that type. For example, suppose we want to create functions that add two values. These functions are identical except for the type, since they all have a structure like this:

type sum(type a, type b) {
    return a + b;
}

Without macros, we would have to write a function for each type:

int sum(int a, int b) {
    return a + b;
}

/* INVALID CODE: function name repeated */
float sum(float a, float b) {
    return a + b;
}

/* INVALID CODE: function name repeated */
double sum(double a, double b) {
    return a + b;
}

/* ... and so on ... */

In the code above there is a problem: the C language does not allow defining multiple functions with the same name, even if the parameter types are different. This means that the code above does not compile.

For this reason we should rewrite the code in this way:

int sum_int(int a, int b) {
    return a + b;
}

float sum_float(float a, float b) {
    return a + b;
}

double sum_double(double a, double b) {
    return a + b;
}

/* ... and so on ... */

This is a fairly repetitive and error-prone approach. We can, however, exploit parametric macros with the concatenation operator to automate the creation of these functions:

#define CREATE_SUM_FUNCTION(type) \
    type sum_##type(type a, type b) { \
        return a + b; \
    }

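Thanks to this macro, we can create the functions for types int, float and double in this way:

CREATE_SUM_FUNCTION(int)
CREATE_SUM_FUNCTION(float)
CREATE_SUM_FUNCTION(double)

Note that these calls are not followed by a semicolon: each one already expands to a complete function definition, and the grammar of standard C does not allow a stray semicolon after a function definition, even though many compilers accept it, possibly with a warning.

After the substitution performed by the preprocessor, the line CREATE_SUM_FUNCTION(int) becomes: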

int sum_int(int a, int b) {
    return a + b;
}

This is because type is substituted with int and sum_##type is substituted with sum_int.

In Summary

Macros provide two operators that are recognized only by the preprocessor and have no meaning for the compiler. These operators are:

  • The # operator, which converts a macro argument into a string literal.
  • The ## operator, which joins two tokens into a single token.

These operators are useful for creating more complex macros and for automating the generation of repetitive code.

In the next lesson we will study the general properties of macros, both simple and parametric.