Pure and Impure Functions and Side Effects in C

A function can operate on local variables that are visible only inside the function itself. However, a function in C can also modify global variables, which is an example of a side effect.

Since a function call can appear inside an expression, side effects can change the final result of that expression.

For this reason, in this lesson we will introduce some fundamental concepts: pure functions and impure functions. We will see what side effects are and how they can influence the final result of an expression.

The Concept of Pure Function and Impure Function

In previous lessons we have studied functions and the mechanisms to define and call them. We have also explored the concepts of local variables, static variables and global variables; in other words the scope of variables.

With these concepts in mind, we can introduce a distinction that looks purely theoretical at first glance but has important repercussions on C code development: pure functions and impure functions.

A function in C language is said to be pure if it satisfies two conditions:

  1. Given the same arguments, it always returns the same result;
  2. It does not modify the global state of the program.

Let's analyze these two properties individually.

Let's start with a first example of a pure function: a function that calculates the length of the hypotenuse of a right triangle given the length of the two legs.

#include <math.h>

double hypotenuse(double leg1, double leg2) {
    double result = sqrt(leg1 * leg1 + leg2 * leg2);
    return result;
}

Now, we can invoke this function in this way:

double length = hypotenuse(3.0, 4.0);

The value of length will be 5.0, and it will be 5.0 every time: if we invoke hypotenuse(3.0, 4.0), the result is always 5.0, regardless of the context and of how many times we invoke the function.

The function works only on its parameters and on result, a local variable that is invisible outside hypotenuse. Therefore the hypotenuse function does not modify the global state of the program.

But what is meant by global state of the program? It means the state of global variables and static variables. If a function modifies a global variable or a static variable, then the function is not pure.

Definition

Global State of a Program

The global state of a program is composed of:

  1. Global variables and their current content;
  2. Static variables and their current content.

Let's take an example of an impure function:

int counter = 0;

int function(int x) {
    counter += x;
    return x + 1;
}

In this case, the function named function does always return x + 1, but it also modifies the global state of the program by adding x to the global variable counter. Therefore the second condition is violated.
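A static local variable behaves like the global counter in this respect, because it keeps its value between calls. The following is a minimal sketch of that case; the names next_ticket, issued and check_next_ticket are hypothetical and chosen only for illustration.

```c
#include <assert.h>

/* Impure: the static variable issued keeps its value between calls,
   so it belongs to the global state of the program. */
int next_ticket(void) {
    static int issued = 0;  /* initialized once, not on every call */
    issued += 1;
    return issued;
}

/* Hypothetical check: two identical calls give two different results. */
void check_next_ticket(void) {
    int first = next_ticket();
    int second = next_ticket();
    assert(second == first + 1);
}
```

Here the function takes no arguments at all, yet each call returns a different value, so it violates both conditions.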

Similarly, if a function reads the global state of the program, then it is not pure. For example:

int global_variable = 10;

int function(int x) {
    return x + global_variable;
}

In this case, the function named function reads the value of the global variable global_variable. It does not modify the global state, but reading it is enough to make the function impure: as soon as the value of global_variable changes, the same argument produces a different result. Therefore the first condition is violated.
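A short sketch can make the violation of the first condition visible. It reuses the function above and adds a hypothetical helper, check_function, that changes global_variable between two identical calls.

```c
#include <assert.h>

int global_variable = 10;

/* Impure: the result depends on global_variable as well as on x. */
int function(int x) {
    return x + global_variable;
}

/* Hypothetical check: same argument, different results. */
void check_function(void) {
    global_variable = 10;
    int before = function(5);   /* 5 + 10 */

    global_variable = 100;
    int after = function(5);    /* 5 + 100 */

    assert(before == 15);
    assert(after == 105);
}
```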

Definition

Pure Function

A function is said to be pure if:

  1. Given the same arguments, it always returns the same result;
  2. It accesses only its parameters and its non-static local variables.

The name pure derives from the fact that a function of this type closely resembles the mathematical concept of a function.

In general, a function in C language and a mathematical function are different entities. A mathematical function is a relationship between two sets of values, and it cannot return different values for the same arguments.

Let's take, for example, the mathematical functions sine and cosine. These functions always return the same value given equal arguments:

\sin(\pi) = 0
\cos(\pi) = -1

The sine and cosine functions will always return these values given equal arguments. This is the concept of pure function.

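Similarly, the C functions that calculate the sine and cosine, respectively sin and cos, always return the same value given equal arguments. For our purposes these functions are therefore pure functions. Keep in mind that they work on floating-point approximations: since π cannot be represented exactly as a double, sin applied to the closest double returns a value very close to 0 rather than exactly 0.

But in C language we can also write, as seen above, impure functions. These functions, while not being mathematical, are still useful for creating complex programs.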

Definition

Impure Function

A function is said to be impure if it violates at least one of the two conditions of pure functions.

I/O Functions are Impure by Definition

There is another case in which functions can be impure: the case in which they make use of I/O functionalities, or input/output.

Examples of I/O are writing/reading from console, writing/reading to file, connection to a database, etc.

A classic example is scanf, which reads from the console. It stores the value read in the variable whose address we pass to it, and it returns the number of items it managed to read. Both depend on the state of the console, therefore the scanf function is impure.

int x;

scanf("%d", &x);

The printf function is also impure because it writes to console.

printf("Hello, World!\n");

In general, all functions that read from or write to the outside world are impure.
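A common way to limit the consequences is to keep the computation in a pure function and to confine the I/O to a thin impure wrapper. The following is only a sketch of this idea; square, print_square and check_square are hypothetical names introduced for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Pure: no I/O, the result depends only on the argument. */
int square(int n) {
    return n * n;
}

/* Impure: writes to the console. It returns what printf returns,
   that is the number of characters written, or a negative value on error. */
int print_square(int n) {
    return printf("%d squared is %d\n", n, square(n));
}

/* Hypothetical check: the pure part can be verified without any console. */
void check_square(void) {
    assert(square(7) == 49);
}
```

With this split, only print_square needs care when it appears inside an expression; square can be called anywhere.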

Definition

Functions that Use I/O are Impure

Functions that use I/O functionalities are impure:

  • Reading/Writing from/to console;
  • Reading/Writing to file;
  • Connection to a database;
  • Network connection.

and so on.

Side Effects

Now that the concepts of pure function and impure function are clear, we can return to and expand on a concept that we have already encountered: side effects.

We have already studied the fact that operators can have side effects. In other words, there are operators that not only read the value of their operands but modify them.

For example:

int x = 10;
int y = 20;

int c = (++x) - (y--);

In this example, the ++ operator increments the value of x and the -- operator decrements the value of y. These operators have side effects on their operands.
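To make the effect concrete, the sketch below repeats the same statements inside a hypothetical function, check_operator_side_effects, and verifies the three values once the expression has been evaluated.

```c
#include <assert.h>

/* Hypothetical check of the values left behind by ++ and --. */
void check_operator_side_effects(void) {
    int x = 10;
    int y = 20;

    /* ++x yields the incremented value 11; y-- yields the old value 20. */
    int c = (++x) - (y--);

    assert(x == 11);   /* side effect of ++ */
    assert(y == 19);   /* side effect of -- */
    assert(c == -9);   /* 11 - 20 */
}
```

The expression is safe because x and y are two different variables and each one appears only once.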

The concept of side effect can be extended to the case of modification of the global state of the program.

In particular, an impure function that modifies the global state of the program has a side effect. A pure function, by contrast, has none.

Definition

Side Effect for Functions

A function is said to have a Side Effect when it has any observable effect beyond reading the values of parameters and returning a value.

Therefore the second condition of pure functions can be reformulated as:

Definition

Second Condition of Pure Functions

A necessary condition for a function to be pure is that it has no side effects.
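Under this broader definition, a function can have a side effect without touching any global variable. The sketch below shows one such case, assuming a hypothetical function double_in_place that receives a pointer to the caller's array.

```c
#include <assert.h>
#include <stddef.h>

/* Impure: overwrites the caller's array, an effect that remains
   visible after the function has returned. */
void double_in_place(int *values, size_t count) {
    for (size_t i = 0; i < count; i++) {
        values[i] *= 2;
    }
}

/* Hypothetical check: the caller's data has changed. */
void check_double_in_place(void) {
    int data[3] = {1, 2, 3};
    double_in_place(data, 3);
    assert(data[0] == 2 && data[1] == 4 && data[2] == 6);
}
```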

Order of Evaluation of Expressions and Side Effects

At first glance, the concepts of pure function, impure function and side effect seem purely theoretical: definitions that are ends in themselves.

However, in C language, as in other programming languages, they have an important practical implication that concerns the order of evaluation of expressions.

In a previous lesson we saw that C language defines precedence rules, which determine how the operators of an expression are grouped.

For example, let's take the following expression:

int x = 10;
int y = 20;
int z = 30;

int result = x * y + z;

In this case, the * operator has precedence over the + operator. Therefore the multiplication x * y is performed first, and its result is then added to z.

The C standard imposes very precise rules on operator precedence and all compilers must respect them.

However, for most operators, including the arithmetic ones, the C standard does not specify the order in which the operands are evaluated. Until now, we have not encountered this problem, because our expressions contained only variables. Evaluating an operand that is a variable simply means reading its content. But what happens if we mix expressions and function invocations?

Let's take an example of an expression that contains multiple function calls:

int result = function1(x) * function2(y) + function3(z);

In this case, according to the precedence rules of C, the operators are applied in the following order:

  1. First the multiplication: function1(x) * function2(y);
  2. Then the addition: (function1(x) * function2(y)) + function3(z);
  3. Finally the assignment: result = ((function1(x) * function2(y)) + function3(z)).

But the C standard stops here. If we take the multiplication:

function1(x) * function2(y)

before we can perform it, we must evaluate the operands, that is we must invoke the two functions function1 and function2 and take their results. But in what order will the two functions be executed?

Unfortunately the C standard does not impose any rule in this regard. One could make the assumption that operands are evaluated from left to right, that is function1(x) is evaluated first and then function2(y). But this is not necessarily the case.

In fact, a compiler could apply optimizations and execute the two functions in a different order.
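One way to observe which order a given compiler chose is to let each function record that it has been called. The following is a sketch under that assumption: call_order, calls and product_with_trace are hypothetical names, and the two functions are deliberately impure because they write to a shared buffer.

```c
#include <assert.h>
#include <stdio.h>

char call_order[3];  /* one character per call, plus the terminator */
int calls = 0;

int function1(int x) { call_order[calls++] = '1'; return x; }
int function2(int y) { call_order[calls++] = '2'; return y; }

int product_with_trace(void) {
    calls = 0;
    int result = function1(3) * function2(4);
    assert(calls == 2);        /* both functions have been called... */
    call_order[calls] = '\0';
    /* ...but the standard allows either "12" or "21" here. */
    printf("call order: %s\n", call_order);
    return result;
}
```

The returned product is 12 in both cases, because the return values do not depend on the shared buffer. Only the recorded trace may differ from one compiler, or one set of options, to another.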

What are the consequences?

Here the concepts of pure function and impure function come into play. If function1, function2 and function3 are pure functions, the order in which they are called does not matter: each call returns the same result for the same arguments and leaves nothing behind that another call could observe.

This does not happen if the functions are impure because they could have side effects.

Let's take an example:

int counter = 0;

int function1(int x) {
    counter += x;
    return counter + 1;
}

int function2(int x) {
    counter -= x;
    return counter + 2;
}

/* ... */

int result = function1(10) * function2(20);

In this case, the value of result depends on the order of evaluation of the two functions:

  • If function1 is evaluated first:

    • counter becomes 10;
    • function1(10) returns 11;
    • counter becomes -10;
    • function2(20) returns -8;
    • result becomes 11 * -8 = -88.
  • If function2 is evaluated first:

    • counter becomes -20;
    • function2(20) returns -18;
    • counter becomes -10;
    • function1(10) returns -9;
    • result becomes -9 * -18 = 162.

The order of evaluation changes the final result.

The problem is precisely that the two functions have side effects on the program: both modify the global variable counter.
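Both scenarios can be reproduced deterministically by calling the two functions in separate statements, so that the order no longer depends on the compiler. The sketch below does this with two hypothetical helpers, left_first and right_first, that reset counter before each run.

```c
#include <assert.h>

int counter = 0;

int function1(int x) { counter += x; return counter + 1; }
int function2(int x) { counter -= x; return counter + 2; }

/* function1 is called before function2. */
int left_first(void) {
    counter = 0;
    int a = function1(10);  /* counter = 10,  a = 11  */
    int b = function2(20);  /* counter = -10, b = -8  */
    return a * b;           /* -88 */
}

/* function2 is called before function1. */
int right_first(void) {
    counter = 0;
    int b = function2(20);  /* counter = -20, b = -18 */
    int a = function1(10);  /* counter = -10, a = -9  */
    return a * b;           /* 162 */
}
```

In both runs counter ends at -10; what differs is the value of counter that each function sees at the moment it is called.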

This example shows us how the concept of pure function and impure function is not only theoretical but has important repercussions on C code development.

In general it is true that:

Definition

Evaluation of an Expression and Side Effects

The order of evaluation of the operands of most operators in C language is unspecified.

However, it is irrelevant for the purposes of the result if the following conditions are met:

  1. Only pure functions are used;
  2. Operators with side effects, such as ++ and --, are not applied to a variable that appears elsewhere in the same expression.

If the two conditions above are not met, we could obtain results different from those expected.

Therefore, we would like to give two recommendations:

Hint

Avoid Combining Impure Functions in Expressions

The first recommendation is to avoid using impure functions in expressions or, if that is not possible, to use no more than one impure function per expression.

Hint

Impose an Order of Evaluation of Operands

If we are really forced to use more than one impure function in an expression, we can impose an order of evaluation of operands.

To do this we can use temporary variables.

Going back to the example above, to be sure that function1 is evaluated before function2 we can write:

int result1 = function1(10);
int result2 = function2(20);

int result = result1 * result2;

In this way, we are sure that function1 is evaluated before function2, because each declaration is completed before the next one begins.

Conclusions

In this lesson we have explored the concepts of pure function and impure function. We have seen that a pure function is a function that always returns the same result given equal arguments and does not modify the global state of the program.

We have seen that impure functions, instead, violate at least one of the two conditions of pure functions.

We have seen that functions that make use of I/O functionalities are impure by definition.

We have introduced the concept of side effect for functions and we have seen that a function has a side effect when it has any observable effect beyond reading the values of parameters and returning a value.

Finally, we have seen that the order of evaluation of the operands of an expression in C language is, in general, unspecified and that, if impure functions are used, this can change the final result.

In the next lesson we will study a fundamental concept concerning functions in C language: the Call Stack.