Recursive Functions in C

Recursion is a fundamental concept in computer science and programming. A recursive function is a function that calls itself.

The C language allows implementing recursive functions in a simple and direct way. In this lesson we will see how to do it and what are the implications of recursion in C.

Recursion

We have seen that in C language a function can call another function which, in turn, can call another one and so on. This mechanism is the basis of code modularity and allows us to decompose more complex problems into simpler problems.

However, there is also the possibility that a function can call itself. This mechanism is called recursion and allows solving complex problems in an elegant and concise way.

A recursive function, in fact, is a function defined in terms of itself.

Many programming languages are based completely on recursion for problem solving. Other languages, instead, forbid it by not allowing recursive function calls.

In this sense, the C language is a programming language that allows recursion, but does not encourage it. This is because recursion can lead to performance and memory problems, if not used correctly.

The classic example to introduce the concept is that of the factorial function.

The factorial function is mathematically defined as:

n! = n \cdot (n-1) \cdot (n-2) \cdot \ldots \cdot 2 \cdot 1

But it can be rewritten in recursive form, that is in terms of itself:

n! = \left\{ \begin{array}{ll} 1 & \text{if } n \leq 1 \\ n \cdot (n-1)! & \text{otherwise} \end{array} \right.

In this case, the factorial function is defined in terms of itself: the factorial of n is equal to n multiplied by the factorial of n-1.

Any recursive function can also be written in iterative form, that is without using recursion and vice versa. The same applies to any recursive algorithm, which can be rewritten in iterative form and vice versa.

From this point of view, the factorial function represents precisely a classic example of this duality.

Let's try, therefore, to implement a first version of the factorial function in iterative form, that is using a for loop:

int iterative_factorial(int n) {
    int factorial = 1;
    for (int i = 1; i <= n; i++) {
        factorial *= i;
    }
    return factorial;
}

In this case, the iterative_factorial function calculates the factorial of an integer number n in an iterative way, multiplying all numbers from 1 to n. We have used, that is, the first mathematical definition of the factorial seen above.

Let's now try to implement the factorial function in recursive form:

int recursive_factorial(int n) {
    if (n <= 1) {
        return 1;
    } else {
        return n * recursive_factorial(n - 1);
    }
}

The first observation is that the function in its own body calls itself at a certain point. No particular syntax is necessary. This is one of the characteristics of recursion in C: no particular construct is necessary to call a recursive function. In other languages instead, you must explicitly declare a function as recursive, such as in Fortran.

Definition

Recursive Function in C Language

A recursive function is a function that directly invokes itself in its own body.

return_type function(parameters) {
    /* ... */
    function(parameters);
    /* ... */
}

Now, let's try to analyze what happens if we invoke the recursive version in this way:

int main() {
    int n = 3;
    printf("The factorial of %d is %d\n", n, recursive_factorial(n));
    return 0;
}

In this example, we are invoking the function with a parameter 3:

Invocation of the factorial function with Argument equal to 3
Picture 1: Invocation of the factorial function with Argument equal to 3

Inside the function, the if statement will select the else branch since n is greater than 1. Therefore, the function will invoke itself again but this time with parameter 2. The call chain will become:

Call chain after the first recursive call
Picture 2: Call chain after the first recursive call

Similarly, the recursive_factorial(2) function will in turn call recursive_factorial(1):

Call chain after the second recursive call
Picture 3: Call chain after the second recursive call

When invoked with parameter 1, the function will enter the if branch and return 1. This branch takes the special name of base case and is that condition that stops the recursion:

Reaching the base case
Picture 4: Reaching the base case

Note that the calls to the recursive function at first stack one on top of the other, until reaching the base case. This first phase is called Pile Up.

Once the base case is reached, the calls are resolved one at a time, starting from the last call made.

Therefore, the result of the recursive_factorial(1) call, that is 1, is substituted in the recursive_factorial(2) call:

Factorial: Return of the first result
Picture 5: Factorial: Return of the first result

Subsequently, the result of recursive_factorial(2) is substituted in the recursive_factorial(3) call:

Factorial: Return of the second result
Picture 6: Factorial: Return of the second result

Finally, the result of recursive_factorial(3), that is 6, is returned to the main function:

Factorial: Return of final result
Picture 7: Factorial: Return of final result

Note that in this second phase we climbed back up the call chain, starting from the last call made. This second phase is called Unwind or Unwinding:

Factorial: Pile Up Phase and Unwind Phase
Picture 8: Factorial: Pile Up Phase and Unwind Phase
Definition

Pile Up and Unwind

A call to a recursive function is divided into two phases:

  • Pile Up: the function calls stack one on top of the other until reaching the base case.
  • Unwind: the calls are resolved one at a time, starting from the last call made.

All this is possible thanks to the presence of the Call Stack which allows storing function calls in an ordered manner. Each recursive call, in fact, creates a Stack Frame on the stack that contains the parameters and local variables of the function.

Of fundamental importance in a recursive function is the so-called base case. The base case is a condition that, if verified, stops the recursion. In the absence of a base case, the recursive function would continue to call itself infinitely, causing a stack overflow. In fact, each call to the recursive function consumes memory on the stack, and if there is no way to stop the recursion, the stack will fill up until it is no longer able to contain new calls.

Note

Never forget the base case in a recursive function

The base case of a recursive function is that condition that stops the recursion.

Its presence must never be overlooked, otherwise the recursive function could continue to call itself infinitely, causing a stack overflow.

This error is the recursive equivalent of an infinite loop.

A general scheme, therefore, for a recursive function is the following:

return_type function(parameters) {
    if (base_case) {
        /* Base case */
        return value;
    } else {
        /* General case */
        return function(parameters);
    }
}

Example: Power Calculation

Let's try to create a function that calculates the power between integers.

A first iterative version could be written keeping in mind that the power of a number with an integer exponent can be calculated by multiplying the number by itself as many times as the exponent:

b ^ e = \underbrace{b \cdot b \cdot b \cdot \ldots \cdot b}_{e \text{ times}}

So we can write the function in this way:

int iterative_power(int base, int exponent) {
    int power = 1;
    for (int i = 0; i < exponent; i++) {
        power *= base;
    }
    return power;
}

In this case, the iterative_power function calculates the power of an integer number base raised to another integer number exponent in an iterative way, multiplying base for exponent times using a for loop.

Wanting to realize the same function in a recursive way, we can redefine the power in terms of itself:

b ^ e = \left\{ \begin{array}{ll} 1 &amp; \text{if } e = 0 \\ b \cdot b^{e-1} &amp; \text{otherwise} \end{array} \right.

In this case, the recursive_power function calculates the power of an integer number base raised to another integer number exponent in a recursive way, multiplying base by the power of base with exponent exponent - 1.

int recursive_power(int base, int exponent) {
    if (exponent == 0) {
        /* Base case */
        return 1;
    } else {
        /* General case */
        return base * recursive_power(base, exponent - 1);
    }
}

Trying to call the recursive_power function with parameters 2 and 3, we would get the following call scheme:

Power: Recursive Call Chain
Picture 9: Power: Recursive Call Chain

Example: The Fibonacci Sequence

The Fibonacci Sequence is a particular numerical sequence of integers:

0, 1, 1, 2, 3, 5, 8, 13, 21, 34, \ldots

This sequence starts with 0 and 1 and has the peculiarity that each successive element is the sum of the two previous elements.

The Fibonacci sequence appears often in nature and in mathematics, and has been studied since ancient times. It describes, for example, the structure of a spiral. At infinity, the ratio between two successive elements tends to the constant value of:

\varphi = \frac{1 + \sqrt{5}}{2} \approx 1.618033988749895

Which is called the Golden Ratio.

The Fibonacci sequence can be defined recursively as follows:

F(n) = \left\{ \begin{array}{ll} 0 &amp; \text{if } n = 0 \\ 1 &amp; \text{if } n = 1 \\ F(n-1) + F(n-2) &amp; \text{otherwise} \end{array} \right.

Let's try to implement a C program that takes an integer number n as input and returns the n-th element of the Fibonacci sequence.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
#include <stdio.h>

/* Function prototype */
unsigned long long int fibonacci(unsigned int n);

int main() {
    /* User input */
    unsigned int n;

    /* Request the number from the user */
    printf("Enter an integer number: ");
    scanf("%u", &n);

    /* Calculate and print the result */
    printf("The element %u of the Fibonacci sequence is %llu\n", n, fibonacci(n));
    return 0;
}

/* Function definition */
unsigned long long int fibonacci(unsigned int n) {
    if (n == 0 || n == 1) {
        return n;
    } else {
        return fibonacci(n - 1) + fibonacci(n - 2);
    }
}

Trying to compile and execute the program, some example outputs could be:

Enter an integer number: 0
The element 0 of the Fibonacci sequence is 0
Enter an integer number: 5
The element 5 of the Fibonacci sequence is 5
Enter an integer number: 10
The element 10 of the Fibonacci sequence is 55
Enter an integer number: 40
The element 40 of the Fibonacci sequence is 102334155

The program invokes the fibonacci function to obtain the result. The function, first checks the base case, then calculates the result as the sum of the two previous elements. To do this, it invokes itself twice.

Line 24, in fact, contains the following instruction:

return fibonacci(n - 1) + fibonacci(n - 2);

This instruction invokes the fibonacci function twice, once with parameter n - 1 and once with parameter n - 2. This is possible thanks to recursion.

Wanting to represent the call chain to the fibonacci function for n = 3, the Pile Up phase is the following:

Fibonacci: Pile Up Phase
Picture 10: Fibonacci: Pile Up Phase

Similarly to what we saw for the factorial function, the Unwind phase occurs when the base case is reached. In this case the bottom of the call chain is composed of multiple calls. Therefore the Unwind phase occurs as shown in the figure:

Fibonacci: Unwind Phase
Picture 11: Fibonacci: Unwind Phase

Evaluation Order of Recursive Functions

The function for calculating Fibonacci numbers presents the same problem that we faced when we studied pure functions, impure functions and side effects.

Let's take the fibonacci function. When we invoke the function with parameter n = 3:

fibonacci(3);

This invocation translates to:

return fibonacci(2) + fibonacci(1);

But in this case, which operand is evaluated first? fibonacci(2) or fibonacci(1)?

The C standard does not specify it. One might think that the call to fibonacci(1) is performed before fibonacci(2), assuming that the order is from left to right. However, the compiler to optimize the executable code might choose to invoke fibonacci(2) before fibonacci(1). Therefore one should not make assumptions about the evaluation order of operands.

In this case, evaluating fibonacci(1) or fibonacci(2) first does not change the final result, since fibonacci is in all respects a pure function. Therefore, the evaluation order of operands is not relevant.

There are cases, however, in which functions might, internally, modify global variables or have side effects. In these cases, the evaluation order of operands could be relevant.

In general, with recursive functions it is always advisable to avoid side effects and use only pure functions.

Exponential Complexity

A problem with recursive functions is that they can lead to performance and memory problems, if not used correctly.

Let's return to the case of the fibonacci function:

unsigned long long int fibonacci(unsigned int n) {
    if (n == 0 || n == 1) {
        return n;
    } else {
        return fibonacci(n - 1) + fibonacci(n - 2);
    }
}

Each level of recursion, in fact, involves two recursive calls. This means that the number of calls doubles at each level.

If we want to calculate the n-th number of the Fibonacci sequence, the number of calls to the fibonacci function will be 2^n.

This could quickly lead to situations where the number of calls explodes exponentially. For example, to calculate the 20th number of the Fibonacci sequence, the fibonacci function will be called 2^{20} = 1,048,576 times, just over one million times!. To calculate the 30th number, the function will be called 2^{30} = 1,073,741,824 times, more than one billion times!.

We can imagine the dramatic nature of this situation if we consider that each call to the fibonacci function consumes memory on the stack. If the number of calls grows exponentially, the stack will fill rapidly, causing a stack overflow.

In algorithm jargon, solutions of this type have an Exponential Complexity and should be avoided where possible.

In the case of the Fibonacci series, we can implement the function iteratively in this way:

unsigned long long int iterative_fibonacci(unsigned int n) {
    unsigned long long int a = 0, b = 1, c;
    if (n == 0) {
        return a;
    }
    for (unsigned int i = 2; i <= n; i++) {
        c = a + b;
        a = b;
        b = c;
    }
    return b;
}

In this case, the iterative_fibonacci function calculates the n-th number of the Fibonacci sequence iteratively, using two variables a and b to store the two previous numbers.

Note

Risk of Exponential Complexity

When using recursive functions one can easily run into cases where the Call Stack is overloaded. This happens when the number of calls to the function grows exponentially.

In such cases it is advisable to avoid recursion and adopt an iterative solution.

Recursion vs Iteration

Recursion is a very powerful and flexible concept, which allows solving complex problems in an elegant and concise way. However, recursion is not always the best solution. Often, an iterative solution, which makes use of while, for and do-while loops, is more efficient and faster than a recursive solution.

Let's compare the two approaches:

  • Both the iterative and recursive approaches make use of control flow statements:

    Iteration makes use of while, for and do-while loops to repeat a block of instructions a finite number of times.

    Recursion relies, instead, on an if statement to stop the recursion or continue it.

  • Both use mechanisms to repeat a block of instructions:

    Iteration repeats a block of instructions as long as a condition is true.

    Recursion repeats a block of instructions by calling itself.

  • Both use mechanisms to stop the repetition:

    Iteration stops the repetition when the loop condition is no longer true.

    Recursion stops the repetition when the base case is reached.

  • Both an iteration and a recursion can, in case of incorrect implementation, lead to infinite loops:

    A poorly implemented while or for loop might never terminate.

    A poorly implemented recursive function might call itself infinitely.

Recursion has various negative aspects. The main one is that it repeatedly uses the mechanism of invoking a function. Invoking a function has a cost both in terms of time and memory.

In fact, each recursive call consumes memory on the stack, since it must create a stack frame where the local variables and function parameters will be saved.

Iteration, conversely, does not have this problem. A for or while loop does not create stack frames and does not consume memory on the stack but always reuses the same variables. The cost of calling a function is not present.

Having said this, then why use recursion?

The truth is that any iterative algorithm can be implemented recursively and vice versa. The recursive approach is chosen when recursion reflects the mathematical nature of the problem itself, thus generating programs that are simpler to read and on which it is easier to debug.

Furthermore, as is evident from the case of the Fibonacci Series, an iterative solution is not always immediate. In simple words, recursion is, often, the simplest and most natural solution.

There is, however, a technique that combines the best of both approaches: Tail Recursion. We will see this technique in the next lessons.

In Summary

In this lesson we have seen that:

  • A function can call itself; this mechanism is called recursion;
  • a recursive function is a function defined in terms of itself;
  • recursion can lead to performance and memory problems, if not used correctly;
  • recursion can be replaced by an iteration and vice versa;
  • a recursive function call can be represented as a call chain and is divided into two phases: Pile Up and Unwind;
  • a recursive function can have exponential complexity while an iterative function does not;
  • recursion is a very powerful and flexible concept, but it is not always the best solution;

Furthermore, we have seen how to implement recursive functions for calculating the factorial, power and Fibonacci series.