The Standard Mathematical Library in C

Key Takeaways

The standard mathematical library of the C language is accessible by including the header <math.h>.
The library provides a wide range of common mathematical functions, such as powers, square roots, trigonometric functions, logarithms and more.
Many compilers require explicitly linking the mathematical library during the linking phase, using the -lm option.
In most implementations, floating-point types follow the IEEE 754 standard for floating-point number representation, which defines concepts such as positive/negative zero, denormalized numbers, special values (NaN, infinity) and rounding modes.
The standard mathematical library reports mathematical errors such as overflow, underflow, division by zero and undefined operations through the variable errno, through floating-point exceptions, or through both.

Use of the standard mathematical library

The C language standard defines a standard mathematical library, accessible by including the header <math.h>, which provides a series of common mathematical functions, such as the calculation of powers, square roots, trigonometric functions, logarithms and more.

To use it, include the header <math.h> at the beginning of the program. Furthermore, many compilers, especially on Unix-like systems, require explicitly linking the mathematical library during the linking phase, using the -lm option placed after the source files (for example, with GCC: gcc -o program program.c -lm).

Some details on the IEEE 754 or IEC 60559 standard

We have seen previously that, in many implementations of the C language, floating-point data types (float and double) follow the IEEE 754 standard (also known as IEC 60559), which defines the representation and behavior of floating-point numbers.

This standard specifies, precisely, how to represent such numbers: using 32 bits for the float type (single precision) and 64 bits for the double type (double precision).

Numbers are stored in binary scientific notation, which consists of a sign, an exponent and a mantissa (or fraction).

To fully understand some behaviors of mathematical functions in C, it is useful to know some key concepts of the IEEE 754 standard:

Positive Zero and Negative Zero:

The IEEE 754 standard represents the sign of floating-point numbers with a dedicated bit. The consequence is that, according to the standard, there are two distinct representations of zero: positive zero (+0) and negative zero (−0).

The fact that zero has two representations can influence the behavior of some mathematical operations, such as division by zero or the calculation of functions involving zero.
Denormalized or subnormal numbers:

When an operation produces a result so close to zero that its magnitude is below the minimum representable for the floating-point data type, a condition called underflow occurs.

To better understand this concept, consider the case in which, using a calculator, we repeatedly divide a number by 10. At a certain point, the result will become so small that the calculator will no longer be able to represent it correctly, and will simply display "0".

This phenomenon can have major repercussions in numerical calculations, since the results of some operations could be approximated to zero, leading to significant errors in subsequent calculations.

To avoid this problem, the IEEE 754 standard defines denormalized numbers (or subnormals), which allow representing very small numbers, albeit with reduced precision. Normally, floating-point numbers are stored with a normalized mantissa, meaning that the mantissa has an implicit leading bit that is always 1. In denormalized numbers this implicit bit is 0 instead, which allows the representation of values closer to zero, with a precision that decreases as the number becomes smaller.
Special values:

The IEEE 754 standard also defines some special values to represent particular situations in floating-point calculations:
- Not a Number (NaN): Represents a value that is not a valid number, resulting from undefined mathematical operations, such as 0/0 or the square root of a negative number.
- Positive Infinity and Negative Infinity: Represent results whose magnitude exceeds the maximum representable for the floating-point data type, or the result of dividing a non-zero number by zero. For example, the result of 1.0 / 0.0 is positive infinity, while -1.0 / 0.0 is negative infinity.
Such special values are valid floating-point values in all respects and can be used in mathematical operations, following specific rules. For example:
- Dividing a finite number by infinity always returns zero (with the appropriate sign).
- Any arithmetic operation involving NaN returns NaN, and NaN compares unequal to every value, including itself.
Rounding Direction:

When a result cannot be exactly represented with a floating-point number (for example, when you want to represent an irrational number like $\pi$ or $\sqrt{2}$ ), it will be represented with an approximation that depends on the rounding direction or rounding mode.

There are four rounding modes defined by the IEEE 754 standard:
1. Round to nearest: the value is rounded to the nearest representable number. In case of equidistance, the candidate whose least significant mantissa bit is zero is chosen (round half to even).
2. Round toward zero: the value is rounded to the nearest representable number whose magnitude does not exceed that of the exact result (truncation).
3. Round toward positive infinity or round toward +∞: the value is rounded to the smallest representable number that is not less than the exact result.
4. Round toward negative infinity or round toward -∞: the value is rounded to the largest representable number that is not greater than the exact result.
Typically, the default rounding mode is round to nearest.
Handling of mathematical errors:

The IEEE 754 standard defines five exceptional conditions (called exceptions) that can occur during floating-point operations:
1. Overflow: Occurs when the magnitude of the result of an operation is too large to be represented in the floating-point data type.
2. Underflow: Occurs when the result of an operation is non-zero but so close to zero that it cannot be represented as a normalized number in the floating-point data type.
3. Division by zero: Occurs when attempting to divide a finite non-zero number by zero.
4. Invalid operation: Occurs when performing an undefined mathematical operation, such as 0/0 or the square root of a negative number.
5. Inexact: Occurs when the result of an operation cannot be exactly represented and must be rounded.
The C standard mathematical library signals these errors in different ways, which we will see later in this lesson.

Types and Macros

The standard mathematical library defines some data types and macros useful for working with floating-point numbers:

Data types:
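- float_t: A floating-point data type at least as wide as float, intended to be the most efficient type the implementation can use to evaluate float values. The type it corresponds to depends on the value of the macro FLT_EVAL_METHOD, defined in <float.h>.
- double_t: A floating-point data type at least as wide as double (and at least as wide as float_t), intended to be the most efficient type the implementation can use to evaluate double values.
Macros:
- INFINITY: A constant expression of type float that represents positive (or unsigned) infinity, if the implementation supports it.
- NAN: A constant expression of type float that represents a quiet "Not a Number" (NaN) value. It is defined only if the implementation supports quiet NaNs.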

Error handling in the mathematical library

Functions defined in the standard mathematical library handle errors in a different way compared to other library functions.

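When a mathematical error occurs (for example, calculating the square root of a negative number), these functions do not return an error code directly. Instead, they set errno, declared in the header <errno.h>, to a specific value that indicates the type of error that occurred. Since library functions never reset errno to zero, a program that wants to detect an error must set errno to 0 before the call and inspect it afterwards.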

Furthermore, when the return value of a function is too large to be represented in the expected data type, a special value HUGE_VAL, HUGE_VALF or HUGE_VALL is returned, depending on the data type (respectively double, float or long double), and errno is set to ERANGE to indicate an overflow error.

HUGE_VAL has type double, HUGE_VALF has type float and HUGE_VALL has type long double. Each represents a very large numeric value indicating that the result is outside the representable range. On implementations that follow the IEEE 754 standard, these macros usually expand to positive infinity.

Mathematical functions in the <math.h> library detect two main types of errors:

Domain errors: Occur when the input provided to a function is outside the valid domain for that function. For example, calculating the square root of a negative number, or the logarithm of a non-positive number.

In these cases, errno is set to EDOM. In some implementations, furthermore, the function returns NAN (Not a Number) to indicate that the result is not defined.
Range errors: Occur when the result of a function is too large (overflow) or too small (underflow) to be represented in the expected data type.

In these cases, errno is typically set to ERANGE, and the function returns HUGE_VAL, HUGE_VALF or HUGE_VALL (with the appropriate sign) to indicate an overflow, or a value very close to zero (often 0 or a subnormal number) to indicate an underflow.

Alternative mechanisms for signaling mathematical errors

Starting from the C99 standard, the standard mathematical library offers alternative mechanisms for error signaling, using appropriate macros.

In general, an implementation of the mathematical library can signal errors in one of these three modes:

Classic mode: functions set errno in case of error.
Floating-point exception mode: functions raise a floating-point exception in case of error (for example, division by zero). Raising an exception sets a status flag in the floating-point environment, which the program can test with the functions declared in the header <fenv.h>, such as fetestexcept. It does not necessarily interrupt the program.
Hybrid mode: functions use both errno and floating-point exceptions to signal errors.

To understand which mode is currently used by the library, a program can use the macro math_errhandling, defined in the header <math.h>. This macro can assume the following values:

MATH_ERRNO: indicates that the library uses the classic mechanism based on errno.
MATH_ERREXCEPT: indicates that the library uses the mechanism based on floating-point exceptions, that is, status flags that can be tested through the header <fenv.h>.
A bitwise OR combination of both previous values, indicating that the library uses a hybrid mechanism.

The value of math_errhandling is determined at compile time and depends on the implementation of the standard mathematical library provided with the compiler. It cannot be modified at runtime.

To verify the value of math_errhandling, we can write a simple program like the following:

#include <math.h>
#include <stdio.h>

int main(void) {
    /* math_errhandling is a bit mask: test each mechanism separately. */
    if (math_errhandling & MATH_ERRNO)
        printf("The mathematical library uses errno for error signaling.\n");

    if (math_errhandling & MATH_ERREXCEPT)
        printf("The mathematical library uses floating-point exceptions for error signaling.\n");

    return 0;
}

In the above example, we used the bitwise AND operator (&) to verify if each of the two error signaling mechanisms is active in the mathematical library in use.

To understand how these signaling mechanisms work in the C99 standard, let's see how they behave in case of overflow and underflow.

In case of overflow, mathematical functions behave as follows:
1. If the result overflows (with the default rounding mode in effect), or if it is an exact infinity, such as when calculating the logarithm of zero $\log(0)$ , the function returns the value HUGE_VAL, HUGE_VALF or HUGE_VALL (depending on the data type of the return value). Furthermore, the result will have the appropriate sign (positive or negative) based on the calculation performed.
2. If the value of math_errhandling includes MATH_ERRNO, the function sets errno to ERANGE.
3. If the value of math_errhandling includes MATH_ERREXCEPT, the function raises a floating-point exception. Specifically, a division by zero exception is raised if the result is an exact infinity, otherwise a floating-point overflow exception is raised.
If the result of a mathematical function is too small to be represented (underflow), functions behave as follows:
1. The function returns a value whose magnitude is no greater than the smallest positive normalized number for the data type of the return value. In practice, this value is often zero or a subnormal number.
2. If the value of math_errhandling includes MATH_ERRNO, the function may set errno to ERANGE to indicate underflow.
3. If the value of math_errhandling includes MATH_ERREXCEPT, the function may raise a floating-point underflow exception.

Note that, in the description of the underflow case, the words may set and may raise indicate that the implementation of the standard mathematical library can choose whether or not to adopt such behaviors based on the platform and compiler in use. In the overflow case, instead, the C99 standard requires the corresponding signal for every mechanism reported by math_errhandling.