Type Conversion in C

In the C programming language, type conversion is a fundamental concept that allows a value of one type to be transformed into a value of another type. This process is essential to ensure that arithmetic operations and assignments between variables of different types are carried out correctly. Type conversions can be implicit, when the compiler performs them automatically, or explicit, when the programmer requests them manually.

In this article, we will explore the implicit conversion rules in the C language, with particular attention to arithmetic conversions and conversions in assignment phase. Additionally, we will discuss the specific conversion rules introduced with the C99 standard. Understanding these rules is crucial for writing robust and efficient C code, avoiding common errors related to type handling.

Implicit Type Conversions

Computers have very strict rules regarding arithmetic operations.

When a processor must perform an arithmetic operation, the operands must be of the same size, therefore having the same number of bits, and must be stored (or rather encoded) in the same way.

A processor can, for example, directly add two signed 16-bit integers. In general, however, it cannot directly add an unsigned 16-bit integer to a signed 16-bit integer, or a 16-bit integer to a 32-bit integer. Still less can it directly add a 32-bit integer to a 32-bit floating-point number.

In C language, however, we have seen that we can mix variables of different types in expressions. We can, in fact, combine integer variables, floating-point variables, and character type variables in a single expression. For example:

int integerA = 5;
float floatB = 3.14;
char characterC = 'A';

float floatD = integerA + floatB;
int integerE = integerA + characterC;

But how does the C compiler manage to handle these mixed expressions?

The answer is that the compiler, when it encounters an expression with operands of different types, inserts conversion instructions to convert the operands into a common format. In other words, without us having to indicate it explicitly, the C compiler generates conversion instructions in such a way that, subsequently, the processor can execute the arithmetic operation.

For example, if we try to add a 16-bit short integer with a 32-bit int integer, the C compiler will insert conversion instructions to convert the short integer into a 32-bit int integer. In this way, the processor will be able to add the two integers without problems.

Similarly, if we try to add an int and a float, the C compiler will insert conversion instructions to convert the integer into a floating-point number. In this case the conversion instructions will be much more complex since the representation of a floating-point number is very different from that of an integer. But the C compiler will take care of all this in a transparent manner for us.

Since the compiler handles these conversions without the programmer's involvement and in a completely transparent way, these conversions are called implicit conversions.

Definition

Implicit Type Conversion

In C, an implicit conversion is a data type conversion that the compiler performs automatically, without the programmer having to request it explicitly.

There is also another type of conversion, called explicit conversion, which we will see in the next lesson. It takes this name because the programmer must explicitly tell the compiler to convert one data type into another.

Unfortunately, the implicit conversion rules are quite complex, since the C language has many numeric types. In this lesson we will see how they work.

Situations in which Implicit Conversions Occur

The implicit conversions performed by C are applied in these four situations:

1. When the operands of an arithmetic operation are not of the same type.

   For example, when we add an integer and a floating-point number. In this case, the C compiler performs the so-called arithmetic conversions.
2. When the type of an expression on the right of an assignment is different from the type of the variable on the left of the assignment.

   For example, when we assign a floating-point number to an integer variable. In this case, the C compiler performs the so-called conversions in assignment phase.
3. When we pass an argument to a function of a type different from the type of the function parameter.

   For example, when we pass an integer to a function that accepts a floating-point number. In this case, the C compiler performs the so-called conversions in function call phase. We will study this situation in the next lessons, when we study functions in C.
4. When we return a value from a function of a type different from the return type of the function.

   For example, when we return a floating-point number from a function that should return an integer. In this case, the C compiler performs the so-called conversions in function return phase. We will also study this situation in the next lessons.

For the moment, we will focus on the first two situations, namely arithmetic conversions and conversions in assignment phase.

Arithmetic Conversions

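Arithmetic conversions in C are typically applied when a binary operator is used between operands of different types. This applies to most binary operators, including the arithmetic, relational, and equality operators. The logical operators && and || are an exception: each of their operands is simply tested against zero.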

To understand the logic behind implicit arithmetic conversions, let's take an example:

int integerA = 5;
float floatB = 3.14;

integerA + floatB;

We have, in this example, the sum between an int integer and a float floating-point number. The C compiler faces the first situation seen above: the operands of an arithmetic operation are not of the same type.

What does the compiler do in this case?

There are two possible solutions:

1. Convert the float into an int, obtaining two integers, and add them together.

   This is possible, but it is not the best solution. Converting the float into an int raises two problems:
   - First, we would lose the fractional part of the floating-point number. Here 3.14 would become 3, so the sum would be 5 + 3 = 8, with a significant loss of precision.
   - Second, a float value is not necessarily representable as an integer. Take for example 2.5e30: it is far too large for a 32-bit integer, so the behavior of the conversion from float to int would be undefined.
2. Convert the int into a float, obtaining two floating-point numbers, and add them together.

   This is the better solution. The range of float covers every int value, so the conversion always succeeds. At worst, a very large integer loses some of its least significant digits.

The C compiler, therefore, chooses the second solution and converts the int integer into a float floating-point number. In this way, the processor can add the two floating-point numbers without problems.

From this we can understand what the general strategy is for implicit conversion of arithmetic types in C:

Definition

Implicit Arithmetic Conversion Strategy

In C language, when a binary operator is applied to two operands of different type, the C compiler converts the operands to the smallest type capable of containing both of them.

In other words, the compiler chooses the type with the least number of bytes capable of representing both operands with the minimum loss of information.

In most cases, implicit conversion amounts to an extension to a similar type that occupies more bytes. For example, adding a short to an int simply consists of extending the short with additional bytes on the left, preserving its sign, until it reaches the size of an int. After all, both short and int are integer types and, on virtually all current platforms, they use the same two's complement representation, differing only in the number of bits.

This type of conversion takes the name of Type Promotion:

Definition

Type Promotion

Type promotion is an implicit conversion that consists of extending a data type to a type that occupies more bytes but which uses the same representation.

Note that we talk about promotion when the types are similar, that is, both integers or both floating-point. We do not talk about promotion when we convert an integer into a floating-point number, or vice versa, since the two types are very different.

Moreover, in some cases promotion always occurs. For example, when an operator is applied to two short or two char operands, they are always first converted to int and then the operator is applied. This is because int is normally the type that the processor handles most efficiently (typically 32 bits on current platforms), whereas arithmetic on 8-bit or 16-bit integers offers no advantage. This type of promotion takes the name of Integral Promotion.

Definition

Integral Promotion

Integral promotion is an implicit conversion that consists of extending an integer type smaller than int to the type int.

It is always applied to operands of type char or short in an arithmetic expression.

Arithmetic Conversion Rules

Now let's analyze the arithmetic conversion rules applied to an expression composed of a binary operator and two operands of different type.

We can divide these rules into two groups:

When one or both operands are floating-point:

In this case the following scheme is used:

Picture 1: Implicit conversion rules for floating-point types

In other words, we reason like this:
1. If one operand is of type long double, the other operand is converted to long double.
2. Otherwise, if one operand is of type double, the other operand is converted to double.
3. Otherwise, if one operand is of type float, the other operand is converted to float.
These three rules also cover the mixed case. In fact, if one of the two operands is of type double but the other is of integer type, the compiler converts the integer to double and then adds the two floating-point numbers.
When both operands are integers:

In this case the following scheme is used:

Picture 2: Implicit conversion rules for integer types

First of all, the integral promotion seen above is applied, if possible. In this way we have the guarantee that neither of the two operands is of type char or short.

Subsequently, the smallest integer type capable of containing both operands is chosen.

Arithmetic Conversion and Unsigned Integer Types

When, in an arithmetic expression, signed integer types and unsigned types are combined, the rules become slightly complicated.

Let's take the following example:

unsigned int unsignedA = 5;
int integerB = 3;

unsignedA + integerB;

In this case, we have an unsigned integer unsigned int and a signed integer int. Both use the same number of bits. What conversion rules are applied in this case?

In this case something happens that, at first glance, may seem counter-intuitive. In fact, the C compiler converts the signed integer int into an unsigned integer unsigned int. In this way, the processor can add the two integers without problems.

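This rule applies, however, only when the two types use the same number of bits. If we add an unsigned int and a long int on a platform where long int is wider than unsigned int, as on most 64-bit Unix-like systems, the compiler converts the unsigned int into a long int and not vice versa, since a long int can then represent every unsigned int value. Where both types are 32 bits wide, as on 64-bit Windows, both operands are instead converted to unsigned long int.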

Definition

Arithmetic Conversion Rule for Unsigned Integer Types

When, in an arithmetic expression, a signed integer and an unsigned integer are combined, the C compiler converts the signed integer into an unsigned integer if the two types use the same number of bits.

In case the signed type has a greater number of bits than the unsigned type, the compiler converts the unsigned integer into the signed integer.

This conversion, however, can cause errors that are very difficult to detect.

In fact, if the signed integer is positive, the conversion does not cause problems. But if the signed integer is negative, the conversion into an unsigned integer can lead to unexpected results.

To clarify, let's take an example:

unsigned int unsignedA = 10;
int integerB = -5;

if (unsignedA > integerB) {
    printf("unsignedA is greater than integerB\n");
} else {
    printf("integerB is greater than unsignedA\n");
}

At first glance, if we observe the code written above we might think that the output of the program is always unsignedA is greater than integerB.

But this does not happen.

In fact, the compiler converts the signed integer integerB into an unsigned int. In doing so, the value $2^n$ is added to integerB, where $n$ is the number of bits of an unsigned int.

This happens because the most significant bit of integerB, which in a signed integer carries the sign, is interpreted as an ordinary bit of greatest weight once the value is converted into an unsigned integer. In this way, the value of integerB is interpreted as $2^n + integerB$, where $n$ is the number of bits of the integer.

If an int is represented with 32 bits, then in the conversion integerB becomes:

$integerB = -5 + 2^{32} = 4294967291$

Therefore, in the comparison between unsignedA and integerB, integerB will be greater than unsignedA!

Note

Avoid combining signed and unsigned types in an arithmetic expression

The advice we give is to avoid combining signed and unsigned types in an arithmetic expression. This is because implicit conversions can lead to unexpected results.

Some compilers report this situation with a warning. For example, gcc with the -Wsign-compare option (which -Wextra enables for C code) reports a message along these lines, whose exact wording depends on the compiler version:

warning: comparison between signed and unsigned

If you receive a warning of this type, review the code and try to avoid combining signed and unsigned types.

Summary Example of Arithmetic Conversion

Let's summarize the arithmetic conversion rules with the following summary example:

char characterC;
short int shortS;
int integerI;
unsigned int unsignedU;
long int longL;
unsigned long int unsignedLongUL;
float floatF;
double doubleD;
long double longDoubleLD;

/* ... */

integerI = integerI + characterC;          /* characterC is converted to int */
integerI = integerI + shortS;              /* shortS is converted to int */
unsignedU = unsignedU + integerI;          /* integerI is converted to unsigned int */
longL = longL + unsignedU;                 /* unsignedU is converted to long int */
unsignedLongUL = unsignedLongUL + longL;   /* longL is converted to unsigned long int */
floatF = floatF + unsignedLongUL;          /* unsignedLongUL is converted to float */
doubleD = doubleD + floatF;                /* floatF is converted to double */
longDoubleLD = longDoubleLD + doubleD;     /* doubleD is converted to long double */

In this example, we have a series of variables of different types. In each line, we have an assignment of an arithmetic expression to a variable. In each arithmetic expression, the C compiler converts the operands in order to be able to execute the arithmetic operation.

Conversions in Assignment Phase

The rules seen above for implicitly converting arithmetic types in an expression do not apply during assignment.

The C language, instead, follows the following simple rule:

Definition

Conversion Rule in Assignment Phase

In an assignment in C language, the expression on the right of the assignment operator is converted to the type of the variable on the left of the assignment.

type variableX = expression

The expression is converted to the type type before being assigned to the variable variableX.

At this point, two cases can occur:

The variable is of a type capable of containing the result:

In this case the conversion occurs without problems. For example:
```
char characterC;
int integerI;
float floatF;
double doubleD;

integerI = characterC;      /* characterC is converted to int */
floatF = integerI;          /* integerI is converted to float */
doubleD = floatF;           /* floatF is converted to double */
```
In the example above, the range of each destination type covers the range of the source type: int covers char, float covers int, and double covers float. The implicit conversions therefore never produce an out-of-range value. Note, however, that a float has fewer significant bits than a 32-bit int, so a very large int value may be rounded when it is assigned to a float.
The variable is not capable of containing the result:

Let's take an example where we assign a floating-point value to an integer:
```
int integerI = 3.14;
```
In this case, the C compiler converts the floating-point number 3.14 into an integer removing the fractional part. Therefore, the variable integerI will be worth 3.

Note that this is not a rounding. It is a pure and simple truncation. Therefore:
```
int integerI = 8.9999;
```
The variable integerI will be worth 8 and will not be rounded to 9.

Moreover, situations can occur in which the assignment is meaningless or, in the worst case, the result is outside the range representable by that type.

For example:
```
char characterC = 10000;
```
In this case, the number 10000 is too large to be represented with a char, which on most platforms has 8 bits and can represent either the numbers from 0 to 255 (if unsigned) or from -128 to 127 (if signed). In practice, the most significant digits are truncated and the variable characterC will be worth 16. Strictly speaking, the result is guaranteed by the language only for unsigned types; for a signed char it depends on the implementation.

This is because in binary the number $10000$ is:

$10000_{10} = 10011100010000_2$

But the char type can only contain the 8 least significant binary digits:

$10000_{10} = 100111\mathbf{00010000}_2 \rightarrow 00010000_{2} = 16_{10}$

We have obtained a completely different number!

Implicit Conversion Rules in C99

The implicit conversion rules become slightly complicated in the C99 standard.

The reason lies in the fact that C99 introduces both the boolean type _Bool and the new integer type long long. Moreover it also adds the complex type which we will see later.

To be able to define the conversion rules, C99 introduced the so-called integer type conversion hierarchy, which the standard calls the integer conversion rank. This hierarchy defines the conversion order between integer types and is defined as follows, from the highest to the lowest level:

long long int and unsigned long long int
long int and unsigned long int
int and unsigned int
short int and unsigned short int
char, signed char and unsigned char
_Bool

The Integral Promotion rule is thus redefined in C99:

Definition

Integral Promotion in C99

All integer types that are in the integer type conversion hierarchy below the level of int and unsigned int are automatically promoted to:

int if an int can represent all the values of the original type;
unsigned int otherwise.

Having said this, we can now define the implicit conversion rules in C99:

If one of the two operands is of a floating-point type the same rules seen above apply:
- If one of the two operands is of type long double, the other operand is converted to long double.
- Otherwise, if one of the two operands is of type double, the other operand is converted to double.
- Otherwise, if one of the two operands is of type float, the other operand is converted to float.
If both operands are of integer type the following rules are applied:

First, integral promotion is performed if necessary.

If, after this first step, the two operands have the same type, the conversion process ends.

Otherwise, the following rules are used, stopping at the first one that matches:
1. If both operands are signed or both are unsigned, the operand with the lower type in the hierarchy is converted to the type of the operand with the higher type.
2. If the unsigned operand has a higher type in the hierarchy or is at the same level as the signed operand, the signed operand is converted to the type of the unsigned operand.
3. If the type of the signed operand can represent all the values of the type of the unsigned operand, the unsigned operand is converted to the type of the signed operand.
4. Otherwise, both operands are converted to the unsigned type at the same level of the hierarchy as the type of the signed operand.

In Summary

In this lesson we have seen how the C compiler handles implicit type conversions.

We have seen that:

The C compiler inserts conversion instructions to convert operands into a common format when it encounters an expression with operands of different types.
These conversions are called implicit conversions.
There are four situations in which implicit conversions occur: arithmetic conversions, conversions in assignment phase, conversions in function call phase and conversions in function return phase.
Arithmetic conversions are applied when binary operators are used between operands of different type.
The implicit arithmetic conversion strategy in C consists of converting the operands to the smallest type capable of containing both of them.
Type promotion is an implicit conversion that consists of extending a data type to a type that occupies more bytes but which uses the same representation.
Integral promotion is an implicit conversion that extends integer types smaller than int, such as char and short, to the type int.
When signed and unsigned types are combined in an arithmetic expression, the C compiler converts the signed integer into an unsigned integer if the two types use the same number of bits.
The conversion rules in assignment phase in C provide that the expression on the right of the assignment operator is converted to the type of the variable on the left of the assignment.
In C99, the implicit conversion rules become slightly complicated due to the introduction of the boolean type _Bool and the integer type long long.

In the next lesson we will see how to perform explicit type conversions in C language: this type of conversion takes the name of casting.