Integer Constants in C

In this article we will explore the representation of integer numeric constants in the C programming language.

We will discuss the different bases in which these constants can be written: decimal, octal, hexadecimal and binary. Additionally, we will delve into the rules for determining the type of an integer constant depending on its numeric base.

Integer literal numeric values in C

In programs written in C, the need often arises to insert numeric constants within the source code.

For example, supposing we want to create an if statement that compares a variable with a numeric value, it is possible to write the following code:

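#include <stdio.h>

int main(void) {
    int number;

    printf("Insert a number: ");
    scanf("%d", &number);

    /* 100 is an integer constant written directly in the source code */
    if (number >= 100) {
        printf("You inserted a value greater than or equal to 100\n");
    }
    return 0;
}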

In this case, the value 100 is an integer numeric constant that is compared with the value entered by the user. Such constants are also called integer literals.

In C, integer constants can be written in different numeric bases:

  • Decimal: the most common format, in which numbers are written with digits from 0 to 9.

  • Octal: numbers are written only with digits from 0 to 7.

    Each digit of an octal number represents a power of the number 8, in the same way that each digit of a decimal number represents a power of the number 10.

    So, for example, the octal number 142 represents the number:

    1 \cdot 8^2 + 4 \cdot 8^1 + 2 \cdot 8^0 =
    1 \cdot 64 + 4 \cdot 8 + 2 \cdot 1 = 64 + 32 + 2 = 98

  • Hexadecimal: numbers are written with digits from 0 to 9 and letters from A to F.

    The letters from A to F indicate, respectively, the numbers from 10 to 15. Also in this case, each digit of a hexadecimal number represents a power of the number 16.

    For example, the hexadecimal number AB3 represents the number:

    A \cdot 16^2 + B \cdot 16^1 + 3 \cdot 16^0 =
    10 \cdot 256 + 11 \cdot 16 + 3 \cdot 1 = 2560 + 176 + 3 = 2739

  • Binary: numbers are written only with digits 0 and 1.

    Also in this case, each digit of a binary number represents a power of the number 2.

    For example, the binary number 1011 represents the number:

    1 \cdot 2^3 + 0 \cdot 2^2 + 1 \cdot 2^1 + 1 \cdot 2^0 =
    1 \cdot 8 + 0 \cdot 4 + 1 \cdot 2 + 1 \cdot 1 = 8 + 0 + 2 + 1 = 11

In C, a numeric constant written in a base other than decimal must begin with a prefix that indicates the base. Without it, the compiler could not tell which base was intended: 142, for instance, is a valid sequence of digits in decimal, octal and hexadecimal.

The prefixes are as follows:

  • Octal numbers: the prefix is 0 (zero).

    For example, the octal number 142 is written in C as 0142. Below, some examples of octal numbers:

    • 012 (10 in decimal)
    • 034 (28 in decimal)
    • 077 (63 in decimal)

  • Hexadecimal numbers: the prefix is 0x or 0X.

    For example, the hexadecimal number AB3 is written in C as 0xAB3. Below, some examples of hexadecimal numbers:

    • 0x0A (10 in decimal)
    • 0x1F (31 in decimal)
    • 0x7F (127 in decimal)

    An important detail regarding hexadecimal constants is that the letters can be written in both uppercase and lowercase.

    For example, the following values are equivalent:

    • 0xAB3 is equal to 0xab3
    • 0x1F is equal to 0x1f

    Moreover, uppercase and lowercase can be mixed within the same value:

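    • 0xDaF is equal to 0xdAf

  • Binary numbers: the prefix is 0b or 0B. Binary constants became part of the standard only with C23; before that, compilers such as GCC and Clang accept them as an extension.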

    For example, the binary number 1011 is written in C as 0b1011. Below, some examples of binary numbers:

    • 0b0001 (1 in decimal)
    • 0b1010 (10 in decimal)
    • 0b1111 (15 in decimal)

Summarizing:

Definition

Integer numeric constants in C

Integer numeric constants in C can be written in different numeric bases by prepending a prefix that indicates the base used:

  • Decimal: without prefix and digits from 0 to 9;
  • Octal: prefix 0 and digits from 0 to 7;
  • Hexadecimal: prefix 0x or 0X and digits from 0 to 9 and letters from A to F;
  • Binary: prefix 0b or 0B and digits 0 and 1 (standard since C23, a compiler extension before that).

Keep in mind that the base in which a constant is written does not change its value. Whatever the notation, the compiler converts the constant to binary, which is the only form the processor works with.

We can therefore switch freely between notations in our programs, and even mix constants written in different bases within the same expression.

For example, the following code is perfectly valid, provided the compiler accepts binary constants:

int result;

result = 23 + 0x1F + 0b1011 + 012;

In this case, the value of the variable result will be equal to 23 + 31 + 11 + 10 = 75.

Octal, hexadecimal and binary notations are very convenient in programs that manipulate low-level data, for example programs that interface directly with hardware or with the operating system.

For the moment we will omit these notations, focusing on numeric constants written in decimal base. We will use octal, hexadecimal and binary notations later, when we tackle more advanced topics.

Type of numeric constants

We saw in the previous lesson that the C language provides a series of integer data types, such as int, short, long and long long, which differ in the amount of memory they occupy, in whether they are signed or unsigned, and in the range of values they can represent.

What we now ask ourselves is: what is the type of an integer numeric constant written in a C program?

The C standard and, consequently, compilers follow a series of rules to determine the type of an integer numeric constant. These rules differ based on the numeric base in which the constant is written. We start with the rules of C89, the original standard.

If the constant is written in decimal base, the rules are as follows:

  • By default the constant is of type int;
  • If its value is too large to be represented with an int, then it is of type long int;
  • If its value is too large to be represented with a long int, then it is of type unsigned long int, which is the last resort.

So the selection sequence for a decimal constant is as follows:

\texttt{int} \rightarrow \texttt{long int} \rightarrow \texttt{unsigned long int}

If the constant is instead written in binary, octal or hexadecimal base, the rules are slightly different:

  • By default the constant is of type int;
  • If its value is too large to be represented with an int, then it is of type unsigned int;
  • Similarly, if its value is too large to be represented with an unsigned int, then it is of type long int;
  • Finally, if its value is too large to be represented with a long int, then it is of type unsigned long int.

So the selection sequence for a binary, octal or hexadecimal constant is as follows:

\begin{array}{lclc} \texttt{int} && \rightarrow && \texttt{unsigned int} && \rightarrow \\ \texttt{long int} && \rightarrow && \texttt{unsigned long int} \end{array}

Definition

Rules for the type of integer numeric constants in C

When the C compiler encounters an integer numeric constant, it determines its type based on the numeric base in which it is written.

The rules start from a base type and, if the value of the constant is too large to be represented with that type, they move to the next type.

These rules constitute a selection sequence that determines the type of the numeric constant.

If the numeric base is decimal, the selection sequence is:

\texttt{int} \rightarrow \texttt{long int} \rightarrow \texttt{unsigned long int}

If the numeric base is binary, octal or hexadecimal, the selection sequence is:

\begin{array}{lclc} \texttt{int} && \rightarrow && \texttt{unsigned int} && \rightarrow \\ \texttt{long int} && \rightarrow && \texttt{unsigned long int} \end{array}

Forcing the type of a numeric constant

Rather than relying on the compiler, in C it is possible to force the type of a numeric constant by adding a suffix to the numeric value.

The available suffixes are as follows:

  • u or U: makes the constant unsigned. Its type is unsigned int or, if the value is too large for that, unsigned long int;
  • l or L: makes the constant at least long. Its type is long int or, if the value is too large for that, unsigned long int.

These suffixes can be combined with each other to obtain the desired type.

So, for example, we can write the following numeric constants:

  • 123u: constant of type unsigned int;
  • 234U: constant of type unsigned int;
  • 123l: constant of type long int;
  • 234L: constant of type long int;
  • 123ul: constant of type unsigned long int.

The suffixes U and L can also be used with constants written in bases other than decimal.

For example, we can write the following numeric constants:

  • 0x1Ful: hexadecimal constant of type unsigned long int;
  • 0b1011u: binary constant of type unsigned int.

When using these suffixes, the order in which they are written does not matter. For example, the following constants are equivalent:

  • 123ul is equal to 123lu;
  • 0x1FUL is equal to 0x1FLU.

Definition

Forcing the type of a numeric constant in C

In C language, it is possible to force the type of a numeric constant by appending a suffix to the value:

  • u or U: forces the type to unsigned;
  • l or L: forces the type to long.

These suffixes can be combined.

Rules for the type of numeric constants in C99

C99, as we have seen, introduced the additional types long long int and unsigned long long int.

For this reason, the selection sequences for the type of an integer numeric constant are slightly modified.

For a constant written in decimal base, the selection sequence becomes the following. Note that unsigned long int no longer appears: in C99 the list for an unsuffixed decimal constant contains only signed types.

\texttt{int} \rightarrow \texttt{long int} \rightarrow \texttt{long long int}

While, in the case of a constant written in binary, octal or hexadecimal base, the selection sequence is as follows:

\begin{array}{lclc} \texttt{int} && \rightarrow && \texttt{unsigned int} && \rightarrow \\ \texttt{long int} && \rightarrow && \texttt{unsigned long int} && \rightarrow \\ \texttt{long long int} && \rightarrow && \texttt{unsigned long long int} \end{array}

Suffixes LL and ULL in C99

In addition to the suffixes described above for forcing the type of a numeric constant, C99 introduces two new suffixes:

  • LL or ll: forces the type of the constant to long long int. The two letters must have the same case, so lL and Ll are not allowed;
  • ULL or ull: forces the type of the constant to unsigned long long int.

So, for example, we can write the following numeric constants:

  • 123LL: constant of type long long int;
  • 234ULL: constant of type unsigned long long int.

In Summary

This lesson has shown us how to represent integer literal values, or integer constants, in C.

We have seen that these constants can be written in different numeric bases: decimal, octal, hexadecimal and binary. We have learned to write numeric constants in bases different from decimal, using the prefixes 0, 0x and 0b, the last of which is standard only since C23.

We have also discussed the rules that the C compiler follows to determine the type of a numeric constant, depending on the numeric base in which it is written. We have seen that, in C, it is possible to force the type of a numeric constant by appending suffixes to the value.

These suffixes are:

  • u or U: to force the type to unsigned;
  • l or L: to force the type to long;
  • LL or ll: to force the type to long long;
  • ULL or ull: to force the type to unsigned long long.

In the next lesson we will study how to read and write integer values in C, using the scanf and printf functions. In particular we will see how to read and write the various types of integer data, such as int, short, long and long long.