Accessing Individual Bits in C

When working at low level with the C language, it often happens that you need to directly manipulate the individual bits of a variable.

This ability is essential in fields such as graphics (where multiple pixels can be stored in a single byte) or in general when you want to optimize space and performance.

In this lesson we will see how to use bitwise operators, introduced in the previous lesson, to set, clear and test individual bits or bit-fields, concluding with a practical example of encryption via XOR.

Accessing Individual Bits

When we do low-level programming, we often need to store data as individual bits or sets of bits. For example, in graphics programming we may want to pack two (or more) pixels into a single byte. Using bitwise operators, we can extract or modify the data stored in a few bits.

Suppose i is an unsigned short variable with 16 bits. Let's see how to perform some common operations on a single bit of i.

In general, to perform such operations we must prepare so-called bit masks. A bit mask is a binary number that has a 1 in a specific position and 0 elsewhere. Using masks, we can set, clear or test the bits of i.

Bit Setting

Let's say we want to set bit 4 of i. We assume that the leftmost bit, called MSB Most Significant Bit, is numbered 15, while the least significant is 0.

The simplest way to set bit 4 is to perform an OR operation with the value 0x0010, a so-called bit mask or bit-mask that contains a 1 in position 4:

i = 0x0000;    /* i is now 0000000000000000 in binary */
i |= 0x0010;   /* i is now 0000000000010000 in binary */

More generally, if the position of the bit to set is stored in the variable j, we can use the shift operator to create the mask:

i |= (1 << j); /* sets bit j */

Definition

Bit Mask to Set a Bit

The operations needed to set a bit j in a variable i are:

Create a bit mask with a 1 in position j and 0 elsewhere:
```
unsigned int SET_MASK = 1 << j;
```
Use the bitwise OR operator to set bit j:
```
i |= SET_MASK;
```

Bit Clearing

To clear (set to 0) bit 4 of i, we use a mask that has 0 in position 4 and 1 elsewhere:

i = 0x00FF;    /* i is now 0000000011111111 in binary */
i &= ~0x0010;  /* i is now 0000000011101111 in binary */

Following the same idea, we can write a statement that clears the bit at position j:

i &= ~(1 << j); /* clears bit j */

Definition

Bit Mask to Clear a Bit

To clear a bit j in a variable i, we must:

Create a bit mask with a 0 in position j and 1 elsewhere:
```
unsigned int CLEAR_MASK = ~(1 << j);
```
Use the bitwise AND operator to clear bit j:
```
i &= CLEAR_MASK;
```

Bit Testing

The following if statement checks if bit 4 of i is set:

if (i & 0x0010) {
    /* bit 4 of i is set */
}

To test if bit j is set, we will use:

if (i & (1 << j)) {
    /* bit j of i is set */
}

Definition

Bit Mask to Test a Bit

To test if a bit j is set in a variable i, we must:

Create a bit mask with a 1 in position j and 0 elsewhere:
```
unsigned int TEST_MASK = 1 << j;
```

Use the bitwise AND operator to test bit j:

if (i & TEST_MASK) {
    /* bit j of i is set */
}

Often, to work with bits, we assign them names: for example, we say that bits 0, 1 and 2 of a number correspond to the colors blue, green and red. Therefore:

#define BLUE  1  /* bit 0 */
#define GREEN 2  /* bit 1 */
#define RED   4  /* bit 2 */

Setting, clearing and testing BLUE is done like this:

i |= BLUE;    /* sets the BLUE bit */
i &= ~BLUE;   /* clears the BLUE bit */
if (i & BLUE) /* tests the BLUE bit */

It is also simple to set, clear or test multiple bits simultaneously:

i = BLUE | GREEN;  /* sets the BLUE and GREEN bits */
i &= ~(BLUE | GREEN); /* clears the BLUE and GREEN bits */
if (i & (BLUE | GREEN)) /* tests if BLUE or GREEN is set */

Using Bitwise Operators to Access Bit-Fields

Managing a group of consecutive bits (a bit-field) is slightly more complicated than working with individual bits. Here are two common operations:

Modifying a Bit-Field

To modify a block of bits in i, we usually apply a mask with AND (to clear the bits), then an OR operation (to store the new bits).
In the following example, we want to write the binary value 101 into positions 4-6 (starting from 0) of i:

i = i & ~0x0070 | 0x0050; /* stores 101 in bits 4-6 */

& ~0x0070 clears bits 4,5,6 of i.
| 0x0050 sets bits 4,6 leaving bit 5 unchanged.

In a more general case, if j contains the value to store in bits 4-6 of i, we must first shift j (for example j << 4) and then use it with the mask:

i = (i & ~0x0070) | (j << 4);

Definition

Bit Mask to Modify a Bit-Field

To modify a block of consecutive bits in a variable i, we must:

Create a bit mask to clear the bits of the bit-field:
```
unsigned int CLEAR_MASK = ~0x0070;
```
Create a bit mask to set the new bits of the bit-field:
```
unsigned int SET_MASK = j << 4;
```
Use the bitwise AND operator to clear the bits of the bit-field:
```
i = i & CLEAR_MASK;
```
Use the bitwise OR operator to set the new bits of the bit-field:
```
i = i | SET_MASK;
```

Retrieving a Bit-Field

If the bit-field is at the right edge of a number (in the least significant bits), retrieval is trivial. For example, to extract bits 0-2 in i we can write:

j = i & 0x0007; /* extracts bits 0-2 */

If instead the bit-field is not at the right edge of i, we can first shift it to the right and then apply the mask. To extract bits 4-6 of i, we do:

j = (i >> 4) & 0x0007; /* extracts bits 4-6 */

Definition

Bit Mask to Retrieve a Bit-Field

To retrieve a block of consecutive bits in a variable i, we must:

Shift the bits of the bit-field to the right to position them at the right edge:
```
j = i >> 4;
```
Create a bit mask to extract the bit-field:
```
unsigned int EXTRACT_MASK = 0x0007;
```
Use the bitwise AND operator to extract the bit-field:
```
j = j & EXTRACT_MASK;
```

Example: XOR Encryption

One of the simplest ways to encrypt data is to perform an exclusive-or (XOR) operation between each character and a secret key.

Suppose the key is the character &. If we XOR & with the character z, we will get \ (assuming the use of the ASCII character set):

    00100110  (ASCII code for &)
XOR 01111010  (ASCII for z)
    --------
    01011100  (ASCII for \)

To decrypt a message, the same algorithm is applied. Encrypting a second time recovers the original text. For example, XOR of & with \ gives back z.

The following program, xor.c, encrypts a message by performing the XOR operation on each character with the key &.

The original message can be entered by the user. The encrypted message can be displayed on screen or saved to a file (output redirection). To recover the original message, just run the xor program again on the encrypted file.

Here is the source code of xor.c:

/* Performs XOR encryption */
#include <ctype.h>
#include <stdio.h>

#define KEY '&'

int main(void)
{
    int orig_char, new_char;

    while ((orig_char = getchar()) != EOF) {
        new_char = orig_char ^ KEY;
        if (isprint(orig_char) && isprint(new_char))
            putchar(new_char);
        else
            putchar(orig_char);
    }
    return 0;
}

As you can see, the program is very simple. Performing XOR of a character with & can give rise to invisible control characters for some operating systems. To avoid problems in reading and writing these characters, the isprint function checks if the characters involved are printable. If one of the two is not printable, the program keeps the original character.

In Summary

The key concepts seen in this lesson are:

Setting, clearing and testing bits: using OR (|), AND (&) and complement (~) with masks, we can individually control the bits in a variable.
Bit-Field: to manage blocks of consecutive bits, it is necessary to combine shifts (<<, >>) and masks in AND/OR to correctly overwrite and read the data.
XOR Encryption: a concrete example of using bitwise operators is XOR encryption, which allows encrypting and decrypting a text by applying the same algorithm.