Character Input and Output in C

The C standard library provides a set of functions for character input and output on files. Character I/O really means byte-by-byte reading and writing, because each character (char) occupies exactly one byte.

For this reason, character I/O is suitable for manipulating any type of file, both text and binary: it reads and writes data one exact byte at a time.

Key Takeaways
  • The C language standard library provides functions to perform character input and output operations on files.
  • The main character writing functions are fputc(), putc(), and putchar().
  • The main character reading functions are fgetc(), getc(), and getchar().
  • The ungetc() function pushes a character back onto the stream, so that the next read returns it.
  • Character I/O functions work with the int type to correctly handle the special EOF value.
  • Working with characters allows manipulating any type of file, including binary files, as it allows reading and writing exact bytes.

Character Writing Functions

The C language standard library provides several functions for writing characters to files. The main character writing functions are:

int fputc(int c, FILE *stream);
int putc(int c, FILE *stream);
int putchar(int c);

The first thing to notice is that all these functions take an int parameter, int c, rather than a char. The parameter is an int mainly for historical reasons and for symmetry with the reading functions: the value is converted to unsigned char before being written. The return type, however, must be int, because besides the written character these functions can return the special EOF (End Of File) value, which is defined as a negative integer and must remain distinguishable from every valid character.

The fputc() function writes the character specified by the parameter c to the file associated with the stream indicated by the pointer FILE *stream. It returns the written character in case of success, or EOF in case of error.

The putc() function is equivalent to fputc(), except that it may be implemented as a macro, which can make it slightly more efficient in some situations. For the same reason, putc() may evaluate its stream argument more than once, so that argument should not be an expression with side effects. Like fputc(), it writes the specified character to the file associated with the stream and returns the written character, or EOF in case of error.

To use these functions, you can follow this program schema:

#include <stdio.h>

// file is a FILE * opened for writing; ch holds the character to write
putc(ch, file);   // Write ch to the specified file
fputc(ch, file);  // Same effect as putc()

The putchar() function writes the character specified by the parameter c to the standard output (usually the terminal). This function also returns the written character in case of success, or EOF in case of error.

Here is an example of using putchar():

#include <stdio.h>

putchar('A'); // Write the character 'A' to the standard output

In practice, the putchar() function is often implemented as a macro, which can make it more efficient in some situations:

#define putchar(c) putc(c, stdout)

All three functions return EOF in case of error, so it is good practice to check the return value to handle any write errors. In case of success, instead, they return the written character.
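As a minimal sketch of this check, the helper below writes a string one character at a time with fputc() and stops at the first failure. The name write_string() and its return convention are assumptions made for illustration, not part of the standard library.

```c
#include <stdio.h>

/* Hypothetical helper: writes the string s to stream one character at a time.
   Returns 0 on success, or EOF as soon as fputc() reports a write error. */
int write_string(const char *s, FILE *stream) {
    for (; *s != '\0'; s++) {
        /* Convert through unsigned char, the type fputc() actually writes */
        if (fputc((unsigned char)*s, stream) == EOF)
            return EOF;
    }
    return 0;
}
```

A caller could, for example, write `if (write_string("Hi!", file) == EOF)` and report the error instead of silently continuing.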

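The choice between fputc() and putc() depends on the needs of the program. In general, putc() can be slightly more efficient when it is implemented as a macro, while fputc() always evaluates its arguments exactly once and is therefore the safer choice when the stream argument is an expression with side effects.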

The last observation is that these functions write characters without any formatting: they write exactly the byte corresponding to the specified character, without adding spaces, newlines, or other characters. This is why they can be used for both text files and binary files. The one exception is a stream opened in text mode, where some systems translate the newline character; opening the file in binary mode avoids this.
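To illustrate writing raw bytes, here is a short sketch that sends an arbitrary byte sequence, including zero bytes, to a stream with putc(). The helper name write_bytes() is hypothetical, and the stream is assumed to have been opened in binary mode.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes n raw bytes from data to stream with putc().
   Returns the number of bytes actually written, which is less than n
   only if a write error occurred. */
size_t write_bytes(const unsigned char *data, size_t n, FILE *stream) {
    size_t written = 0;

    while (written < n && putc(data[written], stream) != EOF)
        written++;
    return written;
}
```

Because no formatting is applied, values such as 0x00 or 0xFF reach the file unchanged, which a string-oriented function could not guarantee.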

Character Reading Functions

The main character reading functions provided by the C language standard library are:

int fgetc(FILE *stream);
int getc(FILE *stream);
int getchar(void);

The getchar() function reads a character from the standard input (usually the keyboard) and returns the read character as an integer. In case of error or if the end of file is reached, it returns EOF. For example:

#include <stdio.h>

int ch = getchar(); // Read a character from the standard input

The fgetc() and getc() functions are similar to getchar(), but they read a character from the file associated with the stream specified by the pointer FILE *stream. Both functions return the read character as an integer, or EOF in case of error or end of file. Here too, getc() is often implemented as a macro for efficiency reasons. Here is an example of usage:

#include <stdio.h>

int ch;
ch = fgetc(file); // Read a character from the specified file
ch = getc(file);  // Read a character from the specified file

All three functions work in this way:

  1. They read a single byte from the file or from the standard input.
  2. They treat the byte as an unsigned char and promote it to int.
  3. In case of success, they return the integer value of the read character.
  4. In case of error or end of file, they return EOF.

It follows that the only negative value these functions can return is EOF, while every other return value is a valid character, and therefore a valid byte, in the range 0 to UCHAR_MAX (0 to 255 on systems with 8-bit bytes).
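One practical consequence is that any return value other than EOF can be used directly as an array index. The sketch below counts how often each byte value occurs in a stream; the helper name count_bytes() is an assumption made for this example.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: counts how many times each byte value occurs in stream.
   counts must point to an array of UCHAR_MAX + 1 elements. */
void count_bytes(FILE *stream, unsigned long counts[]) {
    int ch;

    for (int i = 0; i <= UCHAR_MAX; i++)
        counts[i] = 0;

    /* Every value other than EOF lies in 0..UCHAR_MAX, so it is a safe index */
    while ((ch = getc(stream)) != EOF)
        counts[ch]++;
}
```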

Like putchar(), getchar() is often implemented as a macro:

#define getchar() getc(stdin)

The three character reading functions behave in the same way in the presence of problems:

  • If the end of file (EOF) is reached, they return EOF and set the end of file flag on the stream.
  • In case of read error, they return EOF and set the error flag on the stream.

As we have seen, to differentiate between these two cases, it is possible to use the feof() and ferror() functions.
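A possible way to tell the two cases apart is sketched below: the stream is read until getc() returns EOF, then ferror() decides whether reading stopped because of an error. The helper name read_to_end() and its return convention are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: reads stream until getc() returns EOF.
   Returns the number of bytes read, or -1 if reading stopped
   because of an error rather than the end of file. */
long read_to_end(FILE *stream) {
    int ch;
    long count = 0;

    while ((ch = getc(stream)) != EOF)
        count++;

    if (ferror(stream))
        return -1;   /* Read error: the error flag is set */
    return count;    /* Otherwise the end of file flag is set */
}
```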

The classic usage of such character reading functions can be represented by the following program schema:

int ch;

while ((ch = fgetc(file)) != EOF) {
    // Process the read character
}

In practice, this loop reads the file one character at a time and processes each character inside the loop body. The loop terminates only when fgetc() returns EOF, which happens at the end of file or on a read error.
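As a concrete instance of this schema, the following sketch counts the newline characters in a file. The function name count_lines() is an assumption for illustration, and a final line without a trailing newline is deliberately not counted.

```c
#include <stdio.h>

/* Hypothetical helper: returns the number of newline characters in file */
long count_lines(FILE *file) {
    int ch;          /* int, so that EOF can be told apart from any character */
    long lines = 0;

    while ((ch = fgetc(file)) != EOF) {
        if (ch == '\n')
            lines++;
    }
    return lines;
}
```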

Note

Always use the int type for read characters

Always declare the variable that receives the return value of the character reading functions as int, not char. EOF is a negative integer that lies outside the range of valid characters, and a char variable cannot hold both every possible byte value and a distinct EOF value. Using int makes it possible to distinguish a valid character from EOF.

Storing the result in a char before comparing it with EOF leads to incorrect results. If char is an unsigned type, EOF is converted to a large positive value (typically 255) when it is stored, so the comparison with EOF is always false and the loop never terminates. If char is a signed type, a valid byte with value 255 typically becomes -1 and is mistaken for EOF, so reading stops too early.
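The difference can be observed with a small experiment. The sketch below defines two hypothetical counting functions: one stores the result of getc() in an int, the other (buggy on purpose) in a signed char. It assumes 8-bit bytes, an EOF value of -1, and a conversion of 255 to signed char that yields -1; this holds on common platforms but is not guaranteed by the standard.

```c
#include <stdio.h>

/* Correct: the result of getc() is stored in an int */
long count_with_int(FILE *stream) {
    int ch;
    long n = 0;

    while ((ch = getc(stream)) != EOF)
        n++;
    return n;
}

/* Buggy on purpose: the result is narrowed to signed char before the test,
   so a byte with value 0xFF can compare equal to EOF */
long count_with_char(FILE *stream) {
    signed char ch;
    long n = 0;

    while ((ch = (signed char)getc(stream)) != EOF)
        n++;
    return n;
}
```

Under the stated assumptions, on a file containing the three bytes 'A', 0xFF, 'B' the first function counts all three bytes, while the second stops at the 0xFF byte and counts only one.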

The ungetc Function

In addition to the character reading functions seen above, there is a related function called ungetc(), which pushes a character back onto the stream so that the next read returns it. A successful call also clears the end of file flag if it was set. Its syntax is as follows:

int ungetc(int c, FILE *stream);

The use of ungetc() is particularly useful when you want to see a character in advance without removing it from the stream, that is, without reading it, so that it can be read again at a later time.
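This "look ahead" idea can be sketched as a small helper that reads one character and immediately pushes it back. The name peek_char() is hypothetical; the standard library offers no function with this name.

```c
#include <stdio.h>

/* Hypothetical helper: returns the next character of stream without
   consuming it, or EOF if no character is available. */
int peek_char(FILE *stream) {
    int ch = getc(stream);

    if (ch != EOF)
        ungetc(ch, stream);  /* Make the character available to the next read */
    return ch;
}
```

Calling peek_char() twice in a row returns the same character both times, and a subsequent getc() still returns that character.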

For example, suppose we want to read a series of digits from a file, but we want to stop reading as soon as we encounter a non-numeric character. In this case, we could read the character, check if it is a digit through the isdigit() function (defined in <ctype.h>), and if it is not, use ungetc() to return it to the stream so that it can be read again later.

Here is an example of using ungetc():

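#include <stdio.h>
#include <ctype.h>

int ch; // int, not char, so that EOF can be detected

// file is a FILE * opened for reading
while ((ch = fgetc(file)) != EOF && isdigit(ch)) {
    // Process the read digit
}
// Return the non-numeric character to the stream, unless the file ended
if (ch != EOF)
    ungetc(ch, file);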
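The same pattern can be wrapped in a complete function. The sketch below accumulates the digits into a number and leaves the first non-digit character in the stream; the name read_number() and its interface are assumptions made for illustration, and overflow is not checked to keep the example short.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: reads consecutive decimal digits from stream and
   stores their value in *value. Returns the number of digits read.
   Overflow is not checked, to keep the sketch short. */
int read_number(FILE *stream, unsigned long *value) {
    int ch;
    int digits = 0;

    *value = 0;
    while ((ch = getc(stream)) != EOF && isdigit(ch)) {
        *value = *value * 10 + (unsigned long)(ch - '0');
        digits++;
    }
    /* Return the first non-digit to the stream so a later read sees it */
    if (ch != EOF)
        ungetc(ch, stream);
    return digits;
}
```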

However, ungetc() has an important limitation: the number of consecutive calls allowed without an intermediate read depends on the implementation of the C standard library. The standard guarantees only one character of pushback. Further calls may fail if the implementation has no room to store more returned characters.

On success, ungetc() returns the character that was pushed back; on failure (for example, if the character cannot be returned to the stream), it returns EOF. Passing EOF as c always fails and leaves the stream unchanged.

There are more advanced functions for positioning within a file, which we will see in a later lesson. Note that a successful call to one of these positioning functions (such as fseek(), fsetpos(), or rewind()) discards all characters returned with ungetc().
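The interaction between ungetc() and the positioning functions can be observed with a short sketch. The helper read_after_pushback() is hypothetical: it pushes a character back, optionally calls rewind(), and then reads the next character.

```c
#include <stdio.h>

/* Hypothetical helper: pushes c back onto stream, optionally rewinds the
   stream, and returns the next character read (or EOF on failure). */
int read_after_pushback(FILE *stream, int c, int do_rewind) {
    if (ungetc(c, stream) == EOF)
        return EOF;
    if (do_rewind)
        rewind(stream);  /* Discards the pushed-back character */
    return getc(stream);
}
```

Without the rewind, the function returns the character that was just pushed back; with the rewind, it returns the first character of the file instead.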

Example: Copying a File

Let's put together what we have seen so far with a simple example program that copies the contents of an input file to an output file, using the character reading and writing functions getc() and putc().

This program takes as input from the command line the name of the source file and the name of the destination file, opens both files, reads the contents of the source file character by character, and writes it to the destination file. Finally, it closes both files.

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    FILE *sourceFile, *destFile;
    int ch;

    // Check the number of arguments
    if (argc != 3) {
        fprintf(stderr, "Usage: %s <source_file> <destination_file>\n", argv[0]);
        return EXIT_FAILURE;
    }

    // Open the source file in read mode
    sourceFile = fopen(argv[1], "rb");
    if (sourceFile == NULL) {
        fprintf(stderr, "Error opening the source file\n");
        return EXIT_FAILURE;
    }

    // Open the destination file in write mode
    destFile = fopen(argv[2], "wb");
    if (destFile == NULL) {
        fprintf(stderr, "Error opening the destination file\n");
        fclose(sourceFile);
        return EXIT_FAILURE;
    }

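    // Copy the contents of the source file to the destination file
    while ((ch = getc(sourceFile)) != EOF) {
        if (putc(ch, destFile) == EOF)
            break; // Write error: stop copying
    }

    // getc() returns EOF for both end of file and read errors, so check the flags
    int failed = ferror(sourceFile) || ferror(destFile);

    // Close both files; fclose() can also report a pending write error
    fclose(sourceFile);
    if (fclose(destFile) == EOF)
        failed = 1;

    if (failed) {
        fprintf(stderr, "Error copying the file\n");
        return EXIT_FAILURE;
    }

    return EXIT_SUCCESS;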
}

An important observation concerns the use of file opening modes: in this example, we used "rb" to open the source file in binary read mode and "wb" to open the destination file in binary write mode. In this way, although we are using character I/O functions, we can use the program to copy any type of file, including binary files, without risking altering their contents due to line ending conversions or other problems related to text mode.

The way to use the program is as follows:

./copy_file source_file destination_file