Input and Output on Strings in C

The C standard library provides input and output functions that operate on character strings in memory rather than on files, treating those strings as if they were streams.

The sprintf and snprintf functions allow writing formatted data into a character string in memory, in a similar way to what the printf function does for console output and the fprintf function does for file output.

Similarly, the sscanf function allows reading formatted data from a character string in memory, in a similar way to what the scanf function does for console input and the fscanf function does for file input.

The advantage of these functions is that they give access to the full formatting machinery of the printf and scanf families, including format specifiers, flags, field width and precision, without writing to a file or to the console. This is particularly useful for building complex strings in memory or for parsing input strings without interacting with external devices.

Key Takeaways
  • The sprintf and snprintf functions allow writing formatted data to character strings in memory.
  • The sscanf function allows reading formatted data from character strings in memory.
  • These functions are useful for building or parsing complex strings without having to interact with files or the console.
  • snprintf is a safer version of sprintf, as it prevents buffer overflows by specifying the maximum size of the destination buffer.

String Output Functions: sprintf and snprintf

The sprintf function is very similar to the printf function, which prints to standard output, and to the fprintf function which instead prints to a file or stream. The difference is that the function's output is saved to a string, that is, an array of characters in memory, passed as the first argument to the function.

Its declaration is as follows:

int sprintf(char *stringa, const char *formato, ...);

As you can see, the first argument stringa is a pointer to the character string where the formatted output will be saved. The second argument formato is a format string that specifies how the subsequent variable arguments should be formatted and inserted into the output string.

Like the other functions of the printf family, the declaration of sprintf ends with three dots ..., which indicate that the function can accept a variable number of arguments, depending on the format specifiers present in the formato string.

For example, to create a string that contains a formatted date, we can use sprintf as follows:

int giorno = 15;
int mese = 8;
int anno = 2023;
char data[20];

sprintf(data, "Data: %02d/%02d/%04d", giorno, mese, anno);
printf("%s\n", data); // Output: Data: 15/08/2023

In this example, the sprintf function formats the date using the format specifiers %02d for the day and month (with two digits and leading zeros) and %04d for the year (with four digits). The formatted output is saved in the data string.

The sprintf function also performs two other important operations:

  1. It automatically adds the string termination character \0 at the end of the output string.
  2. It returns the number of characters actually written to the output string, excluding the termination character \0.

If an error occurs, for example an encoding problem, the function returns a negative value.
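The return value and the automatic terminator can be combined to build a string in several steps. The following is a minimal sketch, not a definitive implementation: the helper name format_timestamp and the parameters ora and minuto are hypothetical and introduced only for illustration. It assumes the caller passes a buffer of at least 17 bytes and values that fit the field widths, because sprintf itself cannot check either condition.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "DD/MM/YYYY HH:MM" into buf in two steps.
 * Assumption: buf has room for at least 17 bytes and every value fits
 * its field width; sprintf performs no size check of its own.
 * Returns the total number of characters written (terminator excluded),
 * or a negative value if either call fails. */
int format_timestamp(char *buf, int giorno, int mese, int anno,
                     int ora, int minuto)
{
    /* First call: n is the number of characters written, without \0. */
    int n = sprintf(buf, "%02d/%02d/%04d", giorno, mese, anno);
    if (n < 0)
        return n;

    /* buf + n points at the \0 left by the first call, so the second
     * call overwrites it and appends its own terminator. */
    int m = sprintf(buf + n, " %02d:%02d", ora, minuto);
    if (m < 0)
        return m;

    return n + m;
}
```

Calling format_timestamp(buf, 15, 8, 2023, 9, 5) with a 32-byte buf should leave the text 15/08/2023 09:05 in the buffer and return 16, which is also what strlen(buf) reports.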

sprintf can be used in various scenarios. For example, you might want to first build a complex string in memory before sending it to a function that stores it in a file or sends it over a network connection. Additionally, it can be useful in converting numerical data to strings for display or logging purposes.

However, it is important to note that sprintf does not perform any checks on the size of the destination buffer. If the formatted string does not fit, sprintf writes past the end of the buffer. This buffer overflow is undefined behavior and can lead to crashes, corrupted data or security vulnerabilities. For this reason, the use of sprintf is discouraged in favor of snprintf, which allows specifying the maximum size of the destination buffer.

The snprintf function has a declaration similar to that of sprintf, but with an additional argument that specifies the maximum size of the destination buffer:

int snprintf(char *stringa, size_t dimensione, const char *formato, ...);

It is essentially a safer version of sprintf: it writes at most dimensione - 1 characters to the output string and reserves the last position for the \0 terminator. As long as dimensione matches the real size of the buffer, the output never extends past its end.
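The size limit can be observed directly with a small sketch. The helper name copy_bounded is hypothetical; it simply forwards a string to snprintf and reports how long the stored result is.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies src into dst through snprintf and returns
 * the length of the text actually stored in dst.
 * Assumption: dimensione is the real size of dst in bytes. */
size_t copy_bounded(char *dst, size_t dimensione, const char *src)
{
    if (dimensione == 0)
        return 0; /* With size 0 snprintf stores nothing, not even \0. */

    /* At most dimensione - 1 characters are stored, followed by \0. */
    snprintf(dst, dimensione, "%s", src);
    return strlen(dst);
}
```

With an 8-byte destination and the source text Data: 15/08/2023, the buffer should end up holding the 7 characters Data: 1 followed by the terminator.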

The snprintf function also returns a negative value in case of error. In case of success, however, it behaves slightly differently than sprintf: it returns the number of characters that would have been written if the buffer had been large enough, excluding the termination character \0. A returned value greater than or equal to dimensione therefore means that the output was truncated, which lets the programmer detect that the buffer was too small and act accordingly.

For example, here is how to use snprintf to create a formatted string safely:

int giorno = 15;
int mese = 8;
int anno = 2023;
char data[10]; // Too small: the full text needs 16 characters plus \0

int n = snprintf(data, sizeof(data), "Data: %02d/%02d/%04d", giorno, mese, anno);
if (n < 0) {
    // Error handling
} else if ((size_t)n >= sizeof(data)) {
    // Output was truncated: this branch runs here, since n is 16
    printf("Buffer too small, output truncated.\n");
} else {
    printf("%s\n", data); // Reached only if data has room for at least 17 bytes
}
}
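Because snprintf reports the length the full output would have had, it can also be used to size a buffer before writing into it. Since C99, calling it with a size of 0 stores nothing (the destination may then be a null pointer) and only returns the required length. The sketch below relies on that behavior; the helper name date_string and the macro DATE_FORMAT are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical format macro, kept as a literal so the compiler can
 * check the arguments against the format specifiers. */
#define DATE_FORMAT "Data: %02d/%02d/%04d"

/* Hypothetical helper: returns a heap-allocated formatted date, or NULL
 * on failure. The caller is responsible for calling free() on it. */
char *date_string(int giorno, int mese, int anno)
{
    /* First pass: size 0 stores nothing and returns the needed length. */
    int n = snprintf(NULL, 0, DATE_FORMAT, giorno, mese, anno);
    if (n < 0)
        return NULL;

    size_t dimensione = (size_t)n + 1; /* one extra byte for \0 */
    char *stringa = malloc(dimensione);
    if (stringa == NULL)
        return NULL;

    /* Second pass: the buffer is now exactly large enough. */
    int m = snprintf(stringa, dimensione, DATE_FORMAT, giorno, mese, anno);
    if (m < 0 || (size_t)m >= dimensione) {
        free(stringa);
        return NULL;
    }
    return stringa;
}
```

The trade-off of this approach is that the formatting work is done twice. For short strings a fixed buffer with a truncation check is usually simpler.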

String Input Function: sscanf

The sscanf function is similar to the scanf function, which reads from standard input, and to the fscanf function, which reads from a file or stream. The difference is that sscanf reads data from a character string in memory, passed as the first argument to the function.

Its declaration is as follows:

int sscanf(const char *stringa, const char *formato, ...);

As you can see, the first argument stringa is a pointer to the character string from which the formatted data will be read. The second argument formato is a format string that specifies how the input string should be interpreted. The remaining arguments are pointers to the variables that receive the values read.

Here too, the declaration of sscanf ends with three dots ..., which indicate that the function can accept a variable number of arguments, depending on the format specifiers present in the formato string.

The sscanf function is particularly useful for extracting data from a string that has been previously built or received, for example from a text file or from a network connection.

A common pattern is to read a line of text from a file with the fgets function and then use sscanf to extract numerical values from that line.

Suppose we want to read numerical data from a CSV (Comma-Separated Values) file and extract the values into separate variables.

The CSV text file might contain lines like the following:

123,45.67
214,89.01

That is, an integer followed by a comma and a floating-point number.

We could combine fgets and sscanf to read each line of the file and extract the values as follows:

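char riga[100];
int id;
float valore;

// file is assumed to be a FILE * already opened for reading
if (fgets(riga, sizeof(riga), file) != NULL) { // Reads a line from the file
    if (sscanf(riga, "%d,%f", &id, &valore) == 2) { // Extracts values from the line
        // Both values were read successfully
    } else {
        // Malformed line
    }
}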

At first glance, one might think that using fgets and sscanf in this way is useless, since there is the fscanf function that allows reading directly from a file in a formatted way. The advantage of this combination, however, is that it allows reading an entire line into memory and then analyzing it as many times as desired, without having to reread from the file. This can be useful in situations where you want to perform multiple processing passes on the read data, or when you want better error handling in parsing.
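As a sketch of what better error handling can look like once the whole line is in memory, the function below accepts a line only if both values are present and nothing unexpected follows them. The helper name parse_record and the variable consumati are hypothetical. The %n directive stores the number of characters consumed so far and is not counted in the value returned by sscanf.

```c
#include <stdio.h>

/* Hypothetical helper: parses one "id,valore" line.
 * Returns 1 if the line contains exactly an integer, a comma and a
 * floating-point number (optionally followed by whitespace such as the
 * newline kept by fgets), 0 otherwise. */
int parse_record(const char *riga, int *id, float *valore)
{
    int consumati = 0;

    /* The space before %n skips trailing whitespace, then %n records
     * how many characters of riga have been consumed. */
    if (sscanf(riga, "%d,%f %n", id, valore, &consumati) != 2)
        return 0;

    /* Anything left after the consumed part is trailing garbage. */
    return riga[consumati] == '\0';
}
```

A line such as 123,45.67 followed by a newline should be accepted, while 123;45.67 or 123,45.67abc should be rejected. Doing the same with fscanf would leave the file position somewhere in the middle of the bad line.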

Let's think of an example. Suppose we want to read dates from a text file. However, the file can contain dates in different formats, for example DD/MM/YYYY or DD-MM-YYYY. Using fgets and sscanf, we can read each line of the file and then attempt to parse the date in both formats:

char riga[100];
int giorno, mese, anno;

if (fgets(riga, sizeof(riga), file) == NULL) { // Reads a line from the file
    // End of file or read error
} else if (sscanf(riga, "%d/%d/%d", &giorno, &mese, &anno) == 3 ||
           sscanf(riga, "%d-%d-%d", &giorno, &mese, &anno) == 3) {
    // Date read successfully
} else {
    // Date parsing error
}

This would have been much harder using fscanf directly, since a failed attempt still consumes the characters it managed to match and advances the current file position. To retry the same line with fscanf, it would have been necessary to use the fseek function to reposition the file cursor at the beginning of the line, considerably complicating the code.
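The same idea can be packaged as a reusable function. This is a minimal sketch under stated assumptions: the helper name parse_date is hypothetical, and the range check on day and month is an addition for illustration that does not validate the number of days in each month. Local temporaries are used because a failed attempt may already have stored some values before stopping.

```c
#include <stdio.h>

/* Hypothetical helper: parses a date written as DD/MM/YYYY or
 * DD-MM-YYYY. Returns 1 and fills the output parameters on success,
 * returns 0 and leaves them untouched otherwise. */
int parse_date(const char *riga, int *giorno, int *mese, int *anno)
{
    int g = 0, m = 0, a = 0;

    /* Both attempts scan the same string from its beginning. */
    if (sscanf(riga, "%d/%d/%d", &g, &m, &a) != 3 &&
        sscanf(riga, "%d-%d-%d", &g, &m, &a) != 3)
        return 0;

    /* Basic sanity check only, added for illustration. */
    if (g < 1 || g > 31 || m < 1 || m > 12)
        return 0;

    *giorno = g;
    *mese = m;
    *anno = a;
    return 1;
}
```

Supporting a third format only requires one more sscanf attempt on the same string.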

Like the scanf function, sscanf returns the number of elements successfully read and assigned. For sscanf, reaching the end of the string plays the role of end of file: if it happens before the first element has been read, for example because the string is empty or contains only whitespace, the function returns EOF. If instead a formatting mismatch stops the reading partway, the function returns the number of elements assigned up to that point, which may be zero.
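The different return values can be observed with a small sketch. The helper name count_ints is hypothetical; it tries to read two integers and simply passes on what sscanf returned.

```c
#include <stdio.h>

/* Hypothetical helper: attempts to read two integers from stringa and
 * returns the value reported by sscanf. */
int count_ints(const char *stringa)
{
    int a, b;
    return sscanf(stringa, "%d %d", &a, &b);
}
```

Under these assumptions, count_ints("10 20") should return 2, count_ints("10 abc") should return 1 because the second conversion fails, count_ints("abc") should return 0, and count_ints("") should return EOF because the string ends before any element is read. Comparing the result with the expected number of elements, rather than testing only for EOF, covers all of these cases at once.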