Temporary Files in C

Key Takeaways
  • Temporary files are files created to store intermediate data during program execution and are not intended to be kept long-term.
  • The tmpfile() function creates a temporary file that is automatically deleted upon closing or program termination.
  • The tmpnam() function only generates a file name that did not refer to an existing file at the time of the call; it does not create or open the file.
  • Temporary files created with tmpnam() and fopen() are not automatically deleted, so it is the programmer's responsibility to delete them when they are no longer needed.

Temporary Files

Real-world applications and programs often need to create temporary files to store intermediate data during execution.

The main characteristic of these files is that they are not intended to be kept long-term: they are created for temporary use and deleted once they are no longer needed, normally before the program terminates.

This need can arise in various scenarios, such as:

  1. When processing large amounts of data that cannot be kept entirely in memory.

    A computer's RAM is limited, and for operations involving very large datasets, it may be necessary to temporarily write data to disk to avoid exhausting memory.

  2. When performing operations that require multiple processing stages.

    In these cases, temporary files can be used to store intermediate results between different processing stages.

  3. When wanting to isolate temporary data from permanent data.

Classic examples of temporary files are the swap files used by operating systems to extend virtual memory and the intermediate files that compilers such as gcc generate while turning source code into object code and finally into an executable.

Once such programs terminate their execution, temporary files are deleted to free up disk space and keep the system clean. There is, in fact, no reason to keep these files after the program has completed its task.

The C standard library provides two functions for managing temporary files, tmpfile() and tmpnam(), which we will look at in detail.

The tmpfile Function

The tmpfile() function creates a temporary file that exists until it is closed or until the program terminates. It opens the file in binary read and write mode (as if with the "wb+" mode of fopen()) and returns a pointer to an object of type FILE that represents the temporary file. It is declared in the <stdio.h> header file.

Its syntax is as follows:

#include <stdio.h>

FILE *tmpfile(void);

It can, for example, be used in the following way:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    FILE *temp = tmpfile();
    if (temp == NULL) {
        fprintf(stderr, "Error in creating temporary file.\n");
        return EXIT_FAILURE;
    }

    // Use of temporary file

    // Close the temporary file
    fclose(temp);
    return EXIT_SUCCESS;
}

Note that in case of failure, tmpfile() returns a null pointer (NULL), so it is good practice to check the returned value before using the temporary file.

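Furthermore, the temporary file created by tmpfile() is deleted automatically when it is closed with fclose() or when the program terminates normally, so there is no need to delete it manually. If the program terminates abnormally, for example through a call to abort(), whether the file is removed is implementation-defined.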
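To make the "Use of temporary file" step more concrete, here is a minimal sketch of a two-stage computation that parks its intermediate data in a file created by tmpfile(). The helper name sum_via_tmpfile and its parameters are our own invention for illustration; only tmpfile(), fwrite(), rewind(), fread() and fclose() come from the standard library.

```c
#include <stdio.h>

/* Hypothetical helper: writes `count` integers to a temporary file,
   rewinds it, reads them back and stores their sum in *sum.
   Returns 0 on success, -1 on any failure. */
int sum_via_tmpfile(const int *values, size_t count, long *sum) {
    FILE *temp = tmpfile();
    if (temp == NULL) {
        return -1;
    }

    /* Stage 1: store the intermediate data in binary form. */
    if (fwrite(values, sizeof values[0], count, temp) != count) {
        fclose(temp);
        return -1;
    }

    /* Go back to the start before switching from writing to reading. */
    rewind(temp);

    /* Stage 2: read the data back and accumulate it. */
    long total = 0;
    size_t read_count = 0;
    int value;
    while (fread(&value, sizeof value, 1, temp) == 1) {
        total += value;
        read_count++;
    }

    /* Closing the stream also removes the temporary file. */
    fclose(temp);

    if (read_count != count) {
        return -1;
    }
    *sum = total;
    return 0;
}
```

Called from main() with an array holding 1, 2, 3 and 4, the helper should return 0 and leave 10 in the variable passed as sum. Note the call to rewind(): on a stream opened for update, a positioning call (or fflush()) is required between a write and a subsequent read.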

The tmpfile() function is particularly useful when you need a temporary file to store intermediate data without having to manually manage the creation and deletion of the file on the filesystem. However, it has two problems:

  1. It does not let you choose the name of the temporary file, which is generated automatically, and it provides no way to find out what that name is.
  2. The temporary file cannot persist after it is closed or after the program terminates. In some cases you may want to keep the file for later use; this is not possible with tmpfile(), since the file is deleted automatically.

If these two limitations are a problem, one might think of using the fopen() function to create a temporary file with a name of our own choosing. However, this approach is risky: the name could conflict with an existing file or open the door to security problems.

In fact, if a file with the same name already exists, fopen would overwrite it without warning, causing the loss of original data. Furthermore, a malicious user could create a file with the same name in advance, leading to security vulnerabilities.

For this reason, it is better not to pick the name of a temporary file by hand. The standard library can generate a suitable name for us through the tmpnam() function, which we will see now.
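The overwrite risk of a plain fopen() call can be reduced on recent compilers. The following is a minimal sketch that assumes a C11 (or later) library, where a trailing x in a "w" mode string asks fopen() to fail if the file already exists instead of truncating it. The helper name create_exclusive is hypothetical and not part of the standard library.

```c
#include <stdio.h>

/* Hypothetical helper: creates `name` for binary reading and writing,
   but only if no file with that name exists yet.
   Returns NULL if the file already exists or cannot be created. */
FILE *create_exclusive(const char *name) {
    /* "wb+x" behaves like "wb+", except that the C11 x flag makes
       fopen() fail rather than overwrite an existing file. */
    return fopen(name, "wb+x");
}
```

With this helper, a first call for a given name creates the file, while a second call with the same name returns NULL as long as the file is still there, so existing data is never silently destroyed. Older libraries that predate C11 may not recognize the x flag.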

The tmpnam Function

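The tmpnam() function, also declared in the <stdio.h> header file, does not actually create or open a temporary file. It simply generates a file name that does not correspond to any existing file at the moment of the call.

In other words, tmpnam() provides a way to obtain a file name that is unlikely to conflict with other files in the filesystem. Keep in mind, however, that this holds only at the moment tmpnam() returns: another program could create a file with the same name before we open ours. For code where this matters, tmpfile() remains the safer choice.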

Its syntax is as follows:

#include <stdio.h>

char *tmpnam(char *str);

The signature of the tmpnam function is a bit strange: it accepts as a parameter a pointer to a character string (char *str) and returns a pointer to a character string. The reason is simple:

  1. If the input parameter str is NULL, the tmpnam() function uses an internal buffer to store the generated temporary file name and returns a pointer to this buffer.
  2. If the input parameter str is not NULL, the tmpnam() function writes the generated temporary file name into the buffer provided by the user (i.e., in str) and returns a pointer to this buffer.

With a NULL argument, the call looks like this, and the returned pointer refers to the internal buffer:

char *file_name = tmpnam(NULL);

This approach is convenient, but the internal buffer may be overwritten by every subsequent call to tmpnam(). If you want to preserve the generated name, or if you plan to call tmpnam() multiple times, it is better to provide your own buffer:

char buffer[L_tmpnam];
char *file_name = tmpnam(buffer);

Note that L_tmpnam is a macro defined in <stdio.h> that gives the size of a char array large enough to hold any name generated by tmpnam(), including the terminating null character. By using this macro, we ensure that the provided buffer is large enough to contain the temporary file name.
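The two calling styles can be combined in a short sketch. The helper name make_two_names is hypothetical; it fills one caller-owned buffer directly and copies the result of a NULL call out of the internal buffer before that buffer can be reused.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: generates two temporary file names and stores
   each one in a buffer owned by the caller.
   Returns 0 on success, -1 if tmpnam() fails. */
int make_two_names(char first[L_tmpnam], char second[L_tmpnam]) {
    /* Own buffer: tmpnam() writes the name straight into `first`. */
    if (tmpnam(first) == NULL) {
        return -1;
    }

    /* Internal buffer: copy the name out right away, because a later
       call to tmpnam() may overwrite it. The copy fits, since both
       buffers hold L_tmpnam characters. */
    const char *internal = tmpnam(NULL);
    if (internal == NULL) {
        return -1;
    }
    strcpy(second, internal);
    return 0;
}
```

After a successful call, the two buffers hold two different names, since tmpnam() generates a different string each time it is called. Some toolchains print a warning when tmpnam() is used; the sketch still illustrates how the buffers behave.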

Let's revisit the earlier example and see how to use tmpnam() together with fopen() to create a temporary file:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    // Generate the temporary file name
    char file_name[L_tmpnam];
    if (tmpnam(file_name) == NULL) {
        fprintf(stderr, "Error in generating temporary file name.\n");
        return EXIT_FAILURE;
    }

    // Create and open the temporary file
    FILE *temp = fopen(file_name, "wb+");
    if (temp == NULL) {
        fprintf(stderr, "Error in creating temporary file.\n");
        return EXIT_FAILURE;
    }

    // Use of temporary file

    // Close the temporary file
    fclose(temp);

    // Delete the temporary file
    remove(file_name);
    return EXIT_SUCCESS;
}

In this example, we use tmpnam() to generate a temporary file name that is not currently in use, then we use fopen() to create and open the temporary file. After using the file, we close it with fclose() and delete it manually with the remove() function. We have not yet studied remove(): it is declared in <stdio.h> and deletes a file from the filesystem. We will see it in detail in a subsequent lesson.

Note that, unlike tmpfile(), the temporary file created with tmpnam() and fopen() is not automatically deleted upon closing or program termination, so it is our responsibility to delete it when it is no longer needed.

Finally, there is one more caveat regarding tmpnam(): the number of unique file names it can generate is limited. The TMP_MAX macro, also defined in <stdio.h>, gives the number of different names that tmpnam() is guaranteed to be able to generate. Beyond this limit, tmpnam() may fail and return a null pointer (NULL), indicating that it can no longer generate a suitable name. In most practical applications, however, this limit is almost never reached. For example, on a typical Linux system using glibc, TMP_MAX is defined as 238328, which means you can generate at least 238328 unique file names.
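To check the limits on a given system, a small sketch like the following can help. The function name print_tmpnam_limits is our own; the values it prints depend on the implementation, so no particular output should be expected.

```c
#include <stdio.h>

/* Hypothetical helper: prints the two tmpnam()-related limits
   defined in <stdio.h> for the current implementation. */
void print_tmpnam_limits(void) {
    /* Size of a buffer large enough for any name tmpnam() generates. */
    printf("L_tmpnam = %ld\n", (long)L_tmpnam);

    /* Number of unique names tmpnam() is guaranteed to generate. */
    printf("TMP_MAX  = %ld\n", (long)TMP_MAX);
}
```

The C standard requires TMP_MAX to be at least 25, so even the most restrictive conforming implementation allows a few dozen names.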