Copying Strings in C

In C, you cannot use the assignment operator = to copy the content of one string into another string.

For this purpose, two functions have been introduced in the C standard library: strcpy and strncpy.

These two functions are defined in the header file string.h and allow you to copy a string into another string. In this lesson, we will see how these two functions work and how we can use them safely.

Function strcpy - Copying a String

The strcpy function copies a string into another string. The destination string must be large enough to contain the source string.

To use it, you need to include the header file string.h:

#include <string.h>

Its prototype is as follows:

char *strcpy(char *dest, const char *src);

The function copies all characters of src into dest, including the termination character '\0', and returns a pointer to the destination string, that is, dest. For this reason, you must ensure that the destination string has enough space to contain the source string. The two strings must not overlap in memory; if they do, the behavior is undefined.

In particular, if the source string is composed of n characters, the destination string must have at least n + 1 characters to contain the source string and the termination character '\0'.
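The n + 1 rule can be made concrete with a small sketch. The helper name duplicate_string is hypothetical, not a standard function; it assumes that src is a valid, terminated string and that the caller releases the result with free.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of src,
   or NULL if the allocation fails. The caller must free() it. */
char *duplicate_string(const char *src) {
    size_t n = strlen(src);        /* characters, excluding '\0' */
    char *copy = malloc(n + 1);    /* n characters plus the terminator */

    if (copy == NULL) {
        return NULL;
    }

    strcpy(copy, src);             /* fits: copy has exactly n + 1 characters */
    return copy;
}
```

Allocating only n characters would leave no room for the terminator, so strcpy would write one character past the end of the buffer.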

The existence of this function in the standard library compensates for the fact that C has no assignment operator for strings. Arrays cannot be assigned at all, and assigning one char pointer to another does not copy the characters: it only makes the destination pointer refer to the same memory as the source. This means that if you then modify the source string, the change is also visible through the destination pointer.

char s1[] = "Hello";
char *s2 = "World";

s2 = s1; // s2 now points to the same characters as s1

s1[0] = 'J'; // s2[0] is also 'J'

printf("%s %s\n", s1, s2); // prints "Jello Jello"

Moreover, an assignment of this kind is incorrect, because an array cannot appear on the left-hand side of an assignment:

char s1[128];

s1 = "Hello"; // ERROR
Note

The assignment operator = cannot be used to copy strings

Strings in C are arrays of characters, and an array name used in an expression is converted to a pointer to its first element. For this reason the assignment operator = cannot copy the content of one string into another string. Between two pointers, = copies only the address; with an array on the left-hand side, the assignment does not compile.

The following code is incorrect:

/* ERROR */
char s1[128];

s1 = "Hello";

For this reason, in both cases it is necessary to use the strcpy function to copy a string into another string.

char s1[] = "Hello";
char s2[] = "World";

strcpy(s2, s1); // copies the string of s1 into s2

s1[0] = 'J'; // s2[0] remains 'H'

printf("%s %s\n", s1, s2); // prints "Jello Hello"

Note that, in the example above, s2 and s1 are two arrays of the same size: six characters each, including the terminator. Therefore, copying the string s1 into s2 does not cause problems.

char s1[128];

strcpy(s1, "Hello"); // copies the string "Hello" into s1

In most cases, the result of strcpy is discarded. There are cases where, instead, it is useful to use that result. For example, suppose we want to initialize two strings with the same string literal. We can nest two invocations of the strcpy function as follows:

char s1[128];
char s2[128];

strcpy(s2, strcpy(s1, "Hello")); // copies the string "Hello" into s1 and into s2
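Because strcpy returns dest, its result can be passed to any function that expects a string. A minimal sketch, where copy_and_measure is a hypothetical helper and dest is assumed to be large enough:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dest and returns the length
   of the copy. Assumes dest has room for strlen(src) + 1 characters. */
size_t copy_and_measure(char *dest, const char *src) {
    /* strcpy returns dest, so the result goes straight to strlen */
    return strlen(strcpy(dest, src));
}
```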
Definition

Function strcpy

The strcpy function copies a string into another string.

It is defined in the header file string.h:

#include <string.h>

Its signature is as follows:

char *strcpy(char *dest, const char *src);
  • dest is a pointer to the destination string;
  • src is a pointer to the source string.

The function copies the content of src into dest and returns dest. Specifically, the function copies all characters of src into dest, including the termination character '\0'.

When using the strcpy function, you must ensure that:

  • src points to a valid, terminated string;
  • dest has enough space to contain the source string, including the termination character '\0'.
Note

The strcpy function is not safe

The strcpy function does not check if the destination string is large enough to contain the source string. If the destination string is not large enough, the strcpy function writes beyond the bounds of the destination string. This is undefined behavior: it may corrupt adjacent memory, go unnoticed, or end in a segmentation fault. For this reason, it is necessary to ensure that the destination string has enough space to contain the source string.
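Since strcpy performs no check, the caller can perform one before copying. The following is a minimal sketch, not a standard function: copy_if_fits and its dest_size parameter are hypothetical, and dest_size is assumed to be the real capacity of dest.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dest only if it fits.
   dest_size is the total capacity of dest, in characters.
   Returns 0 on success, -1 if src plus '\0' does not fit. */
int copy_if_fits(char *dest, size_t dest_size, const char *src) {
    size_t needed = strlen(src) + 1;   /* characters plus the terminator */

    if (needed > dest_size) {
        return -1;                     /* would overflow: dest is left untouched */
    }

    strcpy(dest, src);
    return 0;
}
```

For an array declared in the same scope, the capacity can be passed as sizeof dest.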

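To limit these problems, the C standard library also provides the strncpy function, which bounds the number of characters written. When the length of the source string is not known in advance, it is advisable to prefer a bounded copy such as strncpy, used with the precautions described in the next section.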

Function strncpy - Copying a String with Control

The strncpy function copies a string into another string, so it is similar to the strcpy function. The difference is that the strncpy function accepts one more parameter, that is, the maximum number of characters to copy.

Its prototype is as follows:

char *strncpy(char *dest, const char *src, size_t n);

The function copies at most n characters of src into dest and returns a pointer to the destination string, that is, dest. If src is shorter than n characters, the function fills the rest of the first n characters of dest with '\0'.

For example:

#define MAX_LEN 128

char s1[] = "Hello World!";
char s2[MAX_LEN];

/* Copies at most 128 characters of s1 into s2 */
strncpy(s2, s1, MAX_LEN);

In this example, s1 holds 12 characters plus the terminator, so it fits in s2: the strncpy function copies the whole string and fills the remaining characters of s2, up to MAX_LEN, with '\0'. If s1 were longer than s2, the strncpy function would copy only the first n characters of s1 into s2, in this case 128, and then stop.

This reveals a security problem of the strncpy function. If the source string has n or more characters, the strncpy function does not write the termination character '\0', and the destination is left without a terminator. A safer way to invoke it is as follows:

#define MAX_LEN 128

char s1[] = "Hello World!";
char s2[MAX_LEN];

/* Copies at most 127 characters of s1 into s2 */
strncpy(s2, s1, MAX_LEN - 1);

/* Adds the string terminator */
s2[MAX_LEN - 1] = '\0';

This way we ensure that there is at least one space for the termination character '\0' which, however, we will have to insert manually. Using this technique, we will always be sure that the string s2 is terminated correctly.
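The same technique can be wrapped in a function that also reports whether the source was cut short. This is only a sketch: copy_terminated is a hypothetical name, and dest_size is assumed to be the real capacity of dest and greater than zero.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dest and always terminates it.
   Returns 1 if src was truncated, 0 otherwise.
   Assumes dest_size > 0. */
int copy_terminated(char *dest, size_t dest_size, const char *src) {
    strncpy(dest, src, dest_size - 1);   /* leave the last position free */
    dest[dest_size - 1] = '\0';          /* terminate unconditionally */

    /* src was cut short if it has more characters than dest can hold */
    return strlen(src) > dest_size - 1;
}
```

Returning the truncation flag lets the caller decide whether a shortened copy is acceptable or should be treated as an error.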

Definition

Function strncpy

The strncpy function copies a string into another string, limiting the number of characters copied.

It is defined in the header file string.h:

#include <string.h>

Its signature is as follows:

char *strncpy(char *dest, const char *src, size_t n);
  • dest is a pointer to the destination string;
  • src is a pointer to the source string;
  • n is the maximum number of characters to copy.

The function copies at most n characters of src into dest and returns dest. If the source string is shorter than n, the function copies all characters of src into dest, including the termination character '\0', and fills the rest of the first n characters of dest with '\0'. If the source string has n or more characters, the function copies only the first n characters of src into dest and does not write a termination character.

When using the strncpy function, you must ensure that:

  • src points to a valid location;
  • dest has enough space to contain at least n characters.
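The padding behavior is easy to overlook, so here is a small sketch that makes it observable. The helper count_padding is hypothetical; it assumes dest has room for at least n characters.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: fills dest with 'x', calls strncpy with n, and
   counts how many of the first n characters are '\0' afterwards.
   Assumes dest has room for at least n characters. */
size_t count_padding(char *dest, const char *src, size_t n) {
    size_t zeros = 0;

    memset(dest, 'x', n);      /* make untouched positions visible */
    strncpy(dest, src, n);

    for (size_t i = 0; i < n; i++) {
        if (dest[i] == '\0') {
            zeros++;
        }
    }

    return zeros;
}
```

With n equal to 8, the source "Hi" yields 6 (one terminator plus five padding characters), while the source "HelloWorld" yields 0, because no terminator is written at all.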
Note

The strncpy function is not safe

The strncpy function does not write the termination character '\0' when the source string has n or more characters. For this reason, it is necessary to ensure that the destination string has enough space to contain at least n characters, and to terminate the result manually unless the source string is known to be shorter than n.
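When it is unclear whether a buffer filled by strncpy is terminated, the standard memchr function can check it. The helper name is_terminated below is hypothetical; size is assumed to be the real capacity of buf.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if buf contains a '\0' within its
   first size characters, 0 otherwise. */
int is_terminated(const char *buf, size_t size) {
    return memchr(buf, '\0', size) != NULL;
}
```

After copying "Hello" into a 5-character array with n equal to 5, this check returns 0; passing such a buffer to printf with %s or to strlen would read past its end.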

Hint

Using strncpy

A safe way to use the strncpy function is as follows:

  • Assuming that the destination string can contain at most n characters (including the terminator '\0'), invoke the strncpy function with n - 1 as the third argument;
  • After invoking strncpy, manually write the terminator '\0' into the last position of the destination string.

#define MAX_LEN 128

char destination[MAX_LEN];

strncpy(destination, source, MAX_LEN - 1);
destination[MAX_LEN - 1] = '\0';
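A different technique, shown here only for comparison, is to use snprintf from stdio.h (available since C99) instead of strncpy. It always terminates the result when the size is greater than zero and does not pad. The wrapper name copy_with_snprintf is hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper built on snprintf: copies src into dest and
   terminates the result when dest_size > 0.
   Returns 1 if src was truncated (or on error), 0 otherwise. */
int copy_with_snprintf(char *dest, size_t dest_size, const char *src) {
    /* snprintf returns the length the complete output would have had */
    int needed = snprintf(dest, dest_size, "%s", src);

    return needed < 0 || (size_t)needed >= dest_size;
}
```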

Implementation of strcpy and strncpy

Although they are library functions, it is very instructive to implement strcpy and strncpy from scratch. Implementing them from scratch allows us to better understand how strings work in C language. It also allows us to understand why these functions are vulnerable and how we can protect ourselves from these problems.

Let's start with a possible implementation of strcpy which we will call my_strcpy:

 1  char *my_strcpy(char *destination, const char *source) {
 2      char *p = destination;
 3
 4      while (*source != '\0') {
 5          *destination = *source;
 6          destination++;
 7          source++;
 8      }
 9
10      *destination = '\0';
11
12      return p;
13  }

Let's analyze the code:

  • Line 2: We save the start of the destination string in a pointer p. This will allow us to return the destination string at the end of the function, after destination itself has been advanced.
  • Lines 4-8: We copy the characters of the source string into the destination string until we reach the termination character '\0'. For this we use a while loop that walks through the characters of source and copies them into destination.

    Note that, at lines 6 and 7, we increment the pointers destination and source to move to the next character.

  • Line 10: We add the termination character '\0' at the end of the destination string, since the loop stops before copying it.

  • Line 12: We return p, the start of the destination string.
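The same logic is often written more compactly by folding the copy into the loop condition. The following variant is a sketch with a hypothetical name, my_strcpy_compact; it behaves like the version analyzed above and makes the same assumptions about its arguments.

```c
/* Compact variant of my_strcpy (hypothetical name). Each assignment
   copies one character, including the final '\0', and the value of
   the assignment decides whether the loop continues. */
char *my_strcpy_compact(char *destination, const char *source) {
    char *p = destination;

    while ((*destination++ = *source++) != '\0') {
        /* empty body: all the work happens in the condition */
    }

    return p;
}
```

Here the terminator is copied by the loop itself, so no separate assignment is needed after it.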

This is, more or less, how strcpy behaves in the C standard library, although real implementations are usually more heavily optimized. If we observe it carefully, we can notice what the security problems of this function are:

  • The first problem is that the function does not check the validity of the pointers. Specifically, it does not check if source and destination point to valid memory locations. If one of the two pointers is not valid, the behavior of the function is undefined.

  • The second problem is that the function does not check if the destination string has enough space to contain the source string. After all, it would have no way to do so: in C, a pointer to a string carries no information about the size of the array it points into. For this reason, if the destination string is not large enough to contain the source string, the strcpy function will write beyond the bounds of the destination string. This is undefined behavior, which may corrupt adjacent memory or end in a segmentation fault.

    Specifically, suppose that destination is a string that can contain at most 10 characters. If source is a string of 19 characters, for example "Hello, how are you?", this happens:

    • Initially, assuming that destination contains only null characters, the destination string is as follows:

      +----+----+----+----+----+----+----+----+----+----+
      | \0 | \0 | \0 | \0 | \0 | \0 | \0 | \0 | \0 | \0 |
      +----+----+----+----+----+----+----+----+----+----+
        ^
        |
        destination

      Both p and destination point to the first location. From here on, p stays fixed while destination advances.

    • The function starts copying the first characters of source into destination. After copying the first 10 characters, the destination string is as follows:

      +----+----+----+----+----+----+----+----+----+----+
      | H  | e  | l  | l  | o  | ,  |    | h  | o  | w  |
      +----+----+----+----+----+----+----+----+----+----+
                                                      ^
                                                      |
                                                      last character written

      The array is now full, and no room is left for the termination character.

    • But, at this point, since the termination character of source has not yet been reached, the function continues to copy the characters of source into the adjacent memory locations:

      +----+----+----+----+----+----+----+----+----+----+    +    +    +
      | H  | e  | l  | l  | o  | ,  |    | h  | o  | w  |    | a  | r  |
      +----+----+----+----+----+----+----+----+----+----+    +    +    +
                                                                    ^
                                                                    |
                                                                    last character written

    • These locations, however, do not belong to the destination string. Writing to them is undefined behavior, so what happens next is not predictable: the program may overwrite unrelated data, appear to work, or terminate with a segmentation fault.
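The size of the overflow in this walkthrough can be computed directly: 19 characters plus the terminator require 20 positions, and the destination has only 10. A minimal sketch, with the hypothetical helper name overflow_amount:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns how many characters strcpy would write
   past the end of a destination of the given capacity (0 if src fits). */
size_t overflow_amount(size_t capacity, const char *src) {
    size_t needed = strlen(src) + 1;   /* characters plus '\0' */

    return needed > capacity ? needed - capacity : 0;
}
```

For a capacity of 10 and the source "Hello, how are you?", the result is 10: nine characters and the terminator land outside the array.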

The strncpy function was introduced in the C standard library to solve the second problem. Let's see an implementation of strncpy which we will call my_strncpy:

 1  char *my_strncpy(char *destination, const char *source, size_t n) {
 2      char *p = destination;
 3
 4      while (n > 0 && *source != '\0') {
 5          *destination = *source;
 6          destination++;
 7          source++;
 8          n--;
 9      }
10
11      while (n > 0) {
12          *destination = '\0';
13          destination++;
14          n--;
15      }
16
17      return p;
18  }

Let's analyze the code:

  • Line 2: We initialize a pointer p to the destination string. This will allow us to return the destination string at the end of the function.
  • Lines 4-9: We copy at most n characters of the source string into the destination string, stopping early if we reach the termination character '\0'. To do this, we use a while loop that walks through the characters of source and copies them into destination. The loop terminates when we have copied n characters or when we reach the termination character '\0'.

    • Note that, at lines 6, 7 and 8, we increment the pointers destination and source to move to the next character and decrement n to keep track of how many characters may still be written.

    • The condition of the while, in fact, checks that there are still characters to copy (n > 0) and that we have not reached the termination character '\0' (*source != '\0').

  • Lines 11-15: At this point, two situations may have occurred:

    • We reached the termination character of source before copying n characters. In this case n is still greater than zero, and the second while loop writes '\0' once for each remaining character. This both terminates the destination string and pads it up to n characters.

    • We copied n characters without reaching the termination character of source. In this case n is zero, the second while loop is not executed, and the destination string is left without a terminator.

  • Line 17: We return the destination string.
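The two loops correspond to two block operations: copy the characters that exist, then pad the rest. The sketch below expresses this with memcpy and memset. The name my_strncpy_blocks is hypothetical, and the function makes the same assumptions as my_strncpy about its arguments.

```c
#include <stddef.h>
#include <string.h>

/* Block-based sketch of strncpy (hypothetical name). */
char *my_strncpy_blocks(char *destination, const char *source, size_t n) {
    size_t len = 0;

    /* Count the characters to copy, never looking past the first n */
    while (len < n && source[len] != '\0') {
        len++;
    }

    memcpy(destination, source, len);           /* first loop: copy */
    memset(destination + len, '\0', n - len);   /* second loop: pad */

    return destination;
}
```

When len equals n, the memset call writes nothing, which is exactly the case where the destination ends up without a terminator.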

This is also, approximately, how strncpy behaves in the C standard library. Although this function bounds the number of characters written, it is not immune from problems. Let's see what they are:

  • The strncpy function also does not check the validity of the pointers. If one of the two pointers is not valid, the behavior of the function is undefined.

  • It does not check the validity of the parameter n. If n is greater than the actual capacity of destination, the function will still write beyond the bounds of the destination string.
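One way to reduce the risk of a wrong n is to derive it from the destination array itself with sizeof, instead of repeating a number by hand. A minimal sketch, in which struct record and set_name are hypothetical names chosen for illustration:

```c
#include <string.h>

/* Hypothetical record with a fixed-size name field. */
struct record {
    char name[16];
};

/* Stores name in r->name, truncating if needed. Deriving n from
   sizeof keeps it in step with the real capacity of the field. */
void set_name(struct record *r, const char *name) {
    strncpy(r->name, name, sizeof r->name - 1);
    r->name[sizeof r->name - 1] = '\0';
}
```

This works only where the array type is visible. Inside a function that receives a plain char pointer, sizeof yields the size of the pointer, so the capacity must be passed as a separate argument.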

In Summary

In C we cannot use the assignment operator, =, to copy the content of one string into another string. This is because, in C, strings are character arrays: arrays cannot be assigned, and assigning one char pointer to another copies only the address, not the characters. To copy the content of one string into another string, we must use the library functions strcpy or strncpy.

  • The strcpy function copies a string into another string. The destination string must be large enough to contain the source string.
  • The strncpy function copies a string into another string, limiting the number of characters copied.

In the next lesson we will see another important function of the C standard library for string manipulation: strlen.