Determining the length of a string in C language

In C, strings are character arrays terminated by the special character \0. To determine the length of a string, that is, the number of characters that compose it, we cannot use the sizeof operator; we must use the library functions declared in string.h.

These functions are strlen, strnlen and strnlen_s. The strlen function returns the length of a string, strnlen returns the length of a string up to an upper limit, and strnlen_s is a variant of strnlen that also handles a null pointer; it was introduced in the optional Annex K of the C11 standard.

In this lesson we will study these functions in detail and analyze a possible implementation.

Function strlen - Length of a string

We know that a string in C language is a sequence of characters that ends with the terminator character \0.

In C we cannot use the sizeof operator to determine the length of a string for two reasons:

  • If applied to a character array, sizeof returns the size of the array, not the length of the string:

    char s[128] = "Hello";
    
    printf("%zu\n", sizeof(s)); // prints "128"
    

    In this case sizeof returns the size of the array s, which is 128 bytes, but the string s contains only 5 characters plus the terminator \0.

  • If applied to a pointer, sizeof returns the size of the pointer, not the length of the string:

    char *s = "Hello";
    
    printf("%zu\n", sizeof(s)); // prints "8" on a 64-bit system
    

    In this case sizeof returns the size of the pointer s, which is 8 bytes on a 64-bit system.

To determine the length of a string, we must use the strlen function, declared in the standard header string.h.

The strlen function accepts a pointer to a string and returns the number of characters in the string, excluding the terminator \0. Its signature is:

size_t strlen(const char *s);

Let's see some examples:

#include <stdio.h>
#include <string.h>

int main() {
    char s[128] = "Hello";
    char *t = "World";
    char *v = "";

    printf("The length of s is %zu\n", strlen(s)); // prints "5"
    printf("The length of t is %zu\n", strlen(t)); // prints "5"
    printf("The length of v is %zu\n", strlen(v)); // prints "0"

    return 0;
}

This example allows us to make some important observations:

  • In the first case, strlen(s), the function returns 5, which is the length of the string s excluding the terminator \0. It does not return 128, which is the size of the array s.
  • In the second case, strlen(t), the function again returns 5, the length of the string t excluding the terminator \0. It does not return 8, which is the size of the pointer t.
  • In the last case, we invoked strlen on an empty string, v, which consists only of the terminator \0. The function returns 0, because no characters precede the terminator.

Definition

Function strlen

The strlen function accepts a pointer to a string and returns the number of characters in the string, excluding the terminator \0.

It is defined in the header file string.h:

#include <string.h>

Its signature is:

size_t strlen(const char *s);

Note

The strlen function is not safe

The strlen function has two serious problems:

  1. It does not verify if the passed pointer is valid. If the pointer is NULL or does not point to a valid string, the behavior is undefined.
  2. It does not verify if the string is actually terminated with the character \0. If the string is not terminated correctly, strlen reads past the end of the array and the behavior is undefined.

Function strnlen - Length of a string with limit

To avoid the second problem of strlen, we can use the strnlen function. This function is not part of the C11 standard. It is specified by POSIX.1-2008, and many compilers, including GCC, Clang and Microsoft Visual C++, provide it as an extension of the standard library.

The strnlen function accepts a pointer to a string and an upper limit and returns the number of characters in the string, excluding the terminator \0, up to the upper limit. Its signature is:

size_t strnlen(const char *s, size_t maxlen);

In particular, if a terminator is found within the first maxlen characters of s, the function returns the length of the string. Otherwise it returns maxlen, and it never examines more than maxlen characters.

Let's see an example:

#include <stdio.h>
#include <string.h>

int main() {
    char s[128] = "Hello, World!";

    /* Prints 13 */
    printf("The length of s is %zu\n", strnlen(s, 128));

    /* Prints 5 */
    printf("The length of s is %zu\n", strnlen(s, 5));

    return 0;
}

In this example, the string s is "Hello, World!", which has a length of 13 characters. However, when we invoke strnlen with different limits, we get different results:

  • When we invoke strnlen(s, 128), the function returns 13, which is the length of the string s excluding the terminator \0.
  • When we invoke strnlen(s, 5), the function returns 5, which is the upper limit passed as an argument.

Definition

Function strnlen

The strnlen function accepts a pointer to a string and an upper limit and returns the number of characters in the string, excluding the terminator \0, up to the upper limit.

It is not part of the C11 standard; it is specified by POSIX and is provided by the libraries of many compilers, including GCC, Clang and Microsoft Visual C++. It is declared in the header file string.h:

#include <string.h>

Its signature is:

size_t strnlen(const char *s, size_t maxlen);

Note

The strnlen function is not standard

The strnlen function is not defined in the C11 standard. It is specified by POSIX.1-2008 and is available as an extension in libraries such as GNU libc and the Microsoft C runtime.

Function strnlen_s - Length of a string with limit and error checking

The C11 standard introduced the strnlen_s function in its Annex K, the optional bounds-checking interfaces. The strnlen_s function accepts a pointer to a string and the upper limit and returns the number of characters in the string, excluding the terminator \0, up to the upper limit. In this it behaves like strnlen.

However, strnlen_s also checks whether the passed pointer is a null pointer, in which case it returns 0. It cannot detect other invalid pointers. Because Annex K is optional, many libraries, GNU libc among them, do not provide strnlen_s.

Its signature is:

size_t strnlen_s(const char *s, size_t maxlen);

Definition

Function strnlen_s

The strnlen_s function, defined in the optional Annex K of the C11 standard, accepts a pointer to a string and an upper limit and returns the number of characters in the string, excluding the terminator \0, up to the upper limit. If s is a null pointer, it returns 0.

It is declared in the header file string.h, provided that the macro __STDC_WANT_LIB_EXT1__ is defined as 1 before the header is included:

#include <string.h>

Its signature is:

size_t strnlen_s(const char *s, size_t maxlen);

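Let's see an example. It compiles only with a library that provides Annex K, and the macro __STDC_WANT_LIB_EXT1__ must be defined before string.h is included:

#define __STDC_WANT_LIB_EXT1__ 1 /* request the Annex K functions */
#include <stdio.h>
#include <string.h>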

int main() {
    char s[128] = "Hello, World!";

    /* Prints 13 */
    printf("The length of s is %zu\n", strnlen_s(s, 128));

    /* Prints 5 */
    printf("The length of s is %zu\n", strnlen_s(s, 5));

    /* Prints 0 */
    printf("The length of s is %zu\n", strnlen_s(NULL, 128));

    return 0;
}

As can be seen from this example, the strnlen_s function behaves like strnlen. However, if the passed pointer is NULL, the function returns 0, as seen in the last case.

Implementation of strlen, strnlen and strnlen_s

For educational purposes, it is interesting to see how we can implement the strlen, strnlen and strnlen_s functions in C language. This will also allow us to see what the security problems are that these functions have.

Let's start with a simple implementation of strlen that we will call my_strlen:

size_t my_strlen(const char *s) {
    const char *p = s;

    while (*p != '\0') {
        p++;
    }

    return p - s;
}

Let's analyze the code:

  • Line 1: We define the my_strlen function that accepts a pointer to a string s and returns the length of the string;
  • Line 2: We define a pointer p and initialize it with the pointer s, that is with the string passed as an argument;
  • Lines 4-6: We iterate until we reach the terminator \0. At each iteration we increment the pointer p;
  • Line 8: We return the difference between the pointer p and the pointer s, which is the length of the string.

In particular, this last point is interesting: the difference between two pointers is the number of elements between the two pointers. In this case, the difference between p and s is the number of characters between p and s, that is the length of the string.

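In the code above, we can rewrite the loop at lines 4-6 in a more concise way:

while (*p) p++;

This loop continues as long as the character pointed to by p is nonzero. The terminator \0 has the value 0, which is evaluated as false and therefore ends the loop. Note that the even shorter form while (*p++); is not equivalent: the post-increment is applied also when the terminator is read, so p ends up one position past \0 and p - s would be the length of the string plus one.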

If we observe the code carefully, we can understand the problems that this function presents:

  1. It does not check if the pointer s is valid. If s is NULL, the behavior is undefined;
  2. It does not check if the string is actually terminated with the character \0. If the string is not terminated correctly, the behavior is undefined. In this case the while loop at line 4 keeps reading past the end of the array until it happens to find a zero byte or the program crashes.

The strnlen function, on the other hand, accepts an upper limit and solves the second problem.

Let's see a possible implementation of strnlen that we will call my_strnlen:

size_t my_strnlen(const char *s, size_t maxlen) {
    const char *p = s;

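    while ((size_t)(p - s) < maxlen && *p != '\0') {
        p++;
    }

    return (size_t)(p - s);
}

The implementation is very similar to that of strlen, with one difference: we must check that we do not exceed the limit maxlen. To do this, we add a condition to the while loop, line 4, which compares the difference between p and s with maxlen before it checks for the terminator \0. The order matters: because the limit is tested first, the function never reads the character at position maxlen, which may lie outside the array.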

In this way, the my_strnlen function returns the length of the string up to the limit maxlen.

This function is certainly safer than strlen, but still presents a problem: it does not check if the pointer s is valid. If s is NULL, the behavior is undefined.

The strnlen_s function also addresses the first problem, at least in part: it checks whether the pointer s is NULL.

Let's try, finally, to implement the strnlen_s function. Let's create a version that we will call my_strnlen_s:

size_t my_strnlen_s(const char *s, size_t maxlen) {
    if (s == NULL) {
        return 0;
    }

    const char *p = s;

    while ((size_t)(p - s) < maxlen && *p != '\0') {
        p++;
    }

    return p - s;
}

In this case, the my_strnlen_s function is identical to my_strnlen, with the addition of an initial condition, lines 2-4, which checks if the pointer s is NULL. If the pointer s is NULL, the function returns 0.

In this way, the my_strnlen_s function is protected against a null pointer and never examines more than maxlen characters. It still cannot detect a pointer that is invalid but not NULL, nor a limit maxlen that is larger than the array.

In Summary

In this lesson we have seen how to determine the length of a string in C language. We have seen that we cannot use the sizeof operator to determine the length of a string and that we must use the strlen function from the standard library string.h.

We have seen that the strlen function is not safe, as it checks neither whether the passed pointer is valid nor whether the string is actually terminated with the character \0.

To avoid these problems, we have seen the strnlen function, which accepts an upper limit and returns the length of the string up to the upper limit. Furthermore, we have seen the strnlen_s function, introduced in the optional Annex K of the C11 standard, which also checks whether the passed pointer is NULL.

Finally, we have seen how to implement the strlen, strnlen and strnlen_s functions in C language and why the first two functions are not safe.