Anatomy of a C Program

Let's look at how a C program is structured: the directives, functions, and statements that make it up.

The General Form of a Program

In this lesson, let's revisit the simple example program we saw in the previous lesson: hello.c.

#include <stdio.h>

int main() {
    printf("Hello, World!\n");
    return 0;
}

We can generalize the structure of this program as follows:

directives

int main() {
    statements
}

Let's break down this structure. The first thing to note is that the program's body, that is, the main function, is enclosed within two curly braces {} which mark its beginning and end.

In C, curly braces { and } are used the same way other languages use keywords like begin and end. This shows how C is a very concise and compact language. It makes extensive use of special symbols and abbreviations to reduce the amount of code required. While this can be seen as a strength, it can also be a weakness, especially for beginners, as it may make the code seem cryptic.

That said, the structure of any C program, even the simplest one, can be broken down into three key elements:

  • directives: commands that modify the source code before actual compilation;
  • functions: blocks of code that perform specific actions, such as the main function;
  • statements: commands executed when the program runs.

C is an imperative and procedural language:

  • Imperative means a program is written as a sequence of statements that are executed in order, one after the other;
  • Procedural means the code is organized into functions, that is, blocks that carry out specific tasks.

Let's now examine these three components in detail.

Directives

As mentioned in the previous lesson, a C source file is preprocessed before being compiled. This means a program called the preprocessor modifies the code before compilation.

The specific commands used by the preprocessor are known as preprocessing directives. We will study them in depth in the chapter on the preprocessor. For now, we'll introduce a key directive: #include.

The example program hello.c starts with the following line:

#include <stdio.h>

This directive tells the preprocessor to insert the contents of the file stdio.h into the program before it is compiled. The file stdio.h contains the declarations needed to use C's standard input/output library.

C compilers provide access to the C standard library. We speak of "the" library even though each platform ships its own implementation, because all implementations offer the same standardized set of functions and data types. This is what makes C programs portable across different systems.

Alongside the library are header files, which contain the necessary information to use the library functions. These files typically have a .h extension and are included in a program via the #include directive.

The reason we add the above line is that C does not have built-in Input/Output functions like some other languages. So, any program that needs to communicate with the outside world must rely on the standard library.

Definition

Preprocessing Directives

Preprocessing directives are commands executed before the actual compilation of the program. These commands begin with the # symbol and are used to modify the source code so it can be compiled.

In general, all C programs that need to read from or write to the console must include the stdio.h header using the directive #include <stdio.h>.

Directives always begin with the hash # to distinguish them from other program statements and never end with a semicolon ;.

That's all you need to know about directives for now. In future lessons, we'll look deeper into preprocessing and other important directives.

Functions

In C, functions are like procedures or subroutines in other programming languages. In other words, they are the fundamental building blocks of programs.

Since C is a procedural language, a program is simply a collection of functions that call each other to perform specific tasks.

Broadly speaking, we can divide functions into two categories:

  • User-defined functions: functions written by the user to perform specific tasks;
  • Standard library functions: functions that belong to the standard C library and are available for use.

The term function comes from mathematics, where a function is a rule for computing a result based on input arguments. For example, math functions include:

f(x, y) = \sqrt{x^2 + y^2}
g(x) = x^2 + 2x + 1

In C, a function is a bit different. It's essentially a group of statements executed in sequence. A C function may or may not return a value.

If a function needs to return a result, the return statement is used to return a value.

We will explore functions in detail in a later chapter.

For now, it's important to understand that although a C program can contain many functions, it must contain a main function.

The main function is special. It represents the entry point of the program: when a C program is run, execution begins at the main function.

Until we reach the chapter on functions, the main function will be the only function in our programs.

Definition

The main Function

The main function is the primary function in a C program. It marks the entry point of the program. Every C program must have a main function.

Note

The name main is essential

C is case-sensitive, meaning it distinguishes between uppercase and lowercase letters. So the function name must be written exactly as main, not Main, MAIN, mAIN, etc.

Since main is a function, it can return a value. If we go back to our example program hello.c, we see that the main function ends with this line:

return 0;

This means the main function returns an integer value—specifically, the number 0. That it returns an integer is indicated by this declaration:

int main() {

In other words, we're telling the compiler that main returns an integer (int). We'll learn more about specifying return types later.

For now, just know that the main function should return an integer.

So why return the number 0?

When the program reaches the return 0; statement, two things happen:

  1. The program ends;
  2. The number 0 is sent to the operating system.

The OS interprets a return value of 0 as an indication that the program ended successfully. A non-zero return value signals an error. This convention began in UNIX and is still followed by most modern systems, including Linux, Windows, and macOS.

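If we omit the return 0; statement, the program still works: since the C99 standard, reaching the closing brace of main has the same effect as return 0;. This exception applies only to main, and writing the statement explicitly remains good practice.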

Statements

A statement is a command that the program executes while running.

Since C is an imperative language, a C program executes all its statements sequentially—one after another in the order they appear.

We'll examine statements in depth in upcoming lessons. For now, let's look again at our hello.c program.

We see that it contains just two statements:

printf("Hello, World!\n");
return 0;

We've already discussed the second—it returns 0 and signals successful termination.

The first is a function call. When we want to use a function for a specific task, we say we call or invoke it. In our case, we're calling printf to print a message to the screen.

A fundamental rule in C is that every statement must end with a semicolon ;. The semicolon tells the compiler the statement has ended. Because the semicolon, not the line break, marks the end of a statement, a single statement can be split over multiple lines.

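Directives are different: they do not end with a semicolon, and each one ends at the end of its line. A directive can continue onto the next line only if the line ends with a backslash \.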

There are some exceptions involving code blocks, which we'll study later.

Definition

Statements

A statement is a command executed by the program during its runtime. Each statement must end with a semicolon ;.

Printing Messages

Let's wrap up this lesson with an introduction to the printf function. We'll go into more detail in a dedicated lesson, but it's important to understand the basics now.

The printf function is used to print messages to the screen. In the hello.c example, it is used like this:

printf("Hello, World!\n");

When executed, printf prints Hello, World! to the screen, without the quotes. The double quotes are not part of the message: they only tell the compiler that the text between them is a string literal.

A string literal in C is a sequence of characters enclosed in double quotes.

An important detail is the \n in the message. By default, printf does not move to a new line after printing. To move to a new line, we must use the \n escape sequence, which represents a newline character.

Escape sequences begin with \ and represent special characters. For example, \n is newline, \t is tab, and so on.

We could write the message over two lines like this:

printf("Hello,\nWorld!\n");

Resulting in:

Hello,
World!

Or split the message into two calls without a line break:

printf("Hello, ");
printf("World!\n");

Which prints:

Hello, World!

That's all you need to know about printf for now. We'll explore it more thoroughly in its own lesson.

In Summary

In this lesson, we covered the general structure of a C program. We saw that a C program is made up of three fundamental components:

  • directives: preprocessing commands executed before compilation;
  • functions: blocks of code that perform specific actions;
  • statements: commands executed when the program runs.

We learned that the main function is the primary function in a C program and that it must return an integer. A return value of 0 indicates that the program ended successfully.

Finally, we saw how to use the printf function to print messages to the screen.

In the next lesson, we'll study another essential element of C programs: comments.