Programs and Processes

In this lesson, we explore the difference between a program and a process from the point of view of a C programmer, and how a program is loaded into memory to become a process.

Difference Between a Program and a Process

In computer science, the terms program and process are often used interchangeably, but in reality they represent two different concepts.

When programming in C, it's important to understand the distinction between the two in order to fully grasp how a program works. This is because, as a low-level language, C gives the programmer direct control over how the process uses its memory.

For this reason, we've dedicated a lesson to this topic before moving on to more advanced subjects.

Program

A program is a set of instructions written in a programming language like C, which are executed by a computer.

When we write C code, we are essentially writing instructions in a language understandable to humans, but not to the computer's processor. That's why we need to compile the source code into an executable program that the processor can run.

Still, even the executable program is nothing more than a set of instructions.

In other words, a program is a passive entity, residing on the computer's disk, waiting to be executed. A program has no state.

Definition

Program

A program is a set of instructions written in a programming language, such as C, potentially executable by a computer.

A program has no state.

Process

A process, on the other hand, is an instance of a program in execution.

Let's break that down:

  • An instance of a program:

    This means that a process is a copy of the program loaded into memory and run by the processor under the control of the operating system.

    A computer cannot directly execute a program that resides on disk. First, the program must be loaded into memory, and only then can it be run by the processor.

    Moreover, we can execute a program multiple times, even simultaneously. Therefore, there can be multiple processes from the same program.

    Think of when you open the same application multiple times on your computer. Each instance is a process.

  • In execution:

    A process is an active entity, which is currently executing the instructions of the program.

    As we said above, each process has its own copy of the running program and thus may be at a different point in execution.

    In other words, a process has a state.

    The state is represented by the current values of the program's variables, i.e., the current memory content, and the current execution point.

That's the core difference: a program is passive, while a process is active and has a state.

Definition

Process

A process is an instance of a program in execution.

A process is an active entity with a state, meaning it has a current memory content and an execution point.

Processes and Memory

A program is a set of instructions, but to be executed and become a process, it must be loaded into memory.

The task of loading a program into memory so it can be executed is handled by the operating system. This procedure is known as program loading and is performed by a component of the OS called the loader.

To study the C language, we don't need to delve deeply into how loading works. Rather, it's crucial to understand what happens after the program is loaded into memory. In fact, the loader does not simply load the instructions into memory.

When a program is loaded, the operating system reserves several memory areas for the process. These areas are called segments.

Segments are memory portions reserved to contain various parts of a program. A process is typically composed of four main segments:

  1. Text (or text segment, .text): contains the program instructions;
  2. Data (or data segment, .data): contains global and static variables (on many systems, those without an explicit initial value are kept in a separate area called .bss);
  3. Stack (or stack segment): contains local variables and function call data;
  4. Heap (or heap segment): contains memory allocated dynamically during program execution.

Schematically, we can represent the memory of a process as shown below:

Picture 1: Memory Segments of a Process

The text segment contains the program instructions, while the data segment contains global and static variables. These two segments are the only ones whose size is fixed before the program runs: it is determined when the source code is compiled and linked, and it does not change during execution.

In contrast, the size of the stack and heap segments is variable and depends on program execution. On most common systems, the stack grows downward (toward lower addresses) and the heap grows upward (toward higher addresses). This means that during execution, these two segments can dynamically grow or shrink.

The purpose of the stack and heap is fundamental, but for now, it's too early to explore them in depth.

We will understand the stack's role when we study functions, and the heap when we discuss dynamic memory allocation.

For now, it is important to know that, in order to be executed, a program must be loaded into memory and become a process, and that a process is composed of four main segments: text, data, stack, and heap.

In Summary

In this lesson we explored the difference between a program and a process.

A program is a set of instructions written in a programming language, while a process is an instance of a program in execution.

We learned that to be executed, a program must be loaded into memory and become a process. A process is composed of four main segments: text, data, stack, and heap.

These concepts are fundamental to understanding how a C program works and, in particular, how variables and memory are managed.

In the next lessons, we'll explore how memory segments work and how local and global variables are managed.