Booting

How do computers boot up? #

When we power up a computer, how does it know where to find an operating system? How does a computer identify that a particular program or executable file is an operating system and decide it will have the privilege to control the system? How does it know how to boot an entire operating system which may consist of more than a single executable file? How does an operating system take control of the system?

Amplification
Predefined address translation and file format
Hierarchical search paths
Privileged mode and interrupt vector

Dror G. Feitelson described the main idea behind booting a system as amplification. It means that we start by executing a very short program that will load a bigger program, which can then load other longer programs. The immediate questions would be – How would we load the first short program? How short does it have to be? How would it know where to look for the subsequent program? What will happen if any of these were to be updated?

The idea that solves the problem of loading the first executable is predefined address translation. We’ll always start executing instructions stored in a particular hardware (e.g., a non-volatile Read-Only Memory) at a predefined physical memory location (e.g., the first physical block of the primary storage device aka boot sector, when loading the boot loader), after we turn on a computer. But a single byte or a block definitely can’t store all information about how to load the entire executable, which can vary in size in different systems or even in different versions of the same system. Since a files system is yet to be initialized, how would the short program know how to load the next executable file correctly?

The above problem can be solved simply by following a predefined file format, where a fixed number of bytes at the beginning will contain the information about the executable, followed by the entry point for the executable instructions. Let’s say, we use the first four bytes to provide information about the executable and the fifth byte contains the first instruction to be executed. We can then execute the instructions in sequence. In case of a computer, the first program (small enough to fit inside the computer’s ROM) can load a boot loader, that will start loading the actual operating system. The boot loader will use the same principles to look for the entry point of the operating system and load it, which will then load another system and the chain will be amplified until the operating systems and it’s services start. Stephen Marz provides an hands-on example of how this is done on a RISC-V machine, for an operating system written in Rust.

Another related idea is hierarchical search paths. What will happen if we have multiple candidates in any stage of the whole process? For example, if the first executable can search for the entry point in multiple storage devices, which one should it look for first and if there are multiple boot loaders present in the system, which one should it pick. The search path is also predefined, but by the user. In case of a computer, if a boot loader is found in the first path, then the search will end immediately. If the boot loader finds multiple operating systems in a computer, it usually let’s the user choose which one to boot.

How can we boot up the computer if the none of the above exist? In early computers, the first short program (i.e., the initial instructions – the opcodes and the arguments) were entered manually using switches of the computer, after turning it on.

The next big ideas are privileged mode and interrupt vector. The boot loader loads the kernel of the operating system in privileged mode(s) of the CPU (the exact name of the mode can be different in different instruction sets). It also loads the interrupt vector that contains all entry points to the operating system in a predefined memory location. Once the kernel of the operating system is loaded, it can be switch to user mode with lower privilege to load other services and programs. It should be noted that the privileged mode(s) and interrupt vectors are CPU features that let a program gain control of the system. Once a program (in this case the operating system kernel) loads the interrupt vector and switch to user mode, request for all privileged operations will be redirected to and executed by the operating system kernel and the latter can always regain control of the system from any program running on the system in user mode. If we want some other program to take control of the system during boot, then we need to overwrite or update the boot loader to point to the new program. If we want a program to override the operating system without rebooting, we would need to do so by finding a security vulnerability in the operating system code that’ll let us overwrite the interrupt vector to point to our preferred memory locations.

An operating system manages computer systems designed based on a specific abstract model of a computer. This abstract model includes an Instruction Set Architecture (ISA) that defines what the CPU can do and how to interact with the CPU. In other words, an operating system is written with a specific computer model or ISA in mind and how we initialize the operating system is dependent on the features of that computer model or ISA.

To summarize, upon turning on a device, we need to start executing instructions from a specific physical address and execute the instructions in sequence. The initial sets of instructions (i.e., the first program) loads a bigger program from another predefined address. All of these goes in a predefined hierarchical order in the privileged mode of the CPU, until the operating system kernel is loaded. Once the operating system takes control of the CPU, it can then switch to less privileged user mode to start different services. ByteByteGo has a nice article describing how Linux operating system boots in a single boot system. For more information about privileged and user mode for RISC-V once can take a look at RISC-V ISA manual. Intel CPUs have multiple privilege modes and complicated privilege switching mechanism due to backward compatibility.

Do other devices (e.g., SSD, GPU, or any complex electronic device) use to same techniques to initialize their internal software (i.e., firmware, operating system or whatever it is)? While booting up the firmware of any hardware inside a computer (e.g., SSD, GPU, network adapter, even the CPU) is simpler, some of above ideas apply. However, most firmware inside these devices may not need to implement complicated interrupt-based privilege escalation/de-escalation similar to the operating system. It is worth investigating whether or not this is true for computational SSDs that can run compute inside the storage device.