23. Memory management before paged virtual memory#
23.1. Simple physical memory management.#
From a hardware perspective, the simplest memory management model is to have processes just use the physical address space; where instructions executed on the processor directly reference physical memory addresses. For example, if the application wants to load a register, from location 0x1234
it would get the data stored at physical memory location 0x1234
. For example, the original Microsoft DOS operating system, supported exactly this model.
As shown in Fig. 23.1, the operating system needs to be loaded into one part of the memory, and each program that you would want to run needs to be loaded into different physical memory at different addresses.
There are two major problems with physical memory management. First, there is no protection between the different programs. A bug in program 1 may cause program 2 to be modified, or even modify the operating system.
Second, programs need to be re-located when they are loaded. If we are going to run multiple programs, we can’t know a-priori the location where the program will be loaded. However, when the program is compiled and linked into object code specific addresses, for example for branches needed to be specified in the program.
For this reason programs are usually linked as relocatable. Relocatable programs have a table of all addresses within the program image that must be changed at load time before it can be run. See Fig. 23.2, where the jump instruction at address 0x0000
needs to be modified by adding the address of where the program is being loaded, i.e., JMP 0x1000
is modified to JMP 0x3000
.
While we don’t need relocation to solve this problem with paged virtual memory, as we will discuss layer, relocation is still used today for shared libraries and for address space randomization.
23.2. Segmentation - Virtual Memory#
Physical memory managment has the 2 problems; there is no isolation and you need to relocate programs to wherever they are loaded. What we really want is virtual memory where every process has an address space that appears to start at address0x0000
and ends at some large address based on underlying hardware design. This way every process is isolated from every other process because it can only see its own virtual memory and every program can be loaded at wherever it was linked to run without relocation.
23.2.1. Segmentation - Base and Bounds translation#
We first looked at direct physical addressing, where no matter which process is executing, the same address (e.g. 0x1000) refers to the same memory location. In addition we reviewed a very simple form of address translation, shown here in Fig. 23.5
where base and bounds registers are used to relocate a section of the virtual address space—the addresses seen by the program, corresponding to values in the CPU registers—to somewhere else in the physical address space. By changing these translations the operating system can create multiple virtual address spaces, one per process; however there is still only one physical address space, uniquely identifying each byte in each memory chip. In this chapter we introduce paged address translation, a more complex address translation mechanism used by most modern CPUs, and present the 32-bit Intel implementation as an example.
23.3. Segmentation(AKA base and bounds translation).#
In the simplest form of segmentation the hardware provides 2 registers that are loaded each time a process acquire the CPU, a base register and a limit register. For a given process the base register contains the physical address that the program was loaded at and the limit register contains the size of the program that was loaded into memory. Every process has a virtual address space starting at zero and a size determined by the actual program size that the process is running. The process specific base and limit registers which are loaded every time a process acquires the CPU establishes the bounds of the virtual address space for every process. For each and every memory reference the hardware adds the virtual address to the base register to determine a physical address and insures that the physical address is between the base register and base register plus the limit register. If it is outside those bounds the program is terminated with an illegal virtual memory reference error. Segmentation solves both the lack of protection and the mandatory relocation requirements of physical addressing. Segmentation has little or no performance overhead because the hardware performs the virtual to physical translation or the addition of the virtual address and the base register to determine every physical address.
23.3.1. Single segments per address space#
So far we discussed a segmentation implementation that provides one base register and one limit register in hardware and one of each of those process specific values that gets loaded into those registers when context switching to a given process. Since there is only one of each register, the entire process virtual address space must be physically contiguous and all text, data and stack must be within that single memory region. While this is a huge improvement over a physical memory model it limits the size of the virtual address space to being static and not expandable. There is no way to dynamically increase the size of the text, data or stack regions of a process at run-time, everything must be allocated in advance. This requires allocating physical memory that might never be used.
Since every process has different base and limit register values segmentation can easily support multiple processes.
23.3.2. Multiple segments per address space.#
As mentioned earlier a single base and limit segment register implies that an entire process virtual address space is a one physically contiguous region of physical memory mapped into one virtually contiguous virtual region of virtual memory. This means that the text, data and stack regions must be packed tightly together in both physical and virtual memory unless we are willing to waste both physical and virtual memory. Also, with only one segment register its not possible to offer different types of protections for the various regions of the virtual address space. In other words all of virtual memory must be readable, writable and executable since data must be both readable and writable and text must be executable. It would be nice to prevent data regions from being executable and text regions from being readable and writable for security and debug optimizations.
This can be achieved by the hardware implementing multiple segment and limit registers with only specified permissions for text, data and stack regions and having the operating system use those registers when context switching to a process. When mapping the text into a virtual region the operating system can specify an execute only region that does not have to be adjacent to other non-executable regions. When mapping data into virtual memory the operating system can specify read/write only thereby preventing execution of data regions. Finally the stack can also be non-executable but also the operating system can move the virtual memory stack region away from any other region making it easier to debug common programming problems like stack overflows. Finally multiple segment registers eliminates the necessity for the text, data and stack regions to be physically contiguous. This allows a program to be split up into multiple smaller regions both physically and virtually making it much easier to hold more programs in physical memory at the same time.
23.3.3. Private versus Global segments.#
Every process virtual address space consists of 2 types of regions, Private and Global. The private regions for a process are the program specific text, data and stack regions that the process is running. The global regions include the operating system that is and must be mapped into every process address space. As we discussed earlier in this course the operating system consists of all the software that executes on behalf of the currently running program as well as basic system overhead that runs on behalf of the system. This includes all the system calls the operating system supports. Since every process must map the operating system kernel it is shared between every running process rather than each process containing a separate copy. The global segment registers are used to map the shared kernel text, data and kernel stack area in every process and at the same virtual addresses. When a context switch occurs only the private segments registers are changed for the newly running process, there is no need to change the global segment registers since they are identical for every process.
23.3.3.1. Fragmentation and Compaction.#
When a new process is created and a program runs the kernel reads the program text, data and stack memory into the available or free physical memory locations. From there the private segment registers are use to map that physical memory into the private virtual address space of the process, allowing the process to run the program. When a process exits, the physical memory regions that the process consumed is made available or freed onto a physical memory free list. Over time as processes are created, run and exit the physical memory becomes more and more fragmented. After a while as processes come and go its likely that the sum of available physical memory is large enough to satisfy a request but there is no physically contiguous free memory region large enough to hold the request.
For example if memory is allocated and de-allocated in chunks of different sizes and at different times, then it can become fragmented so that even if large amounts of memory are free, it will be divided into smaller fragments, separated by longer-lived small allocations, as seen in (#fig:vm:fig2).
Start: 32 locations, all free
Step 1, 2: Load and run processes A and B
Step 3, 4, 5: Run and load processes C, D and E.
Steps 6, 7, 8: processes A, C and D terminate
In the last line, you can see that only 2 units of memory (out of 32) remain allocated, but the largest amount that can be allocated at one time is 10 units. If all allocation requests are small, this might not be a problem; however, in an operating system it is common to have one or two very large processes (e.g., a web browser and word processing software), and many small, long-running processes (e.g., the on-screen battery display or wifi signal strength indicator). In this case, large memory allocations may fail, even when there is enough total memory free, because long-lived small allocations fragment the available contiguous memory into smaller pieces.
When this happens the operating system must move or coalesce the used memory regions together thereby creating a large contiguous available or free region. Even though this is very time consuming and not desirable at least now one or more requests can be satisfied. This is all made possible because the base registers of the processes that map these moved regions can be updated to the new locations of the physical memory and because the operating system has the ability to relocate programs in physical memory as we discussed in the physical memory management model. Again, this is time consuming and undesirable.
23.3.3.2. Swapping.#
Since the entire process must be resident in physical memory to run with segmentation, its unlikely that all processes will be able to run at the same time. In order to support more processes than can fit into memory the memory management system must swap entire processes out to disk when memory is needed and into memory from disk when they need to run again.
The memory management system has determined that process 1 had run long enough so it swaps it out and frees up memory it occupies.
The memory management system has determined that its time for process N to run again so it swaps it into the memory freed by the swapout.
23.3.3.3. Limitations of segmentation.#
Segmentation, especially with multiple segment registers along with private and global segment registers provides a huge benefit over a physical memory management model. We can now support many processes running at the same time with protection between processes and even protection within a process. However Segmentation still has major weaknesses, 1.) the memory for a segment must be physically contiguous, 2.) the virtual address size can never exceed the physical address size with segmentation. Needing physically contiguous memory is a huge problem, it requires compaction and swapping in order to run a significant number of processes simultaneously. Limiting programs to less that the amount of physical memory on a given system is also very limiting. It would be very convenient to be able to allocate a very large sparse region of virtual memory and only actually use a small subset of it. Imagine allocating an array of fixed size records for every possible student at Boston University and indexing that array by the student’s social security number. There are 10^9 or 1 Billion social security numbers but only a few thousand students at BU. Such a large sparse array could not be implemented with segmentation unless the system actually had all the necessary physical memory for every possible social security number. Imagine being able to map millions of files in a virtual address space on a system that didn’t have all that much physical memory.