16. Disk Layout:Tracking Free Space#
So, we know how to keep track of blocks used in a file, how do we represent blocks that are free?
If you recall the approach used in FAT file system to represent used blocks, the same data structure was also used to represent free blocks. When allocating a block, the file system needed to search through the table until it found an entry marked as free—an extremely inefficient approach.
The original Unix file system used a free list to store a list of unused blocks (see Fig. 16.1). Each block on the list was filled with free disk block numbers, so that a large number of free blocks could be found by reading a single block of the free list into memory. Blocks were allocated from the head of this list for new files, and returned to the head when freed. Implementing this idea results in a large initialization cost. If you have a 1 TB drive with 4 KB blocks, it means you will start off with 256 M blocks on the free list. Each 4 KB block can hold a list of 1024 32-bit free block numbers, so we will need 256 K blocks to hold the initial free list. However, it compresses to a small number of blocks for the free list when the disk becomes full.
The bigger disadvantage of this approach was that, as files were created and deleted this list became randomized, so that blocks allocated for a file were rarely sequential and disk seeks were needed for nearly every block read from or written to disk. Originally this wasn’t a significant problem, because early Unix systems ran on machines with fast disks and excruciatingly slow CPUs.
As computers got faster and users started noticing that seek times were killing performance, file systems started using an allocation bitmap as shown in Fig. 16.2. It keeps a boolean array with one bit for each disk block; if the block is allocated the corresponding bit is set to ‘1’, and cleared to ‘0’ if it is freed. To allocate a block, you read a portion of this bitmap into memory and scan for a ‘0’ bit, changing it to ‘1’ and writing the modified block of the bitmap back to disk. When you extend a file you begin the search at the bit corresponding to the last block in the file; in this way if a sequential block is available it will be allocated. This method results in files being allocated in a mostly sequential fashion, reducing disk seeks and greatly improving performance. Note, when disk is fragmented, each block of the bitmap may have just a few bits available, so this approach becomes expensive to find a free block. However, since a single page of 4096 bytes can represent 132 MB (\(4096*8*4096\)), you can efficiently cache these blocks in memory.