Exec Lists
From MorphOS Library
Grzegorz Kraszewski
Contents
Introduction
A list is the simplest dynamic data structure. An array is even simplier, but it is not dynamic. Adding a single item to an array requires reallocation of memory and copying the whole old contents, similarly for removing. Inserting and removing elements to a list is very fast operation and its computational costs do not increase with number of items. Because of this, list is the basic structure used in exec.library, the MorphOS kernel. One could say the Exec is built on lists. Lists are also used through all the system to manage processes, windows, screens, devices, DOS volumes, fonts, libraries and more. Of course lists are also used by applications to manage dynamic data. Many more sophisticated data structures are built on lists. For all these reasons understanding lists and their MorphOS flavour is essential for every programmer.
Lists may be divided to intrusive and nonintrusive ones. An intrusive list is one requiring that every list item contains a part called a node. The node is then used for linking items into list. A nonintrusive list creates nodes itself. Both kinds have their advantages. Exec lists are intrusive. Why? For a nonintrusive list adding an item means allocating memory for its node. Exec lists are often used at really low levels of the system, like interrupt handling, process scheduler or input/output hardware devices. Calling memory allocator from there is unacceptable. Also error handling would become a nightmare (every memory allocation can fail...). In some parts of the system lists also have to be extremely fast. Memory allocation is a complex operation and can't be expected to finish in a few processor cycles. Of course on higher levels of the system, intrusiveness has more disadvantages than advantages. For example an item of intrusive list cannot be added to more than one list. Then high level components (for example MUI List class) may be nonintrusive ones.
From a Plain List to Exec List
The simplest form of list is a unidirectional one. Node of every item consists only of pointer to the next element. The list is identified by keeping a pointer to its first item. Unidirectional lists however are used only in special cases. While inserting an element at list start is easy, inserting at end requires traversing the whole list. Time of this operation increases linearly with number of items. It is also not possible to traverse the list in the backward direction. Then bidirectional lists are much more common. Exec lists are bidirectional. Every node contains two pointers: one to the next element and one to the previous element. For the first element pointer to the previous one is NULL. For the last one, pointer to the next one is NULL. If a list user keeps pointers to both the first and the last element, she has symmetrical access to the start and the end of the list.
While the idea is still simple, we have to complicate it a bit. As mentioned before, the list user keeps track of it by maintaining two pointers: to the first and the last item. These pointers will be modified every time an item is added at the start or end of the list respectively. But what if a list has multiple users not knowing about each other? Adding an item at the start or at the end would require pointer update for all the users. To solve the problem, two artificial items are introduced: a list head and a list tail. Those two elements do not carry payload, only consist of a node. Their location does not change during the list lifetime so they work as anchors. Inserting an element at list start is in fact inserting it between the list head and the former first element. Similarly element inserted at end is in fact inserted between the former last element and the list tail. As pointers to the head and the tail are constant, they may be shared between multiple users.
Exec lists combine the list head and the list tail into one structure, named list header. As the Exec was designed in days when computer memory was counted in kilobytes rather than gigabytes, designers saw a way to save a few bytes. The "previous" field of list head node always contains NULL. Similarly the "next" field of the list tail always contains NULL. Then they can be merged into one. That is why the list header contains only three pointers instead of four. For the C language the list header is defined as struct List and lists are commonly referenced by a pointer to it.
