At more than 150 pages, this is a long dissertation, so I will summarize the important and interesting ideas from what I have read so far.

This dissertation was written in the same timeframe as the Mach and L3 papers. It cites the Mach design and, like the L3 paper, is highly critical of Mach's poor performance. It came before the L3 paper, so there is no direct comparison between the two. Unlike L3, though, Synthesis aims to improve the overall performance of the OS rather than just IPC. Modern operating systems aim to improve throughput by buffering, which increases latency. Synthesis claims to improve both throughput and latency using some extreme ideas.

The first thing to note is that the Synthesis kernel is written entirely in Motorola 68030 assembly. The design sits, perhaps, at the extreme end of the tradeoff between performance and portability. Because of this, very few of Synthesis' ideas have been used in modern systems.

In a nutshell, Synthesis optimizes its own code by performing runtime code generation and optimizations that you would expect to see in a compiler: constant folding, constant propagation, and procedure inlining. This allows the kernel to optimize its own procedures using techniques like factoring invariants, collapsing layers, and executable data structures. Examples of executable data structures are:

* A task scheduling queue that contains the code for pausing/resuming tasks. Consider a task that yields control to the kernel. Execution jumps into the pause code in the queue, which has a pointer to the resume code of the next task in the queue.
* A buffer object that has the buffer pointer integrated into it.

Synthesizing the data structure code at runtime maximizes the use of information that is only available at runtime, so comparison and branch instructions can be avoided. Combined, these dynamic methods improve all aspects of kernel performance, increasing throughput without increasing the latency of small jobs.
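
To make "factoring invariants" a little more concrete, here is a minimal C++ sketch. It is only an analogy: Synthesis emits specialized 68030 machine code at runtime, whereas this sketch binds the invariants into a closure and leaves any actual folding to the compiler. The names `File`, `generic_read`, and `synthesize_read` are hypothetical and do not come from the dissertation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <vector>

// Hypothetical in-memory "file"; not a structure from the dissertation.
struct File {
    std::vector<char> data;
    bool readable = true;
    std::size_t pos = 0;
};

// Generic path: every call re-checks facts that cannot change once the
// file is open. Returns the number of bytes copied, or -1 on error.
long generic_read(File* f, char* out, std::size_t n) {
    if (f == nullptr || !f->readable) return -1;  // invariant after open
    std::size_t k = std::min(n, f->data.size() - f->pos);
    if (k != 0) std::memcpy(out, f->data.data() + f->pos, k);
    f->pos += k;
    return static_cast<long>(k);
}

using ReadFn = std::function<long(char*, std::size_t)>;

// "Open" performs the checks once and returns a routine specialized to
// this file. Assumption: f->data is not resized while the routine is in use.
ReadFn synthesize_read(File* f) {
    assert(f != nullptr);
    if (!f->readable) {
        // The error case is decided here, not on every read.
        return [](char*, std::size_t) -> long { return -1; };
    }
    const char* base = f->data.data();       // buffer pointer bound in
    const std::size_t size = f->data.size(); // size bound in
    return [f, base, size](char* out, std::size_t n) -> long {
        std::size_t k = std::min(n, size - f->pos);
        if (k != 0) std::memcpy(out, base + f->pos, k);
        f->pos += k;
        return static_cast<long>(k);
    };
}
```

The specialized routine contains no null check and no permission check because both were settled when it was created. That is the flavor of the comparison and branch instructions the summary above says can be avoided, though here the saving is illustrative rather than measured.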
I would expect the maintainability of such a system to be very low. The primary abstract data type, the quaject, is the basic building block for all of the kernel's services. The interfaces defined by quajects should determine how easy the kernel is to extend. I have not yet had enough time to read their description and decide how they affect maintainability.

Synthesis also optimizes other aspects of OS kernels: scheduling and concurrent access to data structures. It uses a fine-grained, self-tuning scheduler that achieves real-time bounds for high-priority tasks. Synchronization is done in an optimistic, lock-free manner that makes heavy use of the Compare-And-Swap instruction, building on previous work [1]. Speed is achieved by choosing the lock-free structures that are most efficient (stacks, queues, and linked lists) and building the kernel out of them. The 68030's two-word Compare-And-Swap helps to speed up the implementation of these structures. If "real" synchronization is required, then the job is either broken down so that it can be composed of more than one lock-free structure or, in rare cases, handed to a "service thread" that reads from a lock-free queue where requests for a resource are queued up.

[1] M. P. Herlihy. Wait-Free Synchronization. ACM Transactions on Programming Languages and Systems 13(1), January 1991.
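
As an illustration of the optimistic, Compare-And-Swap style of synchronization described above, here is a minimal C++ sketch of a lock-free stack. It is not the Synthesis implementation: it uses `std::atomic` with a single-word CAS, and the names `Node` and `LockFreeStack` are hypothetical. It also assumes that a popped node is not pushed again while other threads may still be inside `pop`, which sidesteps the ABA problem instead of solving it.

```cpp
#include <atomic>
#include <cassert>

// Intrusive node: the caller owns the storage, so the stack itself
// never allocates or frees memory.
struct Node {
    int value = 0;
    Node* next = nullptr;
};

class LockFreeStack {
public:
    // Optimistically link the node in front of the current head and
    // retry if another thread changed the head in the meantime.
    void push(Node* n) {
        assert(n != nullptr);
        Node* old_head = head_.load(std::memory_order_relaxed);
        do {
            n->next = old_head;
        } while (!head_.compare_exchange_weak(
            old_head, n,
            std::memory_order_release, std::memory_order_relaxed));
    }

    // Returns the top node, or nullptr if the stack is empty.
    // On a failed CAS, old_head is refreshed with the current head
    // and the loop condition is evaluated again.
    Node* pop() {
        Node* old_head = head_.load(std::memory_order_acquire);
        while (old_head != nullptr &&
               !head_.compare_exchange_weak(
                   old_head, old_head->next,
                   std::memory_order_acquire, std::memory_order_acquire)) {
        }
        return old_head;
    }

private:
    std::atomic<Node*> head_{nullptr};
};
```

Neither operation ever blocks: a thread that loses the race simply retries with the fresh head value. A two-word CAS is commonly used to pair the head pointer with a version counter so that node reuse becomes safe; whether Synthesis uses its two-word Compare-And-Swap in exactly that way is beyond what this summary covers.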