OS Kernels
Summarized by Kip Walker, CMU


Integrating Segmentation and Paging Protection for Safe, Efficient and Transparent Software Extensions
Tzi-cker Chiueh, Ganesh Venkitachalam, and Prashant Pradhan (State University of New York at Stony Brook)

Tzi-cker Chiueh presented their paper, in which the segmentation and paging hardware in the x86 family of processors was used to provide memory protection from kernel- or user-level extensions. Most of the talk was spent on an explanation of the underlying hardware mechanism, and how it had been applied to memory protection for extensions. The evaluation showed that protected calls into extensions were almost as fast as unprotected calls.

A lengthy Q&A session began with Jochen Liedtke commenting that details of the lret/lcall instructions could be found in Intel's OS-writer's guide. The first question came from Jeff Chase, who inquired if work had been done to address protection issues unrelated to memory, such as preventing infinite looping in an extension. Chiueh responded that the focus of the work was on memory protection; however, a basic timer mechanism had been implemented to prevent an extension from stealing the CPU forever.

An unidentified audience member asked if extensions could be protected from each other without expensive operations. Chiueh's answer was that the same efficient mechanism could probably not be used for memory protection between extensions. The next question, from Stefan Savage, pursued the idea that memory protection is just a part of the solution in supporting safe extensions. He observed that in his experience, the majority of problems were subtler, such as invariants being broken as a side effect of crossing interface boundaries. Again, Chiueh responded that memory protection was the focus of the work, and that in his group's limited experience with building extensions, such a situation had not occurred.

In an effort to get things stirred up, Geoff Kuenning asked what the speaker had learned about "real computers" from this project. Chiueh nimbly deflected this by pointing out that x86 processors are rather common. The last question came from Jonathan Shapiro, who inquired why such a mechanism is useful when Liedtke and others have demonstrated such fast context switch times. Shapiro let the question go unanswered with the comment, "Put up or shut up is a fine answer."

For more information, see http://www.ecsl.cs.sunysb.edu/palladium.html.


Cellular Disco: Resource Management Using Virtual Clusters on Shared-Memory Multiprocessors
Kinshuk Govil, Dan Teodosiu (HP Labs), Yongqiang Huang, and Mendel Rosenblum (Stanford)

Kinshuk Govil, a Stanford student, presented this award paper. Their work builds on Disco, a virtual machine monitor designed to allow commodity OSs to run on large SMPs. Cellular Disco brings scalable resource management and fault containment to Disco. Virtual machines run on a subset of "cells" in the SMP, and hardware faults disrupt only the VMs using resources in affected cells. Scalable resource management allows the full resources of the SMP to be used even if the operating systems running on the machine are not scalable.

Questions started with Ken Birman asking how faults are detected by the virtual machines, and how failures appear to applications. Govil responded that faults must be detected and contained by the hardware; the virtual machine monitor then shuts down all virtual machines that depend on any resources from cells affected by the faults. While it might be possible to do something interesting with alerting the operating systems to hardware failures, the designers of Cellular Disco wanted to modify the operating systems as little as possible. While software faults will take down a particular VM, neighboring virtual machines will run undisturbed due to the isolation provided by the underlying virtualization of the machine.
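
The shutdown rule Govil described (a VM is stopped if it depends on any resource from a faulty cell) can be illustrated with a minimal Python sketch. This is not Cellular Disco's code: the function name, the VM names, and the cell numbering are hypothetical, and the real monitor tracks much finer-grained resource dependencies than a set of cell IDs.

```python
from typing import Dict, Set


def vms_to_shut_down(vm_cells: Dict[str, Set[int]],
                     failed_cells: Set[int]) -> Set[str]:
    """Return the VMs that use at least one resource in a failed cell.

    vm_cells maps a (hypothetical) VM name to the set of cell IDs whose
    resources it currently depends on.
    """
    return {vm for vm, cells in vm_cells.items() if cells & failed_cells}


# Hypothetical layout: three VMs spread over four cells.
vm_cells = {
    "vm_a": {0, 1},
    "vm_b": {2},
    "vm_c": {1, 3},
}

# A hardware fault in cell 1 takes down vm_a and vm_c; vm_b keeps running.
print(sorted(vms_to_shut_down(vm_cells, failed_cells={1})))
```

The sketch also shows why limiting the number of cells a VM spans matters for fault containment: the fewer cells a VM touches, the fewer faults can affect it.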

An unidentified questioner requested Govil's opinion of the following alternative design: if the operating system source were available, one could consider offering a different API to the OS rather than simply virtualizing the hardware's interface. Again, the response was that changing the operating system was not considered to be an option, so nobody had investigated such a possibility. Mike Swift inquired if the kernel text segments were shared across processors, and if the number of virtual CPUs known to one of the operating systems could be changed mid-execution. Govil answered that the kernel code was indeed shared, and that changing the number of VCPUs was not possible without restarting the virtual machine. The final question followed up on the idea of operating system modifications. Jonathan Appavoo asked how much knowledge of the OS had been needed to design Cellular Disco, and what changes to the OS had been necessary. Govil responded that the main changes consisted of instrumenting the idle loop and memory allocation routines. He further noted that there are ways to achieve the same functionality in the virtual machine monitor, without touching the OS.

For more information, see http://www-flash.stanford.edu/~kinshuk/.


EROS: A Fast Capability System
Jonathan S. Shapiro, Jonathan M. Smith, and David J. Farber (Univ. of Pennsylvania)

After warming up the crowd with a "letter home from Camp SOSP" slide, Jonathan Shapiro, now at IBM T.J. Watson Research, presented this paper. Shapiro began the talk by arguing that in addition to performance, system designs must pay attention to security, integrity, high availability, fault tolerance, and evolvability. EROS achieves some of these goals through its use of capabilities as its fundamental building block. All resource access is accomplished through invocation of capabilities. The system keeps overheads low by using well-chosen abstract objects and caching techniques.
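
The principle that all resource access goes through capability invocation can be sketched in a few lines of Python. This is only a toy model, not the EROS interface: the `Page` object, the `Capability` class, and the `invoke` and `restrict` methods are names invented for illustration, and in a real capability system the kernel protects capabilities from forgery, which Python cannot enforce.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


class Page:
    """A hypothetical resource: a small block of memory."""

    def __init__(self) -> None:
        self.data = bytearray(16)

    def read(self, offset: int) -> int:
        return self.data[offset]

    def write(self, offset: int, value: int) -> None:
        self.data[offset] = value


@dataclass(frozen=True)
class Capability:
    """Pairs a reference to an object with the operations it permits."""

    target: Page
    rights: FrozenSet[str]

    def invoke(self, operation: str, *args):
        # Every access is an invocation, checked against the held rights.
        if operation not in self.rights:
            raise PermissionError(f"capability does not permit {operation!r}")
        return getattr(self.target, operation)(*args)

    def restrict(self, rights: Iterable[str]) -> "Capability":
        # A holder can hand out a weaker capability, never a stronger one.
        return Capability(self.target, self.rights & frozenset(rights))


read_write = Capability(Page(), frozenset({"read", "write"}))
read_write.invoke("write", 0, 7)

read_only = read_write.restrict({"read"})
print(read_only.invoke("read", 0))
```

Invoking `write` through `read_only` raises `PermissionError`, which is the sense in which holding a capability is both the name of a resource and the authority to use it.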

Jochen Liedtke brought up the observation that with a persistent system, even faults may end up being preserved. Shapiro responded that in his experience, faults had only come from active development on the system and that he had never seen bad state make it to the disk. A consistency checker in the kernel was designed to catch problems before they get checkpointed. Drew Dean commented that the work seemed to focus solely on discretionary security, and wondered if that implied that EROS was not interested in mandatory, multi-level security. Shapiro answered that mandatory security was certainly of concern, but EROS was intended for exploring how more discretionary control could be given to users.

Satya closed the session with a well-phrased inquiry about what Shapiro saw as applications needing EROS (as well as capabilities in general). Shapiro replied that, with its persistence architecture and very fast transaction facility, EROS could be applied to advanced databases. He also described a "stock negotiator" as a possible application, where many buyers are trying to coordinate a stock price. Regarding the benefits of capabilities, Shapiro argued that object systems aren't "active"--that when persistence and controlled communication between objects are desired, capabilities are a good solution.

For more information, see http://www.eros-os.org/.