The SIGOPS Hall of Fame Award was instituted in 2005 to recognize
the most influential Operating Systems papers that were published at
least ten years in the past. For an in-depth discussion of the SIGOPS
policy for this award, please read Jeff
for the SIGOPS Hall of Fame Award.
SIGOPS members can send in their nominations by e-mail
to HOFnominations (at)
sigops.org. The Hall of Fame Award Committee will choose which
nominated paper wins the award. The decision will be based on a
discussion that considers the impact the paper (and more generally of
the research described in the paper) has had on the field of operating
systems research. The Award committee will prepare a short statement
that describes why the paper was selected.
The award winners will be announced at the SOSP or OSDI conference
by the current program chair. The program chair will read the
statement prepared by the Award committee that describes why the paper
was selected. The authors of the award winning paper will be given a
plaque, naming the paper, the authors, the conference or journal the
paper appeared in, and the conference in which the award was
made. This certificate will be signed by the program chair and the
current chair of SIGOPS. A list of the winners of the award will be
maintained on the SIGOPS website.
The Hall of Fame Award Committee consists of past program chairs from
SOSP, OSDI, EuroSys, past Weiser and Turing Award winners from the
SIGOPS community, and representatives of each of the Hall of Fame
Award papers. Five members from the committee, chosen to be without
conflict of interest with the possible award winners, do the final
- Daniel G. Bobrow, Jerry D. Burchfiel, Daniel L. Murphy and Raymond S. Tomlinson. Tenex, A Paged Time Sharing System for the PDP-10 Communications of the ACM 15(3), March 1972.
The Tenex system pioneered many ideas that are prominent in modern operating systems. It included one of the first page based memory systems, copy on write sharing, mapping of files into virtual memory, and user/group/other file protection. It also had mnemonic commands with command-line completion and automatic file versioning. As one reviewer said, "Reading it now, I'm pleasantly surprised by how much is familiar --- thanks to its successors."
- Joel Bartlett. A NonStop Kernel in Proceedings of the Eighth ACM Symposium on Operating Systems Principles (SOSPí81), Pacific Grove, California, December 1981.
Tandem was the first commercial database to achieve fault tolerance. To accomplish this, the Tandem system had to bring together many techniques --- including message-passing, mirroring, fast failure detection, and failover --- into a practical design and implementation.
- K. Mani Chandy and Leslie Lamport.
Distributed Snapshots: Determining Global States of a Distributed System
ACM Transactions on Computer Systems 3(1), February 1985.
This paper takes the idea of consistency for distributed predicate evaluation, formalizes it, distinguishes between stable and dynamic predicates, and shows precise conditions for correct detection of stable conditions. The fundamental techniques in the paper are the secret sauce in many distributed algorithms for deadlock detection, termination detection, consistent checkpointing for fault tolerance, global predicate detection for debugging and monitoring, and distributed simulation.
- Kenneth P. Birman and Thomas A. Joseph.
Exploiting Virtual Synchrony in Distributed Systems
in Proceedings of the Eleventh ACM Symposium on Operating Systems Principles (SOSPí87), Austin, Texas, November 1987.
This paper describes a methodology for building distributed applications comprised of multiple components, each realized by a group of replicated servers. It defines a number of group communication primitives and then ties fault notification into the fabric of group services by introducing the virtual synchrony principle, which orders communication and fault notifications consistently among group members and across multiple groups.
- Eddie Kohler, Robert Morris, Benjie Chen, John Jannotti and Frans Kaashoek.
The Click Modular Router
ACM Transactions on Computer Systems (TOCS), 18(3), August 2000.
Click defines a simple, modular, and efficient framework for constructing network routers with different services and properties. Since this paper's publication, Click has been an essential tool for the networking and systems research communities with dozens and perhaps hundreds of systems and papers built on it, including several commercially successful systems.
- Brian M. Oki, Barbara
H. Liskov. Viewstamped
Replication: A New Primary Copy Method to Support Highly-Available
Distributed Systems Proceedings of the Seventh Annual ACM
Symposium on Principles of Distributed Computing (PODC 1988), Toronto,
ON, Canada, Aug 1988, pp 8--17.
The paper introduces a replication protocol very similar to what is
now known as Paxos. That protocol has become the standard for
consistent, fault-tolerant state-machine replication, and is widely
used in data centers to keep the state consistent despite failures and
Lamport. The Part
Time Parliament ACM TOCS 16(2), May 1998, 133--169.
The work (originally published in 1989) was independent and roughly
concurrent with the Viewstamped Replication work also recognized this
year. It describes the protocol in a more general setting, adds a
correctness argument, and forms the basis for modern Paxos
- Kai Li, Paul
Coherence in Shared Virtual Memory Systems ACM TOCS 7(4), Nov
1989, pp 321--359.
The paper shows how to simulate coherent shared memory on a cluster,
and also introduces directory-based distributed cache-coherence. It
spawned a entire research area, and introduced cache coherence
mechanisms that are widely used in industry.
- Mendel Rosenblum, John
K. Ousterhout. The
Design and Implementation of a Log-Structured File System ACM TOCS
10(1), Feb 1992, pp 26--52.
The paper introduces log-structured file storage, where data is
written sequentially to a log and continuously de-fragmented. The
underlying ideas have influenced many modern file and storage systems
like NetApp's WAFL file systems, Facebook's picture store, aspects of
Google's BigTable, and the Flash translation layers found in SSDs.
Do Computers Stop And What Can Be Done About It? HP Labs Technical
The paper presents the first large scale quantitative study of
computer failures in practice, of a system built using best practices
at the time to achieve fault-tolerance.
on Trusting Trust. Communications of the ACM, Volume 27 Issue 8,
The paper demonstrated that to have trust in a program, one cannot
just rely on trust in the person who wrote it, or even on verifying
the source code. One must also ensure that the entire tool chain used
to produce and execute binaries is trustworthy.
- Jack B. Dennis, Earl C. Van
Semantics for Multiprogrammed Computations. Communications of the
ACM, Volume 9 Issue 3, March 1966.
The paper lays out the conceptual foundations for multiprogramming
and protection in computer systems.
David A. Patterson, Garth Gibson, Randy
H. Katz. A Case for
Redundant Arrays of Inexpensive Disks (RAID). Proceedings of the
1988 ACM SIGMOD International Conference on Management of Data.
The paper shows how to achieve efficient, fault tolerant and highly
available storage using cheap, unreliable components.
- Roger Needham and Michael Schroeder. Using Encryption for Authentication in Large Networks of Computers. Communications of the ACM, December 1978.
- Butler Lampson and Howard Sturgis. Crash Recovery in a Distributed Data Storage System. Technical report, Xerox Palo Alto Research Center, June 1979.
- Jim Gray, Paul McJones, Mike Blasgen, Bruce Lindsay, Raymond Lorie, Tom Price, Franco Putzolu, and Irving Traiger. The Recovery Manager of the System R Database Manager. ACM Computing Surveys, June 1981.
- Cary G. Gray and
David R. Cheriton, Leases: An Efficient Fault-Tolerant Mechanism for Distributed File Cache Consistency, Proceedings of the Twelfth ACM Symposium on Operating Systems Priciples (SOSP), December 1989, Litchfield Park, AZ, USA.
The Gray and Cheriton paper pioneered through its analysis of the Leases mechanism, which has become one of the most widely-used mechanisms for managing distributed caches. The paper is particularly striking for its careful analysis of the semantics of leases, its detailed experiments, and its thoughtful discussion of fault-tolerance issues.
- Butler W. Lampson and
David D. Redell,
Experience with processes and monitors in Mesa, Proceedings of the Seventh ACM Symposium on Operating Systems Priciples (SOSP), December 1979, Pacific Grove, CA, USA.
When this paper was written, monitors had emerged as the synchronization method of choice. in programming languages conferences and operating systems textbooks. This paper was the first to look closely at the practical issues that monitors pose when used in a large production system. These issues remain contemporary, and indeed researchers working on transactional memory mechanisms would do well to reread this wonderful paper.
- Nancy P. Kronenberg,
Henry M. Levy,
and William D. Strecker,
VAXclusters: A Closely-Coupled Distributed System, Proceedings of the Tenth AMC Symposium on Operating Systems Principles (SOSP), December 1985, Orcas Island, USA, USA.
The VAX Clusters system was the first modern clustered system supporting such basic features as a distributed file system and a distributed locking service. The SOSP paper on VAX Clusters remains a classic today. VAXclusters was a huge commercial success, and set the stage for today.s massive data centers.
- John H. Howard, Michael L. Kazar, Sherri G. Menees, David
A. Nichols, M. Satyanarayanan, Robert N. Sidebotham, and Michael J. West, Scale and
Performance in a Distributed File System, Proceedings of the Eleventh ACM Symposium on Operating
Systems Principles (SOSP), November 1987, Austin TX, USA.
- Richard Rashid, Avadis Tevanian, Michael Young, David Golub, Robert
Baron, David Black, William Bolosky, and Jonathan Chew, Machine-Independent
Virtual Memory Management for Paged Uniprocessor and Multiprocessor
Architectures, Proceedings of the Second Architectural Support for Programming Languages and
Operating Systems (ASPLOS), October 1987, Palo Alto CA, USA.
- Andrew D. Birrell, Roy Levin, Roger M. Needham, and Michael D. Schroeder, Grapevine: An Exercise
in Distributed Computing, Proceedings of the Eighth ACM Symposium on Operating Systems Principles
(SOSP), December 1981, Pacific Grove CA, USA.
- Edouard Bugnion, Scott Devine, and Mendel Rosenblum, Disco: Running
Commodity Operating Systems on Scalable Multiprocessors, Proceedings of
the Sixteenth ACM Symposium
on Operating Systems Principles (SOSP), October 1997, Saint Malo, France.
- Andre Bensoussan, Charlie T. Clingen, Robert C. Daley, The Multics Virtual
Memory: Concepts and Design, Communications of the ACM 15(5):308-318, May 1972.
- Leslie Lamport, Time,
Clocks, and the Ordering of Events in a Distributed System, Communications of the
ACM 21(7):558-565, July 1978.
Perhaps the first true "distributed systems" paper, it introduced the concept
of "causal ordering", which turned out to be useful in many settings. The paper
proposed the mechanism it called "logical clocks", but everyone now calls these
- Andrew D. Birrell and Bruce Jay Nelson, Implementing
Remote Procedure Calls, ACM Transactions on Computer Systems 2(1):39-59, February 1984.
This is the paper on RPC, which has become the standard for remote
communication in distributed systems and the Internet. The paper does an excellent job
laying out the basic model for RPC and the implementation options.
- J. H. Saltzer, D. P. Reed, and D. D. Clark, End-To-End
Arguments in System Design, ACM Transactions on Computer Systems 2(4):277-288, November
This paper gave system designers, and especially Internet designers, an elegant
framework for making sound decisions. A paper that launched a revolution and, ultimately,
- Michael Burrows, Martin Abadi, and Roger Needham, A
Logic of Authentication, ACM Transactions on Computer Systems 8(1):18-36, February 1990.
This paper introduced to the systems community a logic-based notation for
authentication protocols to precisely describe certificates, delegations, etc. With
this precise description a designer can easily reason whether a protocol is correct
or not, and avoid the security flaws that have plagued protocols. "Speaks-for"
and "says" are now standard tools for system designers.
- Fred B. Schneider, Implementing
Fault-Tolerant Services Using the State Machine Approach: a tutorial, ACM Computing
Surveys 22(4):299-319, December 1990.
The paper that explained how we should think about replication ... a model that turns
out to underlie Paxos, Virtual Synchrony, Byzantine replication, and even Transactional
- George C. Necula and Peter Lee, Safe
Kernel Extensions Without Run-Time Checking, Proceedings of the Second
USENIX Symposium on Operating Systems Design and Implementation, October 1996, Seattle,
This paper introduced the notion of proof carrying code (PCC) and showed how it could
be used for ensuring safe execution by kernel extensions without incurring run-time overhead.
PCC turns out to be a general approach for relocating trust in a system; trust is gained in
a component by trusting a proof checker (and using it to check a proof the component
behaves as expected) rather than trusting the component per se. PCC has become one of the
cornerstones of language-based security.
- Edsger W. Dijkstra, The
Structure of the THE Multiprogramming System, Proceedings of the First
ACM Symposium on Operating Systems Principles, October 1967, Gatlinburg, TN, USA.
The first paper to suggest that an operating system be built in a structured way.
That structure was a series of layers, each a virtual machine that introduced abstractions
built using the functionality of lower layer. The paper stimulated a great deal of
subsequent work in building operating systems as structured systems.
- Peter J. Denning, The
Working Set Model for Program Behavior, Proceedings of the First
ACM Symposium on Operating Systems Principles, October 1967, Gatlinburg, TN, USA.
This paper introduced the working set model, which has became a key concept in
understanding of locality of memory references and for implementing virtual memory. Most
paging algorithms can trace their roots back to this work.
- Dennis M. Ritchie and Ken Thompson, The
UNIX Time-Sharing System, Proceedings of the Fourth
ACM Symposium on Operating Systems Principles, October 1973, Yorktown Heights, NY, USA.
At a time when operating systems were trending towards complexity, UNIX emerged as a
hallmark of elegance and simplicity.
- Butler Lampson, Hints
for Computer System Design, Proceedings of the Ninth
ACM Symposium on Operating Systems Principles, pp. 33-48, October 1983, Bretton Woods,
A classic study of experience building large systems, distilled into a cookbook of
wisdom for the operating systems researcher. As time has passed, the value of these hints
has only grown and the range of systems to which they apply enlarged.