Distributed Systems (II)
Summarized by Jon Howell, Dartmouth College


The Design and Implementation of an Intentional Naming System
William Adjie-Winoto, Elliot Schwartz, Hari Balakrishnan, and Jeremy Lilley (MIT)

Balakrishnan described the Intentional Naming System (INS), the goal of which is resource discovery in dynamic and mobile networks. The design goals included expressiveness, responsiveness, robustness, and easy configuration. They achieve expressiveness with a language that lets applications express what they want, not where it is. Responsiveness is accomplished by routing individual messages by name (late binding). INS is robust due to its serverless, decentralized design, and easy to configure because its resolver nodes (that form the overlay network that implements the service) self-configure. Applications are offered two services: intentional anycast, which delivers a message to a single service with a matching name, and intentional multicast, which delivers a message to every reachable service with a matching name.
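As a rough illustration of the two delivery services, the sketch below models a name as a flat set of attribute-value pairs and a resolver as a list of advertised services. This is a simplification for illustration only, not the INS implementation: the `Service` class, the `metric` field used to choose a single anycast target, and the addresses are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Service:
    address: str      # network location, never named by the requesting application
    name: dict        # advertised attributes, e.g. {"service": "printer"}
    metric: int = 0   # hypothetical cost used to choose one anycast target


def matches(query, advertised):
    """A query matches when every attribute it specifies has the same advertised value."""
    return all(advertised.get(key) == value for key, value in query.items())


def intentional_multicast(services, query):
    """Return the addresses of every service whose name matches the query."""
    return [s.address for s in services if matches(query, s.name)]


def intentional_anycast(services, query):
    """Return the address of a single matching service, or None if nothing matches."""
    candidates = [s for s in services if matches(query, s.name)]
    if not candidates:
        return None
    # Assumption: prefer the lowest metric; the summary only says "a single service".
    return min(candidates, key=lambda s: s.metric).address


# Hypothetical advertisements held by one resolver.
services = [
    Service("10.0.0.5", {"service": "printer", "floor": "5"}, metric=3),
    Service("10.0.0.9", {"service": "printer", "floor": "5"}, metric=1),
    Service("10.0.0.7", {"service": "camera", "floor": "5"}, metric=2),
]
```

The application states only what it wants (`{"service": "printer"}`), and the binding to an address happens at delivery time, which is the late binding described above.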

Petros Maniatis from Stanford asked why INS binds the name service to the delivery service, citing the overhead of a long name in each packet. Balakrishnan replied that it allows the system to track mobility. "I could be arbitrarily mobile. I unplug from the Ethernet, use my wireless connection, and everything keeps working." He said that names were on the order of 100 bytes, and that he didn't see that as an overhead. All mobility systems depend on indirection, he said, and how you do indirection is important. We should avoid building unneeded infrastructure.

Satyanarayanan from CMU asked why the group built their own tree structure, rather than using an embedded relational database, which would have been much more general. Balakrishnan answered that they wanted something simple that would run on a device as simple as a Palm Pilot. Satyanarayanan predicted that the tree structure would become a restriction in the near future. Balakrishnan replied that they were adding more expressive operators, and argued that constraining the programmer was not always a bad thing. The developers asked themselves, what is the minimum set of operators needed to do something useful?

Dan Wallach from Rice asked about authentication in the presence of dynamism. He pointed out that we can be (somewhat) assured that the "yahoo" site is unique thanks to "the great big DNS root in the sky." The answer was that security was not a goal; Balakrishnan suggested that perhaps self-certifying names (the SFS paper from an earlier session) would be a solution.

Mike Swift from the Univ. of Washington asked why the messages are routed all the way through the overlay network when the first resolver knows the network address of the final destination. Balakrishnan said that this is exactly what happens in the anycast case; forwarding through the resolvers is used for multicast delivery.

For more information, see http://wind.lcs.mit.edu/.


Design and Implementation of a Distributed Virtual Machine for Networked Computers
Emin Gün Sirer, Robert Grimm, Arthur J. Gregory, and Brian N. Bershad (Univ. of Washington)

Sirer presented a new implementation architecture for the Java Virtual Machine (JVM) that allows JVM services to be factored out and distributed across the network. This puts fewer demands on the client architecture (enabling "JVM on a lightbulb"), provides physical separation of the services, and makes client systems more manageable by locating some services on a common server.

Services amenable to factoring include bytecode verification, security policy enforcement, and bandwidth optimization. Code verification at a server saves substantial resources on the client. Maintaining security policies on a central server is attractive because it makes site security more manageable. Bandwidth optimization involves profiling runs of the application, then sending future clients only the commonly used code, with stubs that load the rest of the code on demand.
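The bandwidth optimization can be pictured with a small sketch: profile data decides which code is shipped up front, and everything else is fetched the first time it is called. This is only an illustration of the idea, not the system presented in the talk; the `split_by_profile` function, the `LazyCode` class, the threshold, and the method names are all hypothetical.

```python
def split_by_profile(call_counts, threshold):
    """Partition method names into those shipped eagerly and those replaced by stubs."""
    eager = sorted(m for m, n in call_counts.items() if n >= threshold)
    stubbed = sorted(m for m, n in call_counts.items() if n < threshold)
    return eager, stubbed


class LazyCode:
    """Client-side view: eager code is present, the rest is fetched on first use."""

    def __init__(self, eager, fetch):
        self._loaded = dict(eager)   # name -> callable already on the client
        self._fetch = fetch          # stands in for a request to the server
        self.fetches = []            # record of on-demand loads, for inspection

    def call(self, name, *args):
        if name not in self._loaded:
            # Stub behavior: load the missing code once, then keep it.
            self.fetches.append(name)
            self._loaded[name] = self._fetch(name)
        return self._loaded[name](*args)


# Hypothetical server-side code and call counts from earlier profiling runs.
server_code = {"render": lambda x: x * 2, "print_report": lambda x: x + 1}
call_counts = {"render": 950, "print_report": 2}

eager_names, stubbed_names = split_by_profile(call_counts, threshold=10)
client = LazyCode({n: server_code[n] for n in eager_names},
                  fetch=server_code.__getitem__)
```

Calling `client.call("render", 4)` uses code that was shipped up front, while the first `client.call("print_report", 4)` triggers a single fetch and later calls reuse the loaded code.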

The factoring design involves two phases: the first is to inspect class files a priori, and the second to dynamically rewrite class files to inject code snippets into the application. The injected code has access to data- and context-dependent values on the client.

Ed Felten from Princeton asked how the group could improve security by moving the verifier: "you have added stuff to the trusted computing base: the network channel, the server, the server operating system. . . ." Sirer agreed, but said that moving the verifier addresses operational problems. He compared site-wide policy enforcement to firewalls, in that it is easier to secure a single, well-placed host. Felten argued that site-wide configuration depended on the clients having this factored JVM, and that if you could ensure that the client has the right software to begin with, the problem would already be solved. Sirer pointed out that with central management, an administrator has only to do O(1) work to update the verification procedure for all of the clients.

Fred Schneider from Cornell mentioned that binary rewriting dates from the 1970s. He said that he had found binary rewriting as easy for the x86 as for the JVM, and asked to what extent a JVM was necessary to achieve the claimed benefits. "Why isn't the title, 'Distributed Services,' period?" Sirer agreed, saying that they limited their scope to their experience.

Sitaram Iyer from Rice asked whether the injected verification bloats code, and Sirer responded that the effect was quite modest.

Drew Dean from Xerox PARC asked about interactions among multiple class loaders. Sirer said that the subject was deep, but that all checks that require knowledge of the class loader are done in the injected dynamic code. Dean also asked how the system prevents attackers from creating a remote shadow class intended to replace a server-created class, and Sirer replied that they simply reject any remote class in the package "edu.washington."
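The shadow-class defense Sirer described amounts to a name check on incoming classes. A minimal sketch follows, assuming the reserved package is "edu.washington" and that its subpackages are reserved as well (the summary does not say); the function name and example class names are hypothetical.

```python
RESERVED_PACKAGE = "edu.washington"


def accept_remote_class(class_name):
    """Return False for any remotely supplied class inside the reserved package."""
    # The trailing dot keeps unrelated names such as "edu.washingtonian.Foo" acceptable.
    return not class_name.startswith(RESERVED_PACKAGE + ".")
```

With this check, `accept_remote_class("edu.washington.Verifier")` is rejected, while a class from any other package is passed on to the rest of the pipeline.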

Jon Tidswell of IBM Research asked whether the approach would scale to tens of thousands of clients at the University of Washington. Would memory on the server be a bottleneck? Sirer answered that VM would be the limiting factor, but that their benchmarks represented a worst-case scenario. He indicated that a beefy server at the firewall should handle plenty of clients. Tidswell recalled the 60-70% ideal caching figure for web documents from a prior talk, and asked how cacheable Java classes are. Sirer said he had no numbers. However, your scribe suspects that they would be very cacheable, since any dynamic content viewed in a Java applet is likely to appear as variability in a data file, not as a dynamically generated Java class file.

For more information, see http://kimera.cs.washington.edu/.