Summarized by Bill Dieter, Univ. of Kentucky
Robert Morris, Eddie Kohler, John Jannotti, and M. Frans Kaashoek (MIT)
Eddie, an MIT student, presented this award paper. Click allows system administrators to easily build and extend a software router running in Linux by specifying how packets flow between provided router elements. Each router element performs one simple function, such as packet classification, scheduling, queuing, or traffic shaping. Click modules are connected with "push" or "pull" connections, which describe the packet flow through the router. An upstream module transfers packets to a downstream module over a push connection, pushing the data downstream. With a pull connection the downstream module initiates the transfer, pulling the data out of the upstream module.
A simple Click IP router routing between two 100 Mbps interfaces ran 90% as fast as a Linux machine configured as a router. Click was able to route up to 73,000 64-byte packets per second vs. around 81,000 64-byte packets per second for Linux. Adding the random early detection (RED) dropping policy to the Click router reduced its performance to about 93% of the simple Click IP router.
Q: Jay Lepreau, Utah: Handling interrupts takes 70-80% of your overall time. If you get rid of interrupts by polling you will not look so good. Can you comment on this? A: That depends on what you mean by look bad. We measured a 33 microsecond per packet latency for Click and a 28 microsecond latency for Linux. Much of the overhead is coming from virtual function calls. We may try dynamic code generation to get around the virtual function call overhead.
Q: Rimon Barr, Cornell: Can you learn from database optimization techniques in the router space? A: Thanks for the pointer to databases. They have done quite a bit of optimization work. We could generate one element that does the work of many and automatically substitute it for a group of elements. We have manually rewritten a sequence of elements into one large element and seen over 20% improvement in some cases. We also plan to use dynamic code generation to try to reduce overhead.
Q: Unknown: Not all modules are input/output driven; some are timer driver. How do you model that and how do you guarantee schedulability? A: We implement timers with timers. Click has timer elements. Scheduling is harder. Scheduling will be important in the future. We will look more at how Scout and others do scheduling. Routers have lots of queues, which are natural areas for scheduling. We may look at that too and try to figure something out.
Q: Unknown: You described how to get queue length. How do you get information about other kinds of elements? A: Click can look for any property an element might have. Elements can be designed to have any property you might want. The queue in the talk is just one example of how to find out about a property of an element.
Q: Timothy Roscoe, Sprint: Many routers use routing protocols that alter the routing table quite often. How long does it take to reconfigure the routing tables if you are doing a multicast join or something complex like that? A: We don't have figures for that because we don't have a normal routing table. Writing a new configuration takes less than 15 milliseconds. We can write to the /proc filesystem to change what an element does. To do routing we use a user level process like "gated" to poke the routing information into /proc filesystem.
Q: Ken Birman, Cornell: This question comes from naivete. How does Click compare with other work? How is the configurability problem classically addressed within routers such as Cisco? A: We don't know a lot about Cisco because their routers are proprietary. Cisco has their own software infrastructure. From talking to them, their routers are not as modular for out of the box solutions. Linux and BSD router code is mixed in with the rest of the kernel.
For more information, see http://www.pdos.lcs.mit.edu/click/.
Mohit Aron and Peter Druschel (Rice University)
Mohit, a Rice student, presented this award paper. Soft timers schedule timed events by checking for timer expiration in frequently executed parts of the kernel. Checking for timer expiration in the kernel reduces context switch and cache overhead compared to traditional hardware-based timers which interrupt when the event occurs. Techniques like rate-based clocking of TCP connections and network polling that suffer from the high overheads of hardware based timers on high speed networks become practical with soft timers.
Q: Unknown, Bell Labs: I like this paper, it looks good. However, you assume incorrectly that hard timers interrupt on every clock tick. That is incorrect. You can have the hardware interrupt go off every N cycles if nothing is scheduled. A: In a high bandwidth network the operating system needs to schedule lots of interrupts close together.
Q: Jonathan Shapiro, IBM: You measured performance and showed results for applications where things happen all the time. What used to happen at every tick now happens more often. What are the implications of this for real-time applications? A: Soft timers provide best effort triggers. They are not guaranteed to happen at precisely the time they are scheduled, so some events may miss their deadlines. If an application cannot tolerate that it will have to use hardware timers.
Q: Eric Cota-Robles, Intel: I agree there are problems with hardware timers. As a user of interrupts, interrupts have not changed since the 60's. What would be a better hardware interface? If the hardware interface could be redesigned, how would you do it? A: It would have as low overhead as soft timers. [audience laughs] Q:You don't care how to do it? A: No.
Q: Unknown: I agree on the usefulness of this facility. How will it change as hardware changes? Is the assumption true that interrupts will always have high overhead? A: That is hard to answer. On receiving an interrupt the CPU still needs to save some state, and interrupts will still cause cache locality losses.
Q: Greg Minshall, Siara Systems: Rate-based pacing is interesting, but it is not done because of the timer overhead, so this work addresses that problem well. There are a number of things that will speed up flow of TCP connections. What are the implications for the rest of the net of using rate-based pacing? You need to prevent a bursty load to the network compared with slow start. It is unfair to say rate-based pacing is 10 times better than TCP. It is better to say the load is spread over more time. Please withdraw that claim. A: We mentioned that slow start has burstiness because it must wait for ACKs from the receiver. There is no reason for the sender to wait if the bandwidth of the network is known.
Q: Unknown, Univ. of Washington: Knowing the bandwidth is a big assumption. UW has done rate-based pacing and found that it didn't work well for many parameters because of properties with lots of connections. Feedback protocols are sensitive to changes. If an application uses the network less it could affect the net performance. Will applications resort to voodoo, like calling getpid(), to improve performance? A: We can use techniques to approximate bandwidth when opening a connection, such as assuming it is the same as the bandwidth for the last connection to the same host. Packet dynamics need to be studied more with rate-based pacing.
For more information, see http://www.cs.rice.edu/CS/Systems/Soft-timers/.. Slides from this presentation are at http://www.cs.rice.edu/CS/Systems/Soft-timers/soft-timers-talk.ps.gz.