The storage system in your computer is its slowest component, and also the one that must be the most reliable (unless you like losing your files!). In collaboration with Stony Brook University, we are working on ways to speed up storage, measure its performance, and validate its reliability. Each of these tasks is a separate sub-project under the same overarching supervision; we will be assigning students to specific projects when summer arrives.
Two students will work with Prof. Erez Zadok of Stony Brook and his graduate students; two others will work directly with Prof. Kuenning at HMC. In all cases we will be developing and enhancing software, running experiments, and measuring results. Some sample projects include:
- Re-Animator is a system that allows us to record the behavior of an application and later replay that record to reproduce the behavior. The system exists, but needs to be enhanced to better handle parallel applications. In addition, we will be using Re-Animator to investigate the behavior of storage systems. Finally, Re-Animator is being released to the public as open-source software, so we need to ensure that it is robust and easy to use.
- We are in the middle stages of a project to improve the design of multi-tier caching systems. Most large Web sites use several levels of software caches to improve performance; designing the various levels to do the best job at the lowest cost is a hard (probably NP-hard) problem. We have developed promising algorithms that quickly generate performance curves and then find "knees" in the curves that indicate places where performance can be dramatically improved for a relatively small extra cost; studying these knees in more detail can then lead to an optimal solution. We are continuing to work on these algorithms and are running experiments to measure their effectiveness.
- In a new project, we are using the SPIN model checker to find bugs in file systems. This approach promises to have a major impact on the reliability of popular file systems, but it turns out to be unusually challenging to get SPIN to run at high speed. The reason is that SPIN regularly saves the state of the computation it is checking and later rolls back to that state, but the internal state of a file system is complex and is not directly available to SPIN. We are developing new approaches to solve this problem so that we can run checks efficiently. So far we have found two bugs in a toy file system, and we are still improving how our approach works with real-world file systems.
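
To give a feel for the "knee" idea in the caching project, here is a minimal sketch of one common heuristic: normalize the curve, then pick the point farthest from the straight line joining its endpoints. This is not the project's actual algorithm; the function name `find_knee` and the sample numbers are made up for illustration.

```python
def find_knee(sizes, miss_ratios):
    """Return the index of the point farthest from the chord joining the
    first and last points of a (cache size, miss ratio) curve.

    Illustrative heuristic only; names and data are hypothetical.
    """
    if len(sizes) != len(miss_ratios) or len(sizes) < 3:
        raise ValueError("need at least three (size, miss_ratio) points")

    x_first, x_last = sizes[0], sizes[-1]
    y_low, y_high = min(miss_ratios), max(miss_ratios)
    if x_last == x_first or y_high == y_low:
        raise ValueError("curve is degenerate (no spread on one axis)")

    # Scale both axes to [0, 1] so that cost and performance are comparable.
    xs = [(x - x_first) / (x_last - x_first) for x in sizes]
    ys = [(y - y_low) / (y_high - y_low) for y in miss_ratios]

    # Direction of the chord from the first point to the last point.
    dx = xs[-1] - xs[0]
    dy = ys[-1] - ys[0]

    best_index, best_distance = 0, -1.0
    for i, (x, y) in enumerate(zip(xs, ys)):
        # Proportional to the perpendicular distance from (x, y) to the chord.
        distance = abs(dy * (x - xs[0]) - dx * (y - ys[0]))
        if distance > best_distance:
            best_index, best_distance = i, distance
    return best_index


# Made-up curve: the miss ratio falls sharply once the cache reaches size 8.
sizes = [1, 2, 4, 8, 16, 32]
miss_ratios = [0.90, 0.85, 0.80, 0.30, 0.28, 0.27]
knee = find_knee(sizes, miss_ratios)
print(sizes[knee])
```

In this made-up curve, most of the improvement arrives by size 8, and larger caches add cost for little further gain. Real curves are noisier and can have several knees, which is part of what makes the problem interesting.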
Professors Kuenning and Zadok have been collaborating for over a decade. Prof. Zadok's research group at Stony Brook is one of the most productive and respected storage research groups in the world, with dozens of publications in top venues. This project gives you a chance to work alongside graduate students on cutting-edge projects that will have real-world impact on the performance and reliability of storage systems.