
Hydro: A Compiler Stack for Distributed Programs

Joseph Hellerstein

Professor of Computer Science, UC Berkeley

Abstract: Nearly all programs of interest today are distributed. Unfortunately, the traditional languages and compilers in common use offer little assistance in ensuring the correctness of distributed programs. This state of affairs makes infrastructure development and tuning unduly expensive, and it hampers the ability of less-technical but highly creative individuals to invent new applications that take advantage of the ubiquity of cloud and mobile computing.

The Hydro project at Berkeley is an effort to build a compiler stack to address these issues, taking lessons from the success of scaling data management software. The foundation of the Hydro stack is Hydroflow, a Rust-based dataflow runtime with an IR based on algebraic dataflow. Hydroflow enables a compiler to make correct program transformations that are natural in the context of distributed systems. Transformations include:

  • Refactoring: Given an arbitrary block of code, refactor it into smaller blocks that can be launched on independent machines
  • Replication: Given an arbitrary block of code, determine whether it can be safely replicated in deployment
  • Partitioning: Given an arbitrary block of code, determine how its inputs can be safely partitioned (“sharded”) to multiple machines in deployment

These transformations in turn allow distributed programs to be optimized for various goals, including parallelism (both pipelines and partitioning), memory scaling, performance isolation, geoproximity and physical security.
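To make the partitioning transformation concrete, here is a minimal sketch in plain Rust using only the standard library. It is not Hydroflow's API and not the project's method; the function names (`count_by_key`, `partition_by_key`, `run_sharded`) and the per-key counting workload are hypothetical and chosen only for illustration. The idea it shows: when a block of code processes each key independently, its input can be sharded by a hash of the key, each shard can run the unchanged block, and the union of the shard outputs matches the single-machine output.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// The "block of code" to deploy: sum quantities per key.
/// (Hypothetical workload, e.g. items added to a shopping cart.)
fn count_by_key(events: &[(String, u64)]) -> HashMap<String, u64> {
    let mut totals = HashMap::new();
    for (key, qty) in events {
        *totals.entry(key.clone()).or_insert(0) += qty;
    }
    totals
}

/// Route every event to one of `shards` partitions by hashing its key,
/// so all events for a given key land in the same partition.
fn partition_by_key(events: &[(String, u64)], shards: usize) -> Vec<Vec<(String, u64)>> {
    assert!(shards > 0, "need at least one shard");
    let mut parts = vec![Vec::new(); shards];
    for (key, qty) in events {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        let idx = (hasher.finish() % shards as u64) as usize;
        parts[idx].push((key.clone(), *qty));
    }
    parts
}

/// Simulate the sharded deployment: run the unchanged block on each
/// partition, then union the results.
fn run_sharded(events: &[(String, u64)], shards: usize) -> HashMap<String, u64> {
    let mut merged = HashMap::new();
    for part in partition_by_key(events, shards) {
        // Keys are disjoint across partitions, so `extend` never overwrites.
        merged.extend(count_by_key(&part));
    }
    merged
}
```

Calling `run_sharded(&events, n)` for any `n > 0` should return the same map as `count_by_key(&events)`. That equivalence holds here only because the block never combines data across keys; deciding when such a property holds for arbitrary code is the kind of question the abstract assigns to the compiler.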

Although the Hydro project is still in its early stages, I will present case studies showing correctness, latency and scaling results when optimizing programs ranging from infrastructure like key-value stores, to applications like shopping carts and messaging systems, to tricky consensus protocols.

Joint work with colleagues at UC Berkeley and Sutter Hill Ventures.

Bio: Joseph M. Hellerstein is the Jim Gray Professor of Computer Science at UC Berkeley, and a Faculty Fellow at Sutter Hill Ventures. His academic recognition includes the ACM SIGMOD Codd Innovations Award, ACM Fellow and Sloan Research Fellow awards, and six “Test of Time” awards for his papers. Hellerstein is a longtime participant in the computing industry, co-founding startups, advising companies and venture funds, and directing industry research. He also enjoys playing music, and has performed live with legendary musicians including Joe Henderson, Joshua Redman and Michael J. Carey.