There is a growing need to apply machine intelligence and learning at the edge of the cloud. Doing so would reduce delays for interactive LLM queries, enable expanded use of assistive technology for drivers, and let us bring AI into settings like factories and hospitals. However, the cloud was designed mostly to support scalable web applications, and many of its design choices are poorly matched to the needs of edge intelligence applications. Our new system, Cascade, overcomes these issues. Cascade enables a fast reactive edge in which AIs always see consistent, current data. The work centers on a rethinking of the lowest levels of modern computing frameworks and entails eliminating copying, locking, and synchronous distributed interactions. The result is an asynchronous flow model that dramatically reduces delay, improves throughput, and even saves power.
Bio: Ken has worked in distributed systems since getting his PhD at UC Berkeley in 1991. He is currently the N. Rama Rao Professor in the Department of Computer Science at Cornell. Past projects include building the infrastructure that operated the New York and Swiss Stock Exchanges, developing the core architecture and software for the French Air Traffic Control System, and inventing some of the techniques that enabled today’s scalable, self-managed cloud computing settings. Ken is a fellow of the ACM and IEEE, and won the IEEE Kanai Award for his research in distributed systems.