Vector Search and Databases
Dr. Yannis Papakonstantinou
Distinguished Engineer, Query Processing and GenAI at Google Cloud Databases
Abstract: Semantic search, via embeddings (vectors) and vector indexing, has been added to Google Cloud Platform (GCP) databases in order to enable GenAI applications. The inclusion of vectors in databases confers many of the traditional benefits of databases: developers can now build GenAI applications on their familiar and trusted databases. Furthermore, developers can be sure that the vectors are up-to-date and transactionally consistent. The rapid adoption of the PostgreSQL pgvector extension is evidence that the database developer community appreciates these benefits. The inclusion of vectors in databases raises three R&D questions, which we will discuss in this talk. First, can databases with vector abilities perform as well as purpose-built vector databases in pure vector search, and what does it take to achieve this? Second, what opportunities and corresponding R&D challenges emerge at the intersection of SQL data and vectors? Finally, what does it take to align the experience of SQL developers with the world of vector management and vector indexing?
Bio: Yannis Papakonstantinou is a Distinguished Engineer, working on Query Processing and GenAI, at Google Cloud. He is also an Adjunct Professor of Computer Science and Engineering at the University of California, San Diego, following many years as a regular UCSD faculty member. Previously he was an architect in query processing & ETL at Databricks. Earlier, he was a Senior Principal Scientist at Amazon Web Services from 2018 to 2021, having been a consultant for AWS since 2016. He was the CEO and Chief Scientist of Enosys Software, which built and commercialized an early Enterprise Information Integration platform for structured and semistructured data. The Enosys software was OEM’d and sold under the BEA Liquid Data and BEA AquaLogic brand names, and was eventually acquired by BEA Systems in 2003.
His R&D work has been mostly on query processing, with a focus on querying semistructured data. He has published over 120 research articles that have received over 21,000 citations. Yannis holds a Diploma of Electrical Engineering from the National Technical University of Athens, and an MS and a Ph.D. in Computer Science from Stanford University (1997).