To understand the expansive reach of computer science today, look no further than some of the recent research collaborations of Pierre Baldi, Distinguished Professor of Computer Science in the Donald Bren School of Information and Computer Sciences (ICS). He is working on multidisciplinary projects involving natural language processing, deep learning, chemoinformatics, and bioinformatics, partnering with faculty from the Departments of Physics & Astronomy, Mathematics, and Chemistry, as well as from the UCI Center for the Neurobiology of Learning and Memory (CNLM).

The first project is “Developing Natural Language Processing Tools for Mining the Rapidly Evolving COVID-19 Literature.” Supported by UCI Emergency COVID-19 Research Seed Funding, this is a partnership between Baldi and UCI Assistant Professor of Physics & Astronomy Huolin Xin. The goal is to develop a novel NLP tool that can mine medical literature for information about COVID-19. This tool would help the medical community answer questions about virus incubation and transmission, proposed therapeutics, and ethical and social science considerations.

Another project is “Foundations of Deep Learning,” a collaboration between Baldi and UCI Professor of Mathematics Roman Vershynin. Funded by a grant from the Army Research Laboratory, the three-year project aims to address open questions about measuring the capabilities of a given neural architecture or measuring the information contained in a training set. “We propose to take a significant step in addressing this fundamental technological gap by developing a precise, quantitative, theory of neural network capacity and generalization properties,” says Baldi. “These ideas will be investigated theoretically, by continuing to develop the mathematical theory of capacity, and also through systematic simulations conducted on synthetic datasets as well as standard benchmark datasets.” By developing a comprehensive theory of deep learning, this work will help advance our understanding of AI as it increasingly appears in everything from computer vision and speech recognition to self-driving cars and healthcare.

A third collaboration is a National Science Foundation project, “Identification of Products and Pathways in Organic Reactions,” on which Baldi is partnering with UCI Professor of Chemistry David VanVranken. The long-term goal, as outlined in the project summary, is to “develop a tool, powered by data and deep learning, that accurately and rapidly matches mass spectroscopy data to products of organic reactions based on step-wise elementary reaction steps.” This will help speed up searches involving complex reactants. “In preliminary work towards this goal, we created a training set of thousands of elementary reaction steps, and applied deep learning to first identify the most reactive pairs of electron source and electron sink atoms, and then rank bonding combinations of those atom pairs.” By the end of the three-year project, the researchers hope to assign plausible structures to products of radical reactions and complex polar reactions based on mass spectrometry data.

Finally, Baldi is also part of a collaboration with Matthew Lattal, professor of behavioral neuroscience in the School of Medicine at Oregon Health & Science University, and Marcelo Wood, professor of neurobiology and behavior in the School of Biological Sciences at UCI’s CNLM. With a grant from the National Institutes of Health, the team is working on a project called “Mechanisms of Maladaptive Memory Formation and Suppression in a Preclinical Model of the Comorbidity between PTSD and Addiction.” Treatment for both post-traumatic stress disorder and substance use disorders aims to weaken the ability of environmental cues to induce relapse. As noted in the project summary, “one way to do this is through extinction techniques, in which the relation between the cue and the drug, or the cue and the traumatic memory, is severed.” However, the benefits of extinction-based treatment often do not persist, and relapse can occur with time, changes in context, or exposure to stress. The team is trying to address this issue, focusing in particular on how learned fear and drug seeking interact at behavioral and molecular levels. “My role is to oversee all the bioinformatics and statistical analyses of the omic data produced by this project in order to identify relevant pathways and molecular mechanisms,” says Baldi. The focus on epigenetic mechanisms could help researchers better understand how trauma leads to persistent changes in behavior. This, in turn, could lead to the development of novel therapeutic approaches.

Shani Murray