BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//UC Irvine Donald Bren School of Information & Computer Sciences - ECPv6.3.4//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-WR-CALNAME:UC Irvine Donald Bren School of Information & Computer Sciences
X-ORIGINAL-URL:https://ics.uci.edu
X-WR-CALDESC:Events for UC Irvine Donald Bren School of Information & Computer Sciences
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:America/Los_Angeles
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:20250309T100000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:20251102T090000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20250930T160000
DTEND;TZID=America/Los_Angeles:20250930T170000
DTSTAMP:20260608T141341
CREATED:20250917T211201Z
LAST-MODIFIED:20250917T212126Z
UID:26053-1759248000-1759251600@ics.uci.edu
SUMMARY:Statistics in the Age of AI: Theory\, Methods\, and Data
DESCRIPTION:Abstract: Artificial Intelligence (AI) has surged in popularity\, creating both opportunities and challenges for statistics. In this talk\, I will present three recent directions from my lab that reflect our efforts to engage with the age of AI. First\, I will discuss theoretical results for decoder-based generative models\, providing statistical foundations that connect latent dimension\, approximation error\, and model complexity. Second\, I will discuss a method to use embeddings from large language models to enhance high-dimensional hypothesis testing\, a widely used statistical tool in scientific domains\, motivated by problems in cancer genomics where traditional methods are underpowered. I will also discuss extensions to genetic studies\, where we curated annotations for 8.9 billion genetic variants from the human genome\, and obtained embeddings of these 8.9B variants for downstream tasks such as GWAS and phenotype prediction. Finally\, I will switch to an infrastructural view\, introducing STimage-1K4M\, one of the first and largest publicly available spatial transcriptomics datasets curated by my group\, consisting of 1\,149 slides and more than 4 million pathology image–gene expression pairs across 10 species and 50 tissue types. This resource has been downloaded over 180\,000 times on HuggingFace and has facilitated the training of multimodal foundation models. Together\, these examples illustrate how theory\, methodology\, and data curation advance both statistics and AI.
URL:https://ics.uci.edu/event/statistics-in-the-age-of-ai-theory-methods-and-data/
LOCATION:Donald Bren Hall\, Irvine\, CA\, 92697\, United States
ATTACH;FMTTYPE=image/jpeg:https://ics.uci.edu/wp-content/uploads/2025/09/Didong-Li-2-resize.jpg
END:VEVENT
END:VCALENDAR