Arkajyoti (Arka) Saha

“I combine the power of statistical modeling with the scalability and flexibility of AI/ML, bringing together the best of both worlds.”

Integrating Statistics with AI
Professor Arka Saha’s research integrates the theory and practice of artificial intelligence (AI) and machine learning (ML) with statistics. “AI/ML methods often forego heavy model assumptions of classical statistics in favor of a model-free data-driven approach,” he says. “Though this lends scalability and flexibility to the AI/ML methods, they frequently neglect the data’s inherent structure.” Classical statistics has long been used to model these domain-specific structures, with scientists’ domain expertise as a foundation. “I combine the power of statistical modeling with the scalability and flexibility of AI/ML, bringing together the best of both worlds.”

Understanding Data Dependence
AI/ML models often ignore the dependence structure of the data, implicitly or explicitly assuming independence. However, this dependence is frequently crucial: failing to address it may result in poor performance, and the structure itself may be of scientific significance. Professor Saha turns such challenges into assets by explicitly modeling the dependence using statistical tools. “Using the knowledge that the observations or features are dependent, I ‘borrow strength’ across the rows or columns of the data, increasing the power and accuracy of the AI/ML approaches while also providing insight into the dependence structure itself.” He applies this concept to integrate spatial and temporal models into a machine learning framework, which is a key element of his methodological research.

Collaborating for Real-World Impact
Professor Saha focuses on collaborating with scientists to solve open scientific challenges by merging AI/ML approaches with domain expertise via statistics. “My research paradigm on data dependence is of fundamental interest in environmental science, biomedical sciences, oceanography, finance, data privacy, and algorithmic fairness,” he says. “I also work with earth system scientists to evaluate the level of carbon in oceans. This allows us to better understand, forecast, and address a critical component of global environmental change, which can aid in developing policies for a more sustainable future.”


Education
Ph.D., Biostatistics, Johns Hopkins University, 2021

Master of Statistics, Indian Statistical Institute, 2016

Bachelor of Statistics, Indian Statistical Institute, 2014

Research Areas

Biostatistics

The application of statistical methods to analyze and interpret data in the fields of biology …

Genomics

An interdisciplinary field focusing on the structure, function, evolution, mapping and editing of genomes

Sustainability and Computing

Developing innovative ways to use and develop computational technologies to address environmental and societal challenges …
