My research is in machine learning, with the following goals:
-
learning densities from data
-
generating samples from an unnormalized distribution and/or data
-
learning representations and causal structure from data
I am particularly interested in the computational and sample efficiency of algorithms that achieve these goals.
The data I use comes in different modalities, such as synthetic data, image data and molecular data. In particular, non-invasive brain recordings (M/EEG) were central to my PhD work and remain a data modality I work on.
Here are keywords that best describe my research areas:
- Energy-Based Models
- Diffusion and Flow Models
- Sampling Algorithms
- Score Matching
- Density Ratio Estimation
- Causal Discovery
- Representation Learning
- Brain Imaging
- Multi-View Independent Component Analysis
Service. I review submissions to the following machine learning conferences:
NeurIPS, ICML, ICLR and AISTATS.
I also occasionally review submissions to these journals of statistics or machine learning:
JMLR, TMRL, AISM, JUQ, and Mathematical Methods of Statistics.
I am grateful to have been recognized as a "top reviewer" (AISTATS 2022, NeurIPS 2022-23-24).
Generating Samples
Few-Step Boltzmann Generators via Scalable Likelihood Flow Maps
RuiKang OuYang, Hanlin Yu, Xinyue Ai, Yutong He, Nicholas M. Boffi, Pradeep Ravikumar, José Miguel Hernández-Lobato, Max Simchowitz, Benjamin Kurt Miller, Omar Chehab
Workshop on Structured Probabilistic Inference & Generative Modeling, International Conference on Machine Learning (ICML), 2026
Sampling from multi-modal distributions with polynomial query complexity in fixed dimension via reverse diffusion
Adrien Vacher, Omar Chehab, Anna Korba
Conference on Neural Information Processing Systems (NeurIPS), 2025
Time-reversed diffusions are state-of-the-art for sampling multi-modal distributions, but they rely on score estimates. We analyze how estimation errors affect the final samples.
Provable Convergence and Limitations of Geometric Tempering for Langevin Dynamics
Omar Chehab, Anna Korba, Austin Stromme, Adrien Vacher
International Conference on Learning Representations (ICLR), 2024
Annealed MCMC tries to approximate a prescribed path of distributions. We show that the popular geometric mean path with a Gaussian has unfavorable geometry. Presented at the Yale workshop on sampling.
A Practical Diffusion Path for Sampling
Omar Chehab, Anna Korba
Workshop on Structured Probabilistic Inference & Generative Modeling, International Conference on Machine Learning (ICML), 2024
Time-reversed diffusions are state-of-the-art in sampling but rely on score estimates. We aim to reduce their variance.
Learning densities
Learning Energy-Based Models from Stochastic Interpolants using Spatiotemporal Differences
Hanlin Yu, RuiKang OuYang, Partha Kaushik, Arto Klami, Michael U. Gutmann, Omar Chehab
arXiv, 2026
Conditional Noise-Contrastive Estimation of Energy-Based Models by Jumping Between Modes
Hanlin Yu, Michael U. Gutmann, Arto Klami, Omar Chehab
Workshop on Principles of Generative Modeling, EurIPS, 2025
We explore the design choices of a method called CNCE for learning energy-based models.
Density Ratio Estimation with Conditional Probability Paths
Hanlin Yu, Arto Klami, Aapo Hyvärinen, Anna Korba, Omar Chehab
International Conference on Machine Learning (ICML), 2025
A density ratio can be obtained by integrating the time score of a probability path. We present an efficient way to estimate the time score.
Optimizing the Noise in Self-Supervised Learning: from Importance Sampling to Noise-Contrastive Estimation
Omar Chehab, Alexandre Gramfort, Aapo Hyvärinen
arXiv, 2023
Provable benefits of annealing for estimating normalizing constants: Importance Sampling, Noise-Contrastive Estimation, and beyond
Omar Chehab, Aapo Hyvärinen, Andrej Risteski
Conference on Neural Information Processing Systems (NeurIPS), 2023 [Spotlight]
Annealed Importance Sampling uses a prescribed path of distributions to compute an estimate of a normalizing constant. We quantify how the choice of path impacts the estimation error.
The Optimal Noise in Noise-Contrastive Learning Is Not What You Think
Omar Chehab, Alexandre Gramfort, Aapo Hyvärinen
Conference on Uncertainty in Artificial Intelligence (UAI), 2022
NCE estimates the data density by minimizing a binary classification loss, between data and noise samples. We find the optimal noise distribution that minimizes the estimation error.
Learning Representations and Causal Structure
Multi-View Causal Discovery without Non-Gaussianity: Identifiability and Algorithms
Ambroise Heurtebise, Omar Chehab, Pierre Ablin, Alexandre Gramfort, Aapo Hyvärinen
International Conference on Machine Learning (ICML), 2026
Workshop on Causality for Impact, Practical challenges for real-world applications of causal methods, EurIPS 2025 [Oral]
We propose three algorithms with theoretical guarantees for learning causal relationships (Directed Acyclic Graph) between random variables, given correlated measurements. We apply our algorithms to brain recordings (MEG and fMRI).
MVICAD2: Multi-View Independent Component Analysis with Delays and Dilations
Ambroise Heurtebise, Omar Chehab, Pierre Ablin, Alexandre Gramfort
IEEE Transactions on Biomedical Engineering, 2025
Independent Component Analysis (ICA) is a popular algorithm for learning a representation of data. We propose a version that handles data collected from different contexts, and whose representations differ only by temporal delays or dilations.
Deep Recurrent Encoder: an end-to-end network to model magnetoencephalography at scale
Omar Chehab*, Alexandre Defossez*, Jean-Christophe Loiseau, Alexandre Gramfort, Jean-Remi King
Journal of Neurons, Behavior, Data analysis, and Theory, 2022
We compare different models for predicting the brain’s response to external stimuli. Our model, based on a deep neural network, is more accurate and interpretable.
Learning with self-supervision on EEG data
Alexandre Gramfort, Hubert Banville, Omar Chehab, Aapo Hyvärinen, Denis Engemann
IEEE workshop on Brain-Computer Interface, 2021
We learn rich representations of EEG brain activity using a self-supervised loss.
Uncovering the structure of clinical EEG signals with self-supervised learning
Hubert Banville, Omar Chehab, Aapo Hyvärinen, Denis Engemann, Alexandre Gramfort
Journal of Neural Engineering, 2021
We learn rich representations of EEG brain activity using a self-supervised loss.
A mean-field approach to the dynamics of networks of complex neurons, from nonlinear Integrate-and-Fire to Hodgkin–Huxley models
Mallory Carlu, Omar Chehab, Leonardo Dalla Porta, Damien Depannemaecker, Charlotte Héricé, Maciej Jedynak, Elif Köksal Ersöz, Paulo Muratore, Selma Souihe, Cristiano Capone, Yann Zerlaut, Alain Destexhe, Matteo di Volo
Journal of Neurophysiology, 2020
Our theory predicts the average behavior of neuronal populations that fire asynchronously.
Miscellaneous
Nano World Models: A Minimalist Implementation of Future Video Prediction
Siqiao Huang, Partha Kaushik, Michael Chen, Hengkai Pan, Omar Chehab, Fernando Moreno-Pino, Max Simchowitz
arXiv, 2026
A minimalist codebase for future video prediction centered around diffusion forcing.