Comparing multivariate distributions: an optimal transport based approach
Seminar room no 24, 1st floor, Main Building, IISER Pune
Abstract
Quantile-Quantile (Q-Q) plots are widely used for assessing the distributional similarity between two univariate datasets. Q-Q plots in multivariate settings, however, fail to capture complex dependencies present in the data. In this work, we propose a novel approach for constructing multivariate Q-Q plots, which extends the traditional Q-Q plot methodology to high-dimensional data. Our approach uses optimal transport (OT) and entropy-regularized optimal transport (EOT) to align the empirical quantiles of the two datasets. Additionally, we introduce a second technique, based on OT and EOT potentials, for comparing two multivariate datasets. Through extensive simulations and real data examples, we demonstrate that the proposed approach captures multivariate dependencies and identifies distributional differences such as tail behaviour. We also propose two test statistics, based on the Q-Q and potential plots, to compare two distributions rigorously. This talk is based on joint work with Sibsankar Singha and Marie Kratz.
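To give a rough idea of what aligning two samples by optimal transport can look like, here is a minimal sketch. It is not the speakers' construction. It assumes two samples of equal size with uniform weights and a squared Euclidean cost, and it solves the resulting discrete OT problem as an assignment problem with SciPy. The function name `ot_match` and the simulated data are hypothetical, chosen only for illustration, and the entropy-regularized (EOT) variant is not covered. In one dimension this matching pairs the sorted values of the two samples, which is exactly what a classical Q-Q plot displays.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist


def ot_match(x, y):
    """Match two equal-size samples by discrete optimal transport.

    Assumption: uniform weights and squared Euclidean cost, so the OT
    problem reduces to a linear assignment problem.
    Returns (cols, mean_cost), where x[i] is matched to y[cols[i]].
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Treat 1-D input as n points in dimension 1.
    if x.ndim == 1:
        x = x[:, None]
    if y.ndim == 1:
        y = y[:, None]
    if x.shape != y.shape:
        raise ValueError("this sketch requires samples of identical shape")

    # Pairwise squared Euclidean distances between the two samples.
    cost = cdist(x, y, metric="sqeuclidean")
    # Optimal one-to-one matching minimizing the total cost.
    rows, cols = linear_sum_assignment(cost)
    return cols, cost[rows, cols].mean()


# Illustration on simulated data: Gaussian sample vs. a heavier-tailed one.
rng = np.random.default_rng(0)
x = rng.standard_normal((200, 2))
y = rng.standard_t(df=3, size=(200, 2))
cols, mean_cost = ot_match(x, y)
matched_y = y[cols]  # matched_y[i] is the point paired with x[i]
```

Plotting each coordinate of `x` against the same coordinate of `matched_y` gives one possible Q-Q style display, in which departures from the diagonal suggest distributional differences. How the matched pairs are actually visualized and turned into test statistics in the talk is not specified in the abstract.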