This session has been postponed to November 7

Thursday, October 24, 14:00-15:00

Abstract: Stéphane Gaïffas
(joint work with Jaouad Mourtada and Erwan Scornet)
We introduce a new procedure called SMP (Sample Minimax Predictor) for predictive conditional density estimation, which satisfies a general excess risk bound under logarithmic loss. This bound remains valid in the misspecified case, and scales as d / n in several cases, where d is the model dimension and n the sample size.
In particular, and unlike the maximum likelihood estimator, this procedure does not degrade significantly under model misspecification.
We deduce a minimax procedure for misspecified density estimation in logistic regression, with a sharp excess risk of d / n + o(1/n), addressing an open problem posed by Kotłowski and Grünwald (2011).
For logistic regression, the predictions of SMP come at the cost of two logistic regressions, hence are easier to compute than the approaches based on Bayesian predictive posteriors, which require posterior sampling instead of optimization.
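The "two logistic regressions" cost can be illustrated with a minimal sketch. The abstract does not spell out the construction, so the following rests on an assumption: for a test point x, the model is refit once per candidate label on the sample augmented with (x, label), and the two resulting likelihoods at x are normalized. The names fit_logistic and smp_predict, the ridge parameter lam, and the plain gradient-descent solver are hypothetical choices made for illustration, not the authors' implementation.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def fit_logistic(X, y, lam=0.1, steps=2000, lr=0.5):
    """Ridge-penalized logistic regression fitted by gradient descent.

    Minimizes (1/n) * sum_i log(1 + exp(-y_i * <theta, x_i>))
              + (lam / 2) * ||theta||^2,
    with labels y_i in {-1, +1}. The step size lr is an assumption
    suited to small, low-scale features such as those in the example.
    """
    n, d = len(X), len(X[0])
    theta = [0.0] * d
    for _ in range(steps):
        grad = [lam * t for t in theta]  # gradient of the ridge penalty
        for xi, yi in zip(X, y):
            # Derivative of the logistic loss with respect to the margin.
            w = -yi * sigmoid(-yi * dot(theta, xi)) / n
            for j in range(d):
                grad[j] += w * xi[j]
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta


def smp_predict(X, y, x_new, lam=0.1):
    """Estimated probability that the label of x_new is +1.

    Assumed construction: one refit per candidate label on the sample
    augmented with (x_new, label), followed by normalization.
    """
    scores = {}
    for label in (-1, 1):
        theta = fit_logistic(X + [x_new], y + [label], lam)
        scores[label] = sigmoid(label * dot(theta, x_new))
    return scores[1] / (scores[1] + scores[-1])


if __name__ == "__main__":
    # Toy data: first coordinate is an intercept term.
    X = [[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]]
    y = [-1, -1, 1, 1]
    print(smp_predict(X, y, [1.0, 1.5]))
```

Under this assumed construction, each prediction requires only two convex optimizations and no posterior sampling. The output is generally not a member of the logistic model itself, which is what makes the estimator improper.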
From a theoretical point of view, SMP bypasses existing lower bounds for proper estimators, which return a conditional distribution that belongs to the logistic model. Results from Hazan et al. (2014) (see also Bach and Moulines, 2013) imply that the excess risk rate of such procedures is either slow, O(1 / \sqrt{n}), or exhibits an exponential dependence on the scale of the covariates for some worst-case distributions. It was shown recently by Foster et al. (2018) that one can achieve a fast rate O(d \log n / n) using a mixture of Bayesian predictive posteriors. A Ridge-regularized variant of SMP also satisfies a fast rate, and therefore provides a computationally appealing alternative to the approach of Foster et al. (2018).

Location: 3L15 - IMO

Last-minute notes: Please note that this session has been postponed to November 7
