Resampling and Model Selection
I completed my Ph.D. in Mathematics at University Paris-Sud (Orsay). My advisor was Pascal Massart.
My Ph.D. thesis was awarded the 2011 Marie-Jeanne Laurent-Duhamel prize by the French Statistical Society (SFdS).
The Ph.D. manuscript, the defense slides, and an abstract are also available at TEL.
- Final version of the manuscript: [pdf] Note that it is written in English, except for the first chapter (Chapter 2 is a shorter introduction, in English).
- Extended table of contents of the manuscript [pdf]
- Slides of the Ph.D. defense: [pdf]
This thesis lies within the fields of non-parametric statistics and statistical learning theory. Its goal is to provide an accurate understanding of several resampling and model selection methods from the non-asymptotic viewpoint.
The main advance of this thesis is the accurate calibration of model selection procedures, in order to make them optimal in practice for prediction. We study V-fold cross-validation (very commonly used, but poorly understood theoretically, in particular regarding the choice of V) and several penalization procedures. We propose methods for accurately calibrating penalties, both their general shape and their multiplicative constants. The use of resampling makes it possible to solve hard problems, in particular regression with a variable noise level. We prove non-asymptotic theoretical results on these methods, such as oracle inequalities and adaptivity properties. These results rely in particular on concentration inequalities.
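To fix ideas, here is a minimal sketch (not the thesis's actual procedures) of V-fold cross-validation used to select between two candidate regression models on synthetic heteroscedastic data, where the noise level varies with x; all function names and the toy data are illustrative assumptions.

```python
import random

def vfold_cv_risk(xs, ys, fit, V=5, seed=0):
    """Estimate the prediction risk of the procedure `fit` by V-fold
    cross-validation: `fit(train_x, train_y)` must return a predictor
    f with f(x) -> prediction; we average held-out squared errors."""
    n = len(xs)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[v::V] for v in range(V)]  # V roughly equal folds
    risk = 0.0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        f = fit([xs[i] for i in train], [ys[i] for i in train])
        risk += sum((f(xs[i]) - ys[i]) ** 2 for i in held_out) / len(held_out)
    return risk / V

def fit_constant(xs, ys):
    """Best constant predictor (empirical mean)."""
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_linear(xs, ys):
    """Ordinary least squares for y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    b = my - a * mx
    return lambda x: a * x + b

# Synthetic heteroscedastic regression data: noise level grows with x.
rng = random.Random(42)
xs = [i / 50 for i in range(100)]
ys = [2 * x + 1 + rng.gauss(0, 0.1 + 0.2 * x) for x in xs]

risks = {name: vfold_cv_risk(xs, ys, fit, V=5)
         for name, fit in [("constant", fit_constant), ("linear", fit_linear)]}
best = min(risks, key=risks.get)
```

On this data the CV risk of the linear model is far below that of the constant model, so cross-validation selects it; the thesis studies, among other things, how the choice of V affects such estimates non-asymptotically.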
We also consider the problem of confidence regions and multiple testing when the data are high-dimensional, with general and unknown correlations. Using resampling methods, we can circumvent the curse of dimensionality and "learn" these correlations. We mainly propose two procedures, and prove for both a non-asymptotic control of their level.
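As a toy illustration of the resampling idea (a sketch, not the thesis's actual procedures), one can estimate the quantile of the sup-norm deviation of a high-dimensional empirical mean by symmetrizing the centered data with random signs, which implicitly captures the unknown correlations between coordinates; the threshold then defines a simultaneous confidence region. The function name and the synthetic correlated data below are illustrative assumptions.

```python
import random

def sup_norm_quantile_by_resampling(data, level=0.95, B=1000, seed=0):
    """Estimate the `level`-quantile of max_k |empirical mean_k - mu_k|
    by resampling: flip the signs of the centered observations
    (Rademacher weights) and record the sup-norm of each resampled mean."""
    rng = random.Random(seed)
    n, K = len(data), len(data[0])
    means = [sum(row[k] for row in data) / n for k in range(K)]
    stats = []
    for _ in range(B):
        eps = [rng.choice((-1, 1)) for _ in range(n)]
        resampled = [sum(eps[i] * (data[i][k] - means[k]) for i in range(n)) / n
                     for k in range(K)]
        stats.append(max(abs(t) for t in resampled))
    stats.sort()
    return stats[min(B - 1, int(level * B))]

# n = 50 observations of a K = 20 dimensional vector with a strong
# common factor, so the coordinates are heavily correlated.
rng = random.Random(1)
n, K = 50, 20
data = []
for _ in range(n):
    common = rng.gauss(0, 1)
    data.append([0.7 * common + 0.3 * rng.gauss(0, 1) for _ in range(K)])

t = sup_norm_quantile_by_resampling(data)
# Simultaneous confidence region: {mu : max_k |mean_k - mu_k| <= t}
```

Because the sign-flipped sums see the same joint fluctuations as the original data, the threshold t adapts to the correlation structure instead of paying the worst-case (independent-coordinates) price, which is the point of the non-asymptotic level guarantees proved in the thesis.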
Keywords: non-parametric statistics ; statistical learning ; resampling ; non-asymptotic ; V-fold cross-validation ; bootstrap ; model selection ; penalization ; nonparametric regression ; adaptivity ; heteroscedastic ; confidence regions ; multiple testing
MSC classification: 62G09 ; 62M20 ; 62G08 ; 62J02 ; 62G15 ; 62G10
Ph.D. defense board of examiners
M. Patrice BERTAIL ; CREST and University Paris-X (Examiner)
M. Philippe BERTHET ; University Rennes-I (Examiner)
M. Gilles BLANCHARD ; Fraunhofer FIRST, Berlin (Examiner)
M. Stéphane BOUCHERON ; University Paris-VII (President)
M. Olivier CATONI ; CNRS and University Paris-VI (Examiner)
M. Pascal MASSART ; University Paris-Sud XI (Advisor)
M. Peter L. BARTLETT ; University of California, Berkeley
M. Yuhong YANG ; University of Minnesota