RISK BOUNDS FOR MODEL SELECTION VIA PENALISATION
Authors: Andrew Barron (Yale University), Lucien Birgé
(Université Paris VI), Pascal Massart (Université Paris
Sud)
Keywords: Penalization, Model selection, Adaptive estimation,
Empirical processes, Sieves, Minimum contrast estimators.
MSC classification: primary 62G05, 62G07, secondary 41A25
Abstract:
Performance bounds for criteria for Model Selection are developed using recent
theory for sieves. The model selection criteria are based on an empirical
loss or contrast function with an added penalty term motivated by empirical
process theory and roughly proportional to the number of parameters needed
to describe the model divided by the number of observations. Most of our
examples involve regression or density estimation settings and we focus on
the problem of estimating the unknown density or regression function. We
show that the quadratic risk of the "penalized minimum contrast estimator"
is bounded by an index of the accuracy of the sieve. This accuracy index
quantifies the trade-off among the candidate models between the approximation
error and parameter dimension relative to sample size. If we choose a list
of models which exhibit good approximation properties with respect to different
classes of smoothness, the estimator can be simultaneously minimax rate optimal
in each of those classes. This is what is usually called adaptation.
The type of classes of smoothness in which one gets adaptation depends heavily
on the list of models. If too many models are involved in order to get accurate
approximation of many wide classes of functions simultaneously, it may happen
that the estimator is only approximately adaptive (typically up to a slowly
varying function of the sample size). We shall provide various illustrations
of our methods such as penalized maximum likelihood, projection or least
squares estimation. The models will involve commonly used finite dimensional
expansions such as piecewise polynomials with fixed or variable knots,
trigonometric polynomials, wavelets, neural nets and related nonlinear expansions
defined by superposition of ridge functions.
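
As a rough illustration of the criterion described in the abstract (an empirical contrast plus a penalty proportional to the number of parameters divided by the number of observations), the following minimal sketch selects the number of bins of a piecewise constant least squares regression fit on [0, 1). It is not the authors' procedure: the function names, the equal-width bins, and the inputs `penalty_constant` and `noise_variance` are hypothetical choices made only for this example.

```python
import random


def fit_piecewise_constant(xs, ys, d):
    """Least squares fit on piecewise constants over d equal bins of [0, 1)."""
    sums = [0.0] * d
    counts = [0] * d
    for x, y in zip(xs, ys):
        j = min(int(x * d), d - 1)  # index of the bin containing x
        sums[j] += y
        counts[j] += 1
    # The least squares solution is the bin-wise mean (0.0 for an empty bin).
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def empirical_contrast(xs, ys, means):
    """Average squared residual of the fitted piecewise constant function."""
    d = len(means)
    total = 0.0
    for x, y in zip(xs, ys):
        j = min(int(x * d), d - 1)
        total += (y - means[j]) ** 2
    return total / len(xs)


def select_model(xs, ys, dims, penalty_constant=2.0, noise_variance=1.0):
    """Return (d, means) minimising contrast + penalty over candidate dims.

    The penalty is penalty_constant * noise_variance * d / n, i.e.
    proportional to the parameter count divided by the sample size.
    Both constants are assumptions of this sketch, not values from the paper.
    """
    n = len(xs)
    best = None
    for d in dims:
        means = fit_piecewise_constant(xs, ys, d)
        crit = empirical_contrast(xs, ys, means) + penalty_constant * noise_variance * d / n
        if best is None or crit < best[0]:
            best = (crit, d, means)
    return best[1], best[2]


if __name__ == "__main__":
    # Simulated data: a step function observed with Gaussian noise.
    rng = random.Random(0)
    xs = [rng.random() for _ in range(500)]
    ys = [(1.0 if x >= 0.5 else 0.0) + rng.gauss(0.0, 0.3) for x in xs]
    d, _ = select_model(xs, ys, range(1, 31), noise_variance=0.3 ** 2)
    print("selected number of bins:", d)
```

Small models have a large approximation error and a small penalty, large models the reverse; the selected dimension is the one that balances the two on the observed sample.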
Article: PostScript file
Contact:
Pascal.Massart@math.u-psud.fr