Statistical Testing for high-dimensional Models: Leveraging data structure for higher efficiency and accuracy

Thursday, 14 December 2017, 14:00-15:00 - Bertrand Thirion - INRIA

Abstract: In many scientific applications, increasingly large datasets are being acquired to describe biological or physical phenomena more accurately. While the dimensionality of the resulting measurements has increased, the number of samples available is often limited by physical or financial constraints. The result is large amounts of complex data observed on only a small number of samples. The question that then arises is: which features in the data are really informative about some outcome of interest? Answering it amounts to inferring the relationship between each variable and the outcome, conditionally on all other variables. Statistical guarantees on these associations are needed in many fields of data science, where competing models require rigorous statistical assessment. Yet such guarantees are very hard to obtain.
In this presentation, we will first motivate the quest for inference models with examples from applied statistical problems. We will then review existing solutions, together with their strengths and weaknesses, and outline promising directions. Finally, we will discuss how to introduce structure in such models while retaining statistical guarantees.

Location: room 117/119, building 425
