Histogram selection in non Gaussian regression

Marie Sauvé

ESAIM: Probability and Statistics (2009)

  • Volume: 13, page 70-86
  • ISSN: 1292-8100

Abstract

We deal with the problem of choosing a piecewise constant estimator of a regression function s mapping 𝒳 into ℝ. We consider a non Gaussian regression framework with deterministic design points, and we adopt the non asymptotic approach of model selection via penalization developed by Birgé and Massart. Given a collection of partitions of 𝒳, with possibly exponential complexity, and the corresponding collection of piecewise constant estimators, we propose a penalized least squares criterion which selects a partition whose associated estimator performs approximately as well as the best one, in the sense that its quadratic risk is close to the infimum of the risks. The risk bound we provide is non asymptotic.
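To make the selection principle described in the abstract concrete, here is a minimal Python sketch of choosing a histogram by a penalized least squares criterion. It is only an illustration under stated assumptions: the design lies in [0, 1), the candidate partitions are regular, and the penalty `c_pen * n_bins / n` is a placeholder. The paper's actual penalty and collection of partitions are not given in the abstract, and all function names below are hypothetical.

```python
import random


def histogram_fit(xs, ys, n_bins):
    """Least squares piecewise constant fit on a regular partition of [0, 1).

    Returns the fitted value at each design point (the mean of the
    responses falling in the same cell).
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    cells = [min(int(x * n_bins), n_bins - 1) for x in xs]
    for k, y in zip(cells, ys):
        sums[k] += y
        counts[k] += 1
    means = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    return [means[k] for k in cells]


def penalized_criterion(xs, ys, n_bins, c_pen):
    """Empirical quadratic loss plus an ASSUMED penalty c_pen * n_bins / n."""
    n = len(xs)
    fitted = histogram_fit(xs, ys, n_bins)
    loss = sum((y - f) ** 2 for y, f in zip(ys, fitted)) / n
    return loss + c_pen * n_bins / n


def select_partition(xs, ys, candidates, c_pen):
    """Return the candidate number of cells minimizing the criterion."""
    return min(candidates, key=lambda d: penalized_criterion(xs, ys, d, c_pen))


if __name__ == "__main__":
    random.seed(0)
    n = 200
    xs = [i / n for i in range(n)]  # deterministic design points
    # Step signal plus centered uniform (non Gaussian) noise.
    ys = [(1.0 if x >= 0.5 else 0.0) + random.uniform(-0.5, 0.5) for x in xs]
    best = select_partition(xs, ys, range(1, 21), c_pen=0.5)
    print("selected number of cells:", best)
```

The constant `c_pen` trades goodness of fit against the number of cells. In this sketch it is chosen by hand, whereas a risk bound of the kind the abstract announces would dictate how the penalty must scale.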

How to cite


Sauvé, Marie. "Histogram selection in non Gaussian regression." ESAIM: Probability and Statistics 13 (2009): 70-86. <http://eudml.org/doc/250623>.

@article{Sauvé2009,
abstract = { We deal with the problem of choosing a piecewise constant estimator of a regression function s mapping $\mathcal{X}$ into $\mathbb{R}$. We consider a non Gaussian regression framework with deterministic design points, and we adopt the non asymptotic approach of model selection via penalization developed by Birgé and Massart. Given a collection of partitions of $\mathcal{X}$, with possibly exponential complexity, and the corresponding collection of piecewise constant estimators, we propose a penalized least squares criterion which selects a partition whose associated estimator performs approximately as well as the best one, in the sense that its quadratic risk is close to the infimum of the risks. The risk bound we provide is non asymptotic. },
author = {Sauvé, Marie},
journal = {ESAIM: Probability and Statistics},
keywords = {CART; change-points detection; deviation inequalities; model selection; oracle inequalities; regression; change-point detection},
language = {eng},
month = {3},
pages = {70-86},
publisher = {EDP Sciences},
title = {Histogram selection in non Gaussian regression},
url = {http://eudml.org/doc/250623},
volume = {13},
year = {2009},
}

TY - JOUR
AU - Sauvé, Marie
TI - Histogram selection in non Gaussian regression
JO - ESAIM: Probability and Statistics
DA - 2009/3//
PB - EDP Sciences
VL - 13
SP - 70
EP - 86
AB - We deal with the problem of choosing a piecewise constant estimator of a regression function s mapping $\mathcal{X}$ into $\mathbb{R}$. We consider a non Gaussian regression framework with deterministic design points, and we adopt the non asymptotic approach of model selection via penalization developed by Birgé and Massart. Given a collection of partitions of $\mathcal{X}$, with possibly exponential complexity, and the corresponding collection of piecewise constant estimators, we propose a penalized least squares criterion which selects a partition whose associated estimator performs approximately as well as the best one, in the sense that its quadratic risk is close to the infimum of the risks. The risk bound we provide is non asymptotic.
LA - eng
KW - CART; change-points detection; deviation inequalities; model selection; oracle inequalities; regression; change-point detection
UR - http://eudml.org/doc/250623
ER -

References

  1. Y. Baraud, Model selection for regression on a fixed design. Probab. Theory Related Fields 117 (2000) 467–493.
  2. Y. Baraud, F. Comte and G. Viennet, Model Selection for (auto-)regression with dependent data. ESAIM: PS 5 (2001) 33–49.
  3. L. Birgé and P. Massart, Gaussian model selection. J. Eur. Math. Soc. 3 (2001) 203–268.
  4. L. Birgé and P. Massart, Minimal penalties for Gaussian model selection. To be published in Probab. Theory Related Fields (2005).
  5. L. Birgé and Y. Rozenholc, How many bins should be put in a regular histogram. ESAIM: PS 10 (2006) 24–45.
  6. O. Bousquet, Concentration Inequalities for Sub-Additive Functions Using the Entropy Method. Stochastic Inequalities and Applications 56 (2003) 213–247.
  7. L. Breiman, J. Friedman, R. Olshen and C. Stone, Classification And Regression Trees. Chapman & Hall (1984).
  8. G. Castellan, Modified Akaike's criterion for histogram density estimation. C.R. Acad. Sci. Paris Sér. I Math. 330 (2000) 729–732.
  9. O. Catoni, Universal aggregation rules with sharp oracle inequalities. Ann. Stat. (1999) 1–37.
  10. E. Lebarbier, Quelques approches pour la détection de ruptures à horizon fini. Ph.D. thesis, Université Paris XI Orsay (2002).
  11. G. Lugosi and A. Nobel, Consistency of data-driven histogram methods for density estimation and classification. Ann. Stat. 24 (1996) 687–706.
  12. C.L. Mallows, Some comments on $C_p$. Technometrics 15 (1973) 661–675.
  13. P. Massart, Notes de Saint-Flour. Lecture Notes to be published (2003).
  14. A. Nobel, Histogram regression estimation using data-dependent partitions. Ann. Stat. 24 (1996) 1084–1105.
  15. M. Sauvé, Sélection de modèles en régression non gaussienne. Applications à la sélection de variables et aux tests de survie accélérés. Ph.D. thesis, Université Paris XI Orsay (2006).
  16. M. Sauvé and C. Tuleau, Variable selection through CART. Research Report 5912, INRIA (2006).
