Recursive bias estimation for multivariate regression smoothers

Pierre-André Cornillon; N. W. Hengartner; E. Matzner-Løber

ESAIM: Probability and Statistics (2014)

  • Volume: 18, pages 483-502
  • ISSN: 1292-8100

Abstract

This paper presents a practical and simple fully nonparametric multivariate smoothing procedure that adapts to the underlying smoothness of the true regression function. Our estimator is easily computed by successive application of existing base smoothers (without the need to select an optimal smoothing parameter), such as thin-plate spline or kernel smoothers. The resulting smoother has better out-of-sample predictive capabilities than the underlying base smoother, or than competing structurally constrained models (MARS, GAM), for small dimension (3 ≤ d ≤ 7) and moderate sample size n ≤ 1000. Moreover, our estimator remains useful when d > 10, and to our knowledge no other adaptive fully nonparametric regression estimator is available without a structural assumption such as additivity. On a real example, the Boston Housing data, our method reduces the out-of-sample prediction error by 20%. An R package, ibr, available on CRAN, implements the proposed multivariate nonparametric method in R.
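
The recursion at the heart of the method can be sketched in a few lines of R. The sketch below is illustrative only, not the authors' implementation: it is univariate for brevity, assumes a linear base smoother with an explicit hat matrix S (here a deliberately oversmoothing Nadaraya-Watson kernel on simulated data), and uses a hand-picked number of iterations, whereas the paper and the ibr package choose the stopping point with data-driven rules. Each step applies the base smoother to the current residuals to estimate the remaining bias and adds that estimate back: f_{k+1} = f_k + S(y - f_k).

# Illustrative sketch of iterative bias reduction (not the ibr package code).
set.seed(1)
n <- 200
x <- runif(n)
y <- sin(2 * pi * x) + rnorm(n, sd = 0.3)

# Hat matrix S of a Nadaraya-Watson base smoother with a deliberately large
# (oversmoothing) bandwidth: heavily biased but low-variance.
h <- 0.3
K <- exp(-outer(x, x, "-")^2 / (2 * h^2))
S <- K / rowSums(K)

# Recursive bias correction: smooth the residuals and add the estimated
# bias back, i.e. f_{k+1} = f_k + S (y - f_k).
fhat <- S %*% y
for (k in 1:20) {                      # 20 iterations is hand-picked here;
  fhat <- fhat + S %*% (y - fhat)      # ibr selects the stopping point
}

Because the base smoother oversmooths, each iteration trades bias for variance, so the iteration count plays the role of the smoothing parameter; choosing it automatically is the job of the stopping rules implemented in the CRAN package ibr.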

How to cite


Cornillon, Pierre-André, Hengartner, N. W., and Matzner-Løber, E. "Recursive bias estimation for multivariate regression smoothers." ESAIM: Probability and Statistics 18 (2014): 483-502. <http://eudml.org/doc/274395>.

@article{Cornillon2014,
abstract = {This paper presents a practical and simple fully nonparametric multivariate smoothing procedure that adapts to the underlying smoothness of the true regression function. Our estimator is easily computed by successive application of existing base smoothers (without the need to select an optimal smoothing parameter), such as thin-plate spline or kernel smoothers. The resulting smoother has better out-of-sample predictive capabilities than the underlying base smoother, or than competing structurally constrained models (MARS, GAM), for small dimension (3 ≤ d ≤ 7) and moderate sample size n ≤ 1000. Moreover, our estimator remains useful when d > 10, and to our knowledge no other adaptive fully nonparametric regression estimator is available without a structural assumption such as additivity. On a real example, the Boston Housing data, our method reduces the out-of-sample prediction error by 20%. An R package, ibr, available on CRAN, implements the proposed multivariate nonparametric method in R.},
author = {Cornillon, Pierre-André and Hengartner, N. W. and Matzner-Løber, E.},
journal = {ESAIM: Probability and Statistics},
keywords = {nonparametric regression; smoother; kernel; thin-plate splines; stopping rules},
language = {eng},
pages = {483-502},
publisher = {EDP-Sciences},
title = {Recursive bias estimation for multivariate regression smoothers},
url = {http://eudml.org/doc/274395},
volume = {18},
year = {2014},
}

TY - JOUR
AU - Cornillon, Pierre-André
AU - Hengartner, N. W.
AU - Matzner-Løber, E.
TI - Recursive bias estimation for multivariate regression smoothers
JO - ESAIM: Probability and Statistics
PY - 2014
PB - EDP-Sciences
VL - 18
SP - 483
EP - 502
AB - This paper presents a practical and simple fully nonparametric multivariate smoothing procedure that adapts to the underlying smoothness of the true regression function. Our estimator is easily computed by successive application of existing base smoothers (without the need to select an optimal smoothing parameter), such as thin-plate spline or kernel smoothers. The resulting smoother has better out-of-sample predictive capabilities than the underlying base smoother, or than competing structurally constrained models (MARS, GAM), for small dimension (3 ≤ d ≤ 7) and moderate sample size n ≤ 1000. Moreover, our estimator remains useful when d > 10, and to our knowledge no other adaptive fully nonparametric regression estimator is available without a structural assumption such as additivity. On a real example, the Boston Housing data, our method reduces the out-of-sample prediction error by 20%. An R package, ibr, available on CRAN, implements the proposed multivariate nonparametric method in R.
LA - eng
KW - nonparametric regression; smoother; kernel; thin-plate splines; stopping rules
UR - http://eudml.org/doc/274395
ER -

References

  [1] B. Abdous, Computationally efficient classes of higher-order kernel functions. Can. J. Statist. 23 (1995) 21–27. Zbl0819.62031 MR1340959
  [2] L. Breiman, Using adaptive bagging to debias regressions. Technical Report 547, Dpt of Statist., UC Berkeley (1999). Zbl1052.68109
  [3] L. Breiman and J. Friedman, Estimating optimal transformations for multiple regression and correlation. J. Amer. Stat. Assoc. 80 (1985) 580–598. Zbl0594.62044 MR803258
  [4] P. Bühlmann and B. Yu, Boosting with the L2 loss: Regression and classification. J. Amer. Stat. Assoc. 98 (2003) 324–339. Zbl1041.62029 MR1995709
  [5] P.-A. Cornillon, N. Hengartner and E. Matzner-Løber, Recursive bias estimation and L2 boosting. Technical report, arXiv:0801.4629 (2008).
  [6] P.-A. Cornillon, N. Hengartner and E. Matzner-Løber, ibr: Iterative Bias Reduction. CRAN (2010). http://cran.r-project.org/web/packages/ibr/index.html.
  [7] P.-A. Cornillon, N. Hengartner, N. Jégou and E. Matzner-Løber, Iterative bias reduction: a comparative study. Statist. Comput. (2012). Zbl1322.62131
  [8] P. Craven and G. Wahba, Smoothing noisy data with spline functions. Numer. Math. 31 (1979) 377–403. Zbl0377.65007 MR516581
  [9] M. Di Marzio and C. Taylor, On boosting kernel regression. J. Statist. Plan. Infer. 138 (2008) 2483–2498. Zbl1182.62091 MR2432380
  [10] R. Eubank, Nonparametric regression and spline smoothing. Dekker, 2nd edition (1999). Zbl0936.62044 MR1680784
  [11] W. Feller, An introduction to probability theory and its applications, vol. 2. Wiley (1966). Zbl0039.13201 MR210154
  [12] J. Friedman, Multivariate adaptive regression splines. Ann. Statist. 19 (1991) 337–407. Zbl0765.62064 MR1091842
  [13] J. Friedman, Greedy function approximation: A gradient boosting machine. Ann. Statist. 29 (2001) 1189–1232. Zbl1043.62034 MR1873328
  [14] J. Friedman and W. Stuetzle, Projection pursuit regression. J. Amer. Statist. Assoc. 76 (1981) 817–823. MR650892
  [15] J. Friedman, T. Hastie and R. Tibshirani, Additive logistic regression: a statistical view of boosting. Ann. Statist. 28 (2000) 337–407. Zbl1106.62323 MR1790002
  [16] C. Gu, Smoothing spline ANOVA models. Springer (2002). Zbl1269.62040 MR1876599
  [17] L. Györfi, M. Kohler, A. Krzyżak and H. Walk, A Distribution-Free Theory of Nonparametric Regression. Springer Verlag (2002). Zbl1021.62024 MR1920390
  [18] D. Harrison and D. Rubinfeld, Hedonic prices and the demand for clean air. J. Environ. Econ. Manag. (1978) 81–102. Zbl0375.90023
  [19] T. Hastie and R. Tibshirani, Generalized Additive Models. Chapman & Hall (1995). Zbl0747.62061 MR1082147
  [20] R.A. Horn and C.R. Johnson, Matrix analysis. Cambridge (1985). Zbl1267.15001 MR832183
  [21] C. Hurvich, J. Simonoff and C.L. Tsai, Smoothing parameter selection in nonparametric regression using an improved Akaike information criterion. J. Roy. Stat. Soc. B 60 (1998) 271–294. Zbl0909.62039 MR1616041
  [22] O. Lepski, Asymptotically minimax adaptive estimation. I: Upper bounds. Optimally adaptive estimates. Theory Probab. Appl. 37 (1991) 682–697. Zbl0776.62039 MR1147167
  [23] K.-C. Li, Asymptotic optimality for Cp, CL, cross-validation and generalized cross-validation: Discrete index set. Ann. Statist. 15 (1987) 958–975. Zbl0653.62037 MR902239
  [24] G. Ridgeway, Additive logistic regression: a statistical view of boosting: Discussion. Ann. Statist. 28 (2000) 393–400. Zbl1106.62323 MR1790002
  [25] L. Schwartz, Analyse IV: applications à la théorie de la mesure. Hermann (1993). Zbl0920.00003
  [26] W. Stuetzle and Y. Mittal, Some comments on the asymptotic behavior of robust smoothers, in Smoothing Techniques for Curve Estimation, edited by T. Gasser and M. Rosenblatt. Springer-Verlag (1979) 191–195. Zbl0421.62022 MR564259
  [27] J. Tukey, Exploratory Data Analysis. Addison-Wesley (1977). Zbl0409.62003
  [28] F. Utreras, Convergence rates for multivariate smoothing spline functions. J. Approx. Theory (1988) 1–27. Zbl0646.41006 MR922591
  [29] J. Wendelberger, Smoothing Noisy Data with Multivariate Splines and Generalized Cross-Validation. PhD thesis, University of Wisconsin (1982). MR2632494
  [30] S. Wood, Thin plate regression splines. J. R. Statist. Soc. B 65 (2003) 95–114. Zbl1063.62059 MR1959095
  [31] Y. Yang, Combining different procedures for adaptive regression. J. Mult. Analysis 74 (2000) 135–161. Zbl0964.62032 MR1790617
