Adaptive Dantzig density estimation

K. Bertin; E. Le Pennec; V. Rivoirard

Annales de l'I.H.P. Probabilités et statistiques (2011)

  • Volume: 47, Issue: 1, pages 43-74
  • ISSN: 0246-0203

Abstract

The aim of this paper is to build an estimate of an unknown density as a linear combination of functions of a dictionary. Inspired by Candès and Tao’s approach, we propose a minimization of the ℓ1-norm of the coefficients in the linear combination under an adaptive Dantzig constraint coming from sharp concentration inequalities. This allows us to consider a wide class of dictionaries. Under local or global structure assumptions, oracle inequalities are derived. These theoretical results are transposed to the adaptive Lasso estimate naturally associated with our Dantzig procedure. Then, the issue of calibrating these procedures is studied from both theoretical and practical points of view. Finally, a numerical study shows the significant improvement obtained by our procedures when compared with other classical procedures.
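
Concretely, the adaptive Dantzig estimate has the form f̂ = Σ_j α̂_j φ_j for a dictionary (φ_1, …, φ_p). Schematically (the exact formulation, and in particular the data-driven thresholds, is given in the paper), the coefficient vector solves

    minimize ‖α‖_1   subject to   |(Gα)_j − β̂_j| ≤ η_j   for every j,

where β̂_j = n⁻¹ Σ_{i=1}^n φ_j(X_i) are the empirical dictionary coefficients, G is the Gram matrix with entries G_{jk} = ∫ φ_j φ_k, and the thresholds η_j come from the concentration inequalities mentioned above. This is a linear program. The Python sketch below is a minimal illustration of this formulation, not the authors' code: the user-supplied threshold eta stands in for the paper's adaptive, coordinate-wise thresholds.

    import numpy as np
    from scipy.optimize import linprog

    def dantzig_coefficients(Phi, G, eta):
        """Minimize ||alpha||_1 subject to ||G @ alpha - beta_hat||_inf <= eta.

        Phi : (n, p) array with Phi[i, j] = phi_j(X_i), the dictionary
              evaluated at the observations.
        G   : (p, p) Gram matrix, G[j, k] = integral of phi_j * phi_k.
        eta : scalar or (p,) vector of constraint thresholds (a stand-in
              for the paper's data-driven thresholds).
        """
        n, p = Phi.shape
        beta_hat = Phi.mean(axis=0)                # empirical coefficients
        eta = np.broadcast_to(np.asarray(eta, dtype=float), (p,))

        # Variables z = (alpha, u) with |alpha_j| <= u_j; minimize sum(u).
        c = np.concatenate([np.zeros(p), np.ones(p)])
        I = np.eye(p)
        A_ub = np.vstack([
            np.hstack([ I, -I]),                   #  alpha - u <= 0
            np.hstack([-I, -I]),                   # -alpha - u <= 0
            np.hstack([ G, np.zeros((p, p))]),     #  G alpha <= beta_hat + eta
            np.hstack([-G, np.zeros((p, p))]),     # -G alpha <= eta - beta_hat
        ])
        b_ub = np.concatenate([np.zeros(2 * p), beta_hat + eta, eta - beta_hat])
        bounds = [(None, None)] * p + [(0, None)] * p
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
        return res.x[:p]                           # alpha_hat

When the dictionary is orthonormal (G equal to the identity), the program decouples coordinate by coordinate and α̂_j is the soft thresholding of β̂_j at level η_j, recovering the classical wavelet-thresholding picture of Donoho and Johnstone [16].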

How to cite

Bertin, K., Le Pennec, E., and Rivoirard, V. "Adaptive Dantzig density estimation." Annales de l'I.H.P. Probabilités et statistiques 47.1 (2011): 43-74. <http://eudml.org/doc/240384>.

@article{Bertin2011,
abstract = {The aim of this paper is to build an estimate of an unknown density as a linear combination of functions of a dictionary. Inspired by Candès and Tao’s approach, we propose a minimization of the ℓ1-norm of the coefficients in the linear combination under an adaptive Dantzig constraint coming from sharp concentration inequalities. This allows us to consider a wide class of dictionaries. Under local or global structure assumptions, oracle inequalities are derived. These theoretical results are transposed to the adaptive Lasso estimate naturally associated with our Dantzig procedure. Then, the issue of calibrating these procedures is studied from both theoretical and practical points of view. Finally, a numerical study shows the significant improvement obtained by our procedures when compared with other classical procedures.},
author = {Bertin, K. and Le Pennec, E. and Rivoirard, V.},
journal = {Annales de l'I.H.P. Probabilités et statistiques},
keywords = {calibration; concentration inequalities; Dantzig estimate; density estimation; dictionary; Lasso estimate; oracle inequalities; sparsity},
language = {eng},
number = {1},
pages = {43-74},
publisher = {Gauthier-Villars},
title = {Adaptive Dantzig density estimation},
url = {http://eudml.org/doc/240384},
volume = {47},
year = {2011},
}

TY - JOUR
AU - Bertin, K.
AU - Le Pennec, E.
AU - Rivoirard, V.
TI - Adaptive Dantzig density estimation
JO - Annales de l'I.H.P. Probabilités et statistiques
PY - 2011
PB - Gauthier-Villars
VL - 47
IS - 1
SP - 43
EP - 74
AB - The aim of this paper is to build an estimate of an unknown density as a linear combination of functions of a dictionary. Inspired by Candès and Tao’s approach, we propose a minimization of the ℓ1-norm of the coefficients in the linear combination under an adaptive Dantzig constraint coming from sharp concentration inequalities. This allows us to consider a wide class of dictionaries. Under local or global structure assumptions, oracle inequalities are derived. These theoretical results are transposed to the adaptive Lasso estimate naturally associated with our Dantzig procedure. Then, the issue of calibrating these procedures is studied from both theoretical and practical points of view. Finally, a numerical study shows the significant improvement obtained by our procedures when compared with other classical procedures.
LA - eng
KW - calibration; concentration inequalities; Dantzig estimate; density estimation; dictionary; Lasso estimate; oracle inequalities; sparsity
UR - http://eudml.org/doc/240384
ER -

References

  [1] S. Arlot and P. Massart. Data-driven calibration of penalties for least-squares regression. J. Mach. Learn. Res. 10 (2009) 245–279.
  [2] M. S. Asif and J. Romberg. Dantzig selector homotopy with dynamic measurements. In Proceedings of SPIE Computational Imaging VII 7246 (2009) 72460E.
  [3] P. Bickel, Y. Ritov and A. Tsybakov. Simultaneous analysis of Lasso and Dantzig selector. Ann. Statist. 37 (2009) 1705–1732. Zbl 1173.62022, MR 2533469.
  [4] L. Birgé. Model selection for density estimation with L2-loss, 2008. Available at arXiv:0808.1416.
  [5] L. Birgé and P. Massart. Minimal penalties for Gaussian model selection. Probab. Theory Related Fields 138 (2007) 33–73. Zbl 1112.62082, MR 2288064.
  [6] F. Bunea, A. Tsybakov and M. Wegkamp. Aggregation and sparsity via ℓ1 penalized least squares. In Learning Theory 379–391. Lecture Notes in Comput. Sci. 4005. Springer, Berlin, 2006. Zbl 1143.62319, MR 2280619.
  [7] F. Bunea, A. Tsybakov and M. Wegkamp. Sparse density estimation with ℓ1 penalties. In Learning Theory 530–543. Lecture Notes in Comput. Sci. 4539. Springer, Berlin, 2007. Zbl 1203.62053.
  [8] F. Bunea, A. Tsybakov and M. Wegkamp. Sparsity oracle inequalities for the Lasso. Electron. J. Stat. 1 (2007) 169–194. Zbl 1146.62028, MR 2312149.
  [9] F. Bunea, A. Tsybakov and M. Wegkamp. Aggregation for Gaussian regression. Ann. Statist. 35 (2007) 1674–1697. Zbl 1209.62065, MR 2351101.
  [10] F. Bunea, A. Tsybakov, M. Wegkamp and A. Barbu. Spades and mixture models. Ann. Statist. (2010). To appear. Available at arXiv:0901.2044. Zbl 1198.62025, MR 2676897.
  [11] F. Bunea. Consistent selection via the Lasso for high dimensional approximating regression models. In Pushing the Limits of Contemporary Statistics: Contributions in Honor of J. K. Ghosh 122–137. Inst. Math. Stat. Collect. 3. IMS, Beachwood, OH, 2008. MR 2459221.
  [12] E. Candès and Y. Plan. Near-ideal model selection by ℓ1 minimization. Ann. Statist. 37 (2009) 2145–2177. Zbl 1173.62053, MR 2543688.
  [13] E. Candès and T. Tao. The Dantzig selector: Statistical estimation when p is much larger than n. Ann. Statist. 35 (2007) 2313–2351. Zbl 1139.62019, MR 2382644.
  [14] S. Chen, D. Donoho and M. Saunders. Atomic decomposition by basis pursuit. SIAM Rev. 43 (2001) 129–159. Zbl 0979.94010, MR 1854649.
  [15] D. Donoho, M. Elad and V. Temlyakov. Stable recovery of sparse overcomplete representations in the presence of noise. IEEE Trans. Inform. Theory 52 (2006) 6–18. Zbl 1288.94017, MR 2237332.
  [16] D. Donoho and I. Johnstone. Ideal spatial adaptation by wavelet shrinkage. Biometrika 81 (1994) 425–455. Zbl 0815.62019, MR 1311089.
  [17] B. Efron, T. Hastie, I. Johnstone and R. Tibshirani. Least angle regression. Ann. Statist. 32 (2004) 407–499. Zbl 1091.62054, MR 2060166.
  [18] A. Juditsky and S. Lambert-Lacroix. On minimax density estimation on ℝ. Bernoulli 10 (2004) 187–220. Zbl 1076.62037, MR 2046772.
  [19] K. Knight and W. Fu. Asymptotics for Lasso-type estimators. Ann. Statist. 28 (2000) 1356–1378. Zbl 1105.62357, MR 1805787.
  [20] K. Lounici. Sup-norm convergence rate and sign concentration property of Lasso and Dantzig estimators. Electron. J. Stat. 2 (2008) 90–102. Zbl 1306.62155, MR 2386087.
  [21] P. Massart. Concentration Inequalities and Model Selection. Lecture Notes in Math. 1896. Springer, Berlin, 2007. Lectures from the 33rd Summer School on Probability Theory held in Saint-Flour, July 6–23, 2003. Zbl 1170.60006, MR 2319879.
  [22] N. Meinshausen and P. Bühlmann. High-dimensional graphs and variable selection with the Lasso. Ann. Statist. 34 (2006) 1436–1462. Zbl 1113.62082, MR 2278363.
  [23] N. Meinshausen and B. Yu. Lasso-type recovery of sparse representations for high-dimensional data. Ann. Statist. 37 (2009) 246–270. Zbl 1155.62050, MR 2488351.
  [24] M. Osborne, B. Presnell and B. Turlach. On the Lasso and its dual. J. Comput. Graph. Statist. 9 (2000) 319–337. MR 1822089.
  [25] M. Osborne, B. Presnell and B. Turlach. A new approach to variable selection in least squares problems. IMA J. Numer. Anal. 20 (2000) 389–404. Zbl 0962.65036, MR 1773265.
  [26] P. Reynaud-Bouret and V. Rivoirard. Near optimal thresholding estimation of a Poisson intensity on the real line. Electron. J. Stat. 4 (2010) 172–238. Zbl 1329.62176, MR 2645482.
  [27] P. Reynaud-Bouret, V. Rivoirard and C. Tuleau. Adaptive density estimation: A curse of support? 2009. Available at arXiv:0907.1794. Zbl 1197.62033.
  [28] R. Tibshirani. Regression shrinkage and selection via the Lasso. J. Roy. Statist. Soc. Ser. B 58 (1996) 267–288. Zbl 0850.62538, MR 1379242.
  [29] S. van de Geer. High-dimensional generalized linear models and the Lasso. Ann. Statist. 36 (2008) 614–645. Zbl 1138.62323, MR 2396809.
  [30] B. Yu and P. Zhao. On model selection consistency of Lasso estimators. J. Mach. Learn. Res. 7 (2006) 2541–2567. Zbl 1222.62008, MR 2274449.
  [31] C. Zhang and J. Huang. The sparsity and bias of the Lasso selection in high-dimensional linear regression. Ann. Statist. 36 (2008) 1567–1594. Zbl 1142.62044, MR 2435448.
  [32] H. Zou. The adaptive Lasso and its oracle properties. J. Amer. Statist. Assoc. 101 (2006) 1418–1429. Zbl 1171.62326, MR 2279469.
