Ridge estimation of covariance matrix from data in two classes

Yi Zhou; Bin Zhang

Applications of Mathematics (2024)

  • Volume: 69, Issue: 2, page 169-184
  • ISSN: 0862-7940

Abstract

This paper addresses the problem of estimating a covariance matrix from data in two classes: (1) good data with the covariance matrix of interest and (2) contamination drawn from a Gaussian distribution with a different covariance matrix. A ridge penalty is introduced to handle the high-dimensional challenges of estimating the covariance matrix under this two-class data model. The resulting ridge estimator has the same expression and remains positive definite whether the sample size is larger or smaller than the data dimension. The ridge parameter is tuned by cross-validation. Finally, the proposed ridge estimator is shown to perform better than both the existing estimator based on the two-class data and the traditional ridge estimator based on the good data alone.
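As a rough illustration of the two ideas named in the abstract (a ridge term that keeps the estimate positive definite when the sample size is below the dimension, and cross-validation to choose the ridge parameter), the sketch below uses a generic single-class ridge estimate S + lam * I scored by a held-out Gaussian likelihood. This is an assumption made for illustration only: it is not the two-class estimator proposed in the paper, and the function names `ridge_covariance`, `gaussian_nll`, and `cv_ridge_parameter`, the candidate grid, and the simulated data are all hypothetical.

```python
import numpy as np


def ridge_covariance(X, lam):
    """Generic ridge-type estimate S + lam * I.

    Assumption: a textbook single-class form, NOT the paper's two-class estimator.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    Xc = X - X.mean(axis=0)          # center the data
    S = Xc.T @ Xc / n                # sample covariance (singular when n < p)
    return S + lam * np.eye(p)       # ridge term lifts every eigenvalue by lam


def gaussian_nll(Sigma, X_test, mean):
    """Held-out Gaussian negative log-likelihood, up to constants and scaling."""
    Xc = X_test - mean
    S_test = Xc.T @ Xc / X_test.shape[0]
    _, logdet = np.linalg.slogdet(Sigma)
    return logdet + np.trace(np.linalg.solve(Sigma, S_test))


def cv_ridge_parameter(X, lams, n_folds=5, seed=0):
    """Pick the candidate lam with the lowest average held-out score."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(X.shape[0]), n_folds)
    scores = []
    for lam in lams:
        total = 0.0
        for k in range(n_folds):
            test = folds[k]
            train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
            Sigma = ridge_covariance(X[train], lam)
            total += gaussian_nll(Sigma, X[test], X[train].mean(axis=0))
        scores.append(total / n_folds)
    return lams[int(np.argmin(scores))]


# Simulated example with fewer samples (n = 20) than dimensions (p = 50).
rng = np.random.default_rng(1)
X = rng.standard_normal((20, 50))
lams = [0.01, 0.1, 1.0, 10.0]        # hypothetical candidate grid
best = cv_ridge_parameter(X, lams)
Sigma_hat = ridge_covariance(X, best)
min_eig = np.linalg.eigvalsh(Sigma_hat).min()
```

In this sketch the smallest eigenvalue of `Sigma_hat` is at least the chosen `lam` (up to rounding), so the estimate is invertible even though the sample covariance alone is singular for n < p.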

How to cite

Zhou, Yi, and Zhang, Bin. "Ridge estimation of covariance matrix from data in two classes." Applications of Mathematics 69.2 (2024): 169-184. <http://eudml.org/doc/299252>.

@article{Zhou2024,
author = {Zhou, Yi and Zhang, Bin},
journal = {Applications of Mathematics},
keywords = {covariance matrix; ridge estimation; two-class data; contamination},
language = {eng},
number = {2},
pages = {169-184},
publisher = {Institute of Mathematics, Academy of Sciences of the Czech Republic},
title = {Ridge estimation of covariance matrix from data in two classes},
url = {http://eudml.org/doc/299252},
volume = {69},
year = {2024},
}

TY - JOUR
AU - Zhou, Yi
AU - Zhang, Bin
TI - Ridge estimation of covariance matrix from data in two classes
JO - Applications of Mathematics
PY - 2024
PB - Institute of Mathematics, Academy of Sciences of the Czech Republic
VL - 69
IS - 2
SP - 169
EP - 184
LA - eng
KW - covariance matrix; ridge estimation; two-class data; contamination
UR - http://eudml.org/doc/299252
ER -
