A depth-based modification of the k-nearest neighbour method

Ondřej Vencálek; Daniel Hlubinka

Kybernetika (2021)

  • Issue: 1, page 15-37
  • ISSN: 0023-5954

Abstract

top
We propose a new nonparametric procedure to solve the problem of classifying objects represented by d -dimensional vectors into K 2 groups. The newly proposed classifier was inspired by the k nearest neighbour (kNN) method. It is based on the idea of a depth-based distributional neighbourhood and is called k nearest depth neighbours (kNDN) classifier. The kNDN classifier has several desirable properties: in contrast to the classical kNN, it can utilize global properties of the considered distributions (symmetry). In contrast to the maximal depth classifier and related classifiers, it does not have problems with classification when the considered distributions differ in dispersion or have unequal priors. The kNDN classifier is compared to several depth-based classifiers as well as the classical kNN method in a simulation study. According to the average misclassification rates, it is comparable to the best current depth-based classifiers.

How to cite

top

Vencálek, Ondřej, and Hlubinka, Daniel. "A depth-based modification of the k-nearest neighbour method." Kybernetika (2021): 15-37. <http://eudml.org/doc/297544>.

@article{Vencálek2021,
abstract = {We propose a new nonparametric procedure to solve the problem of classifying objects represented by $d$-dimensional vectors into $K \ge 2$ groups. The newly proposed classifier was inspired by the $k$ nearest neighbour (kNN) method. It is based on the idea of a depth-based distributional neighbourhood and is called $k$ nearest depth neighbours (kNDN) classifier. The kNDN classifier has several desirable properties: in contrast to the classical kNN, it can utilize global properties of the considered distributions (symmetry). In contrast to the maximal depth classifier and related classifiers, it does not have problems with classification when the considered distributions differ in dispersion or have unequal priors. The kNDN classifier is compared to several depth-based classifiers as well as the classical kNN method in a simulation study. According to the average misclassification rates, it is comparable to the best current depth-based classifiers.},
author = {Vencálek, Ondřej, Hlubinka, Daniel},
journal = {Kybernetika},
keywords = {Bayes classifier; data depth; k nearest depth neighbours; nonparametric},
language = {eng},
number = {1},
pages = {15-37},
publisher = {Institute of Information Theory and Automation AS CR},
title = {A depth-based modification of the k-nearest neighbour method},
url = {http://eudml.org/doc/297544},
year = {2021},
}

TY - JOUR
AU - Vencálek, Ondřej
AU - Hlubinka, Daniel
TI - A depth-based modification of the k-nearest neighbour method
JO - Kybernetika
PY - 2021
PB - Institute of Information Theory and Automation AS CR
IS - 1
SP - 15
EP - 37
AB - We propose a new nonparametric procedure to solve the problem of classifying objects represented by $d$-dimensional vectors into $K \ge 2$ groups. The newly proposed classifier was inspired by the $k$ nearest neighbour (kNN) method. It is based on the idea of a depth-based distributional neighbourhood and is called $k$ nearest depth neighbours (kNDN) classifier. The kNDN classifier has several desirable properties: in contrast to the classical kNN, it can utilize global properties of the considered distributions (symmetry). In contrast to the maximal depth classifier and related classifiers, it does not have problems with classification when the considered distributions differ in dispersion or have unequal priors. The kNDN classifier is compared to several depth-based classifiers as well as the classical kNN method in a simulation study. According to the average misclassification rates, it is comparable to the best current depth-based classifiers.
LA - eng
KW - Bayes classifier; data depth; k nearest depth neighbours; nonparametric
UR - http://eudml.org/doc/297544
ER -

References

top
  1. Agostinelli, C., Romanazzi, M., , J. Statist. Plann. Inference 141 (2011), 817-830. MR2732952DOI
  2. Barber, C. B., Dobkin, D. P., Huhdanpaa, H., , ACM Trans. Math. Software (TOMS) 22 (1996), 4, 469-483. MR1428265DOI
  3. Christmann, A., Rousseeuw, P. J., , Comput. Statist. Data Analysis 37 (2001), 65-75. MR1862480DOI
  4. Cox, L. H., Johnson, M. M., Kafadar, K., , In: ASA Proc Stat. Comp Section 1982, pp. 55-56. DOI
  5. Dutta, S., Ghosh, A. K., On classification based on Lp depth with an adaptive choice of p., Preprint, 2011. 
  6. Dutta, S., Ghosh, A. K., , Ann. Inst.Statist. Math. 64 (2012), 3, 657-676. MR2880873DOI
  7. Dutta, S., Ghosh, A. K., Chaudhuri, P., , Bernoulli 17 (2011), 4, 1420-1434. Zbl1229.62063MR2854779DOI
  8. Fix, E., Hodges, J. L., Discriminatory Analysis: Nonparametric Discrimination: Consistency Properties., Technical Report 4, Randolph Field, Texas: USAF School of Aviation Medicine, 1951. 
  9. Fraiman, R., Liu, R. Y., Meloche, J., 10.1214/lnms/1215454155, Lecture Notes-Monograph Series 1997, pp. 415-430. MR1833602DOI10.1214/lnms/1215454155
  10. Ghosh, A. K., Chaudhuri, P., , Scand. J. Statist. 32 (2005), 327-350. MR2188677DOI
  11. Ghosh, A. K., Chaudhuri, P., , Bernoulli 11 (2005), 1, 1-27. MR2121452DOI
  12. Habel, K., Grasman, R., Gramacy, R. B., Mozharovskyi, P., Sterratt, D. C., Geometry: Mesh Generation and Surface Tessellation. R package version 0.4.5. 
  13. Hlubinka, D., Kotík, L., Vencálek, O., Weighted data depth., Kybernetika 46 (2010), 1, 125-148. MR2666899
  14. Hubert, M., Veeken, S. van der, Fast and robust classifiers adjusted for skewness., In: COMPSTAT 2010: Proceedings in Computational Statistics: 19th Symposium held in Paris 2010 (Y. Lechevallier and G. Saporta, eds.), Springer, Heidelberg 2010, pp. 1135-1142. 
  15. Jörnsten, R., 10.1016/j.jmva.2004.02.013, J. Multivar. Anal. 90 (2004), 67-89. MR2064937DOI10.1016/j.jmva.2004.02.013
  16. Kotík, L., Hlubinka, D., , J. Multivar. Anal. 157 (2017), 53-69. MR3641736DOI
  17. Kosiorowski, D., Zawadzki, Z., DepthProc An R Package for Robust., Exploration of Multidimensional Economic Phenomena, 2020. 
  18. Lange, T., Mosler, K., Mozharovskyi, P., 10.1007/s00362-012-0488-4, Statist. Papers 55 (2014), 1, 49-69. MR3152767DOI10.1007/s00362-012-0488-4
  19. Li, J., Cuesta-Albertos, J. A., Liu, R. Y., , J. Amer. Statist. Assoc. 107 (2012), 498, 737-753. MR2980081DOI
  20. Liu, R. Y., , Ann. Statist. 18 (1990), 1, 405-414. MR1041400DOI
  21. Liu, R. Y., Parelius, J. M., Singh, K., , Ann. Statist. 27 (1999), 783-858. MR1724033DOI
  22. Mardia, K., Kent, J., Bibby, J., Multivariate Analysis., Academic Press, 1979. MR0560319
  23. Paindaveine, D., Bever, G. Van, 10.1080/01621459.2013.813390, J. Amer. Statist. Assoc. 105 (2013), 1105-1119. MR3174687DOI10.1080/01621459.2013.813390
  24. Paindaveine, D., Bever, G. Van, , Bernoulli 21 (2015), 1, 62-82. MR3322313DOI
  25. Pokotylo, O., Mozharovskyi, P., Dyckerhoff, R., 10.18637/jss.v091.i05, J. Statist. Software 91 (2019), 5, 1-46. DOI10.18637/jss.v091.i05
  26. Serfling, R., Depth functions in nonparametric multivariate inference., In: Data Depth: Robust Multivariate Analysis, Computational Geometry and Applications (R. Y. Liu, R. Serfling, and D. L. Souvaine, eds.), American Mathematical Society, DIMACS Series in Discrete Mathematics and Theoretical Computer Science 7, New York 2006, pp. 1-16. MR2343109
  27. Vencalek, O., k -Depth-nearest neighbour method and its performance on skew-normal distributons., Acta Univ. Palacki Olomouc., Fac. Rer. Nat., Mathematica 52 (2013), 2, pp. 121-129. MR3202385
  28. Yeh, I. C., Yang, K. J., Ting, T. M., , Expert Systems Appl. 36 (2009), 5866-5871. DOI
  29. Zakai, A., Ritov, Y., Consistency and localizability., J. Machine Learning Res. 10 (2009), 827-856. MR2505136
  30. Zuo, Y., Serfling, R., , Ann. Statist. 28 (2000), 461-482. MR1790005DOI
  31. Zuo, Y., Serfling, R., , Ann. Statist. 28 (2000) 2, 483-499. MR1790006DOI

NotesEmbed ?

top

You must be logged in to post comments.

To embed these notes on your page include the following JavaScript code on your page where you want the notes to appear.

Only the controls for the widget will be shown in your chosen language. Notes will be shown in their authored language.

Tells the widget how many notes to show per page. You can cycle through additional notes using the next and previous controls.

    
                

Note: Best practice suggests putting the JavaScript code just before the closing </body> tag.