Effect of choice of dissimilarity measure on classification efficiency with nearest neighbor method

Tomasz Górecki

Discussiones Mathematicae Probability and Statistics (2005)

Volume: 25, Issue: 2, page 217-239
ISSN: 1509-9423

Abstract

In this paper we analyze in detail the nearest neighbor method for different dissimilarity measures, classical and weighted, for which methods of discrimination were worked out. We propose looking for the weights in the space of discriminant coordinates. Experimental results based on a number of real data sets are presented and analyzed to illustrate the benefits of the proposed methods. As classical dissimilarity measures we use the Euclidean, Manhattan, and post office metrics. We attach weights to the first two; the resulting measures are no longer metrics, because the triangle inequality does not hold. However, this does not make them useless for the nearest neighbor classification method. Additionally, we analyze different methods of tie-breaking.
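The ideas in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation. Several things are assumptions made only for illustration: the weights are taken to be class-dependent (each training point is compared using the weights of its own class), the weights are hand-picked rather than found in the space of discriminant coordinates, the post office metric is taken in its usual textbook form (travel via the origin), and ties in the vote are broken in favor of the closest neighbor among the tied classes. All function names are hypothetical.

```python
import math
from collections import Counter


def euclidean(x, y, w=None):
    """Euclidean dissimilarity; w holds optional per-feature weights."""
    w = w or [1.0] * len(x)
    return math.sqrt(sum((wj * (a - b)) ** 2 for wj, a, b in zip(w, x, y)))


def manhattan(x, y, w=None):
    """Manhattan dissimilarity; w holds optional per-feature weights."""
    w = w or [1.0] * len(x)
    return sum(wj * abs(a - b) for wj, a, b in zip(w, x, y))


def post_office(x, y):
    """Post office metric (textbook form): travel from x to y via the origin."""
    if tuple(x) == tuple(y):
        return 0.0
    norm = lambda v: math.sqrt(sum(a * a for a in v))
    return norm(x) + norm(y)


def knn_predict(query, train, k, dissim, class_weights=None):
    """Classify query by majority vote among its k nearest training points.

    train is a list of (point, label) pairs. If class_weights is given
    (assumption: a dict mapping label -> per-feature weights), each training
    point is compared using the weights of its own class.
    """
    scored = []
    for point, label in train:
        if class_weights is None:
            d = dissim(query, point)
        else:
            d = dissim(query, point, class_weights[label])
        scored.append((d, label))
    scored.sort(key=lambda t: t[0])  # stable sort keeps training order on equal distances
    nearest = scored[:k]

    votes = Counter(label for _, label in nearest)
    top = max(votes.values())
    tied = {label for label, count in votes.items() if count == top}
    # Tie-breaking (one possible rule): the closest neighbor among tied classes wins.
    for _, label in nearest:
        if label in tied:
            return label


# Toy data, invented for this example.
train = [((0.0, 0.0), "A"), ((1.0, 0.0), "A"), ((4.0, 0.0), "B"), ((5.0, 0.0), "B")]
weights = {"A": [1.0, 1.0], "B": [0.1, 0.1]}
query = (2.0, 0.0)

plain = knn_predict(query, train, 1, euclidean)
weighted = knn_predict(query, train, 1, euclidean, weights)

# Triangle inequality check in one dimension: a -> c directly versus via b,
# where b belongs to a class with weight 0.1 and c to a class with weight 1.0.
direct = manhattan((0.0,), (2.0,), [1.0])
via_b = manhattan((0.0,), (1.0,), [0.1]) + manhattan((1.0,), (2.0,), [1.0])
```

On this toy data the unweighted rule assigns the query to class A, while the class-dependent weights shrink all distances to class B and flip the decision to B. The last lines give a configuration in which the direct dissimilarity exceeds the sum of the two legs, so under these assumed weights the triangle inequality fails even though the measure can still rank neighbors.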

How to cite

MLA
BibTeX
RIS

Tomasz Górecki. "Effect of choice of dissimilarity measure on classification efficiency with nearest neighbor method." Discussiones Mathematicae Probability and Statistics 25.2 (2005): 217-239. <http://eudml.org/doc/287745>.

@article{TomaszGórecki2005,
abstract = {In this paper we will precisely analyze the nearest neighbor method for different dissimilarity measures, classical and weighted, for which methods of distinguishing were worked out. We will propose looking for weights in the space of discriminant coordinates. Experimental results based on a number of real data sets are presented and analyzed to illustrate the benefits of the proposed methods. As classical dissimilarity measures we will use the Euclidean metric, Manhattan and post office metric. We gave the first two metrics weights and now these measures are not metrics because the triangle inequality does not hold. However, it does not make them useless for the nearest neighbor classification method. Additionally, we will analyze different methods of tie-breaking.},
author = {Tomasz Górecki},
journal = {Discussiones Mathematicae Probability and Statistics},
keywords = {nearest neighbor method; discriminant coordinates; dissimilarity measures; estimators of classification error},
language = {eng},
number = {2},
pages = {217-239},
title = {Effect of choice of dissimilarity measure on classification efficiency with nearest neighbor method},
url = {http://eudml.org/doc/287745},
volume = {25},
year = {2005},
}

TY - JOUR
AU - Tomasz Górecki
TI - Effect of choice of dissimilarity measure on classification efficiency with nearest neighbor method
JO - Discussiones Mathematicae Probability and Statistics
PY - 2005
VL - 25
IS - 2
SP - 217
EP - 239
AB - In this paper we will precisely analyze the nearest neighbor method for different dissimilarity measures, classical and weighted, for which methods of distinguishing were worked out. We will propose looking for weights in the space of discriminant coordinates. Experimental results based on a number of real data sets are presented and analyzed to illustrate the benefits of the proposed methods. As classical dissimilarity measures we will use the Euclidean metric, Manhattan and post office metric. We gave the first two metrics weights and now these measures are not metrics because the triangle inequality does not hold. However, it does not make them useless for the nearest neighbor classification method. Additionally, we will analyze different methods of tie-breaking.
LA - eng
KW - nearest neighbor method; discriminant coordinates; dissimilarity measures; estimators of classification error
UR - http://eudml.org/doc/287745
ER -
