A Bimodality Test in High Dimensions
Serdica Journal of Computing (2012)
- Volume: 6, Issue: 4, page 437-450
- ISSN: 1312-6555
How to cite
Palejev, Dean. "A Bimodality Test in High Dimensions." Serdica Journal of Computing 6.4 (2012): 437-450. <http://eudml.org/doc/250976>.
@article{Palejev2012,
abstract = {We present a test for identifying clusters in high dimensional data based on the k-means algorithm when the null hypothesis is spherical normal. We show that projection techniques used for evaluating validity of clusters may be misleading for such data. In particular, we demonstrate that increasingly well-separated clusters are identified as the dimensionality increases, when no such clusters exist. Furthermore, in a case of true bimodality, increasing the dimensionality makes identifying the correct clusters more difficult. In addition to the original conservative test, we propose a practical test with the same asymptotic behavior that performs well for a moderate number of points and moderate dimensionality. ACM Computing Classification System (1998): I.5.3.},
author = {Palejev, Dean},
journal = {Serdica Journal of Computing},
keywords = {Clustering; Bimodality; Multidimensional Space; Asymptotic Test},
language = {eng},
number = {4},
pages = {437--450},
publisher = {Institute of Mathematics and Informatics, Bulgarian Academy of Sciences},
title = {A Bimodality Test in High Dimensions},
url = {http://eudml.org/doc/250976},
volume = {6},
year = {2012},
}
TY - JOUR
AU - Palejev, Dean
TI - A Bimodality Test in High Dimensions
JO - Serdica Journal of Computing
PY - 2012
PB - Institute of Mathematics and Informatics, Bulgarian Academy of Sciences
VL - 6
IS - 4
SP - 437
EP - 450
AB - We present a test for identifying clusters in high dimensional data based on the k-means algorithm when the null hypothesis is spherical normal. We show that projection techniques used for evaluating validity of clusters may be misleading for such data. In particular, we demonstrate that increasingly well-separated clusters are identified as the dimensionality increases, when no such clusters exist. Furthermore, in a case of true bimodality, increasing the dimensionality makes identifying the correct clusters more difficult. In addition to the original conservative test, we propose a practical test with the same asymptotic behavior that performs well for a moderate number of points and moderate dimensionality. ACM Computing Classification System (1998): I.5.3.
LA - eng
KW - Clustering
KW - Bimodality
KW - Multidimensional Space
KW - Asymptotic Test
UR - http://eudml.org/doc/250976
ER -
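
The abstract's claim that k-means finds increasingly well-separated clusters in spherical normal data as the dimensionality grows can be checked informally with a small simulation. The sketch below is not the test proposed in the paper. It only runs a plain two-cluster Lloyd iteration on standard normal samples and measures how far apart the two clusters look when projected onto the line through their centroids. All function names, the sample sizes, and the separation measure (gap between projected cluster means divided by the pooled within-cluster standard deviation) are assumptions chosen for illustration.

```python
import math
import random


def mean_vector(points):
    """Coordinate-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def two_means(points, rng, iters=50):
    """Plain Lloyd iteration with k=2; returns a 0/1 label per point."""
    centers = rng.sample(points, 2)  # two data points as initial centers
    labels = None
    for _ in range(iters):
        new_labels = [
            0 if sq_dist(p, centers[0]) <= sq_dist(p, centers[1]) else 1
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old center if a cluster empties
                centers[k] = mean_vector(members)
    return labels


def projected_separation(points, labels):
    """Gap between cluster means along the centroid-to-centroid line,
    in units of the pooled within-cluster standard deviation."""
    groups = [[p for p, lab in zip(points, labels) if lab == k] for k in (0, 1)]
    c0, c1 = mean_vector(groups[0]), mean_vector(groups[1])
    direction = [b - a for a, b in zip(c0, c1)]
    norm = math.sqrt(sum(x * x for x in direction))
    proj = [
        [sum(x * u for x, u in zip(p, direction)) / norm for p in g]
        for g in groups
    ]
    m0 = sum(proj[0]) / len(proj[0])
    m1 = sum(proj[1]) / len(proj[1])
    ss = sum((v - m0) ** 2 for v in proj[0]) + sum((v - m1) ** 2 for v in proj[1])
    pooled_sd = math.sqrt(ss / (len(points) - 2))
    return (m1 - m0) / pooled_sd


def separation_under_null(n, d, rng):
    """Draw n points from a d-dimensional standard normal (no true clusters),
    split them with 2-means, and report the projected separation."""
    points = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    return projected_separation(points, two_means(points, rng))


if __name__ == "__main__":
    rng = random.Random(0)
    for d in (2, 50, 1000):
        print(d, round(separation_under_null(60, d, rng), 2))
```

With the number of points held fixed, the reported separation typically grows with the dimension even though every sample comes from a single spherical normal distribution, which is the kind of misleading projection the abstract warns about.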