Interpretation and optimization of the -means algorithm
Kristian Sabo; Rudolf Scitovski
Applications of Mathematics (2014)
- Volume: 59, Issue: 4, page 391-406
- ISSN: 0862-7940
Access Full Article
topAbstract
topHow to cite
topSabo, Kristian, and Scitovski, Rudolf. "Interpretation and optimization of the $k$-means algorithm." Applications of Mathematics 59.4 (2014): 391-406. <http://eudml.org/doc/261931>.
@article{Sabo2014,
abstract = {The paper gives a new interpretation and a possible optimization of the well-known $k$-means algorithm for searching for a locally optimal partition of the set $\mathcal \{A\}= \lbrace a_i\in \mathbb \{R\}^n\colon i=1,\dots ,m\rbrace $ which consists of $k$ disjoint nonempty subsets $\pi _1,\dots ,\pi _k$, $1\le k\le m$. For this purpose, a new divided $k$-means algorithm was constructed as a limit case of the known smoothed $k$-means algorithm. It is shown that the algorithm constructed in this way coincides with the $k$-means algorithm if during the iterative procedure no data points appear in the Voronoi diagram. If in the partition obtained by applying the divided $k$-means algorithm there are data points lying in the Voronoi diagram, it is shown that the obtained result can be improved further.},
author = {Sabo, Kristian, Scitovski, Rudolf},
journal = {Applications of Mathematics},
keywords = {clustering; data mining; $k$-means; Voronoi diagram; clustering; -means; Voronoi diagram},
language = {eng},
number = {4},
pages = {391-406},
publisher = {Institute of Mathematics, Academy of Sciences of the Czech Republic},
title = {Interpretation and optimization of the $k$-means algorithm},
url = {http://eudml.org/doc/261931},
volume = {59},
year = {2014},
}
TY - JOUR
AU - Sabo, Kristian
AU - Scitovski, Rudolf
TI - Interpretation and optimization of the $k$-means algorithm
JO - Applications of Mathematics
PY - 2014
PB - Institute of Mathematics, Academy of Sciences of the Czech Republic
VL - 59
IS - 4
SP - 391
EP - 406
AB - The paper gives a new interpretation and a possible optimization of the well-known $k$-means algorithm for searching for a locally optimal partition of the set $\mathcal {A}= \lbrace a_i\in \mathbb {R}^n\colon i=1,\dots ,m\rbrace $ which consists of $k$ disjoint nonempty subsets $\pi _1,\dots ,\pi _k$, $1\le k\le m$. For this purpose, a new divided $k$-means algorithm was constructed as a limit case of the known smoothed $k$-means algorithm. It is shown that the algorithm constructed in this way coincides with the $k$-means algorithm if during the iterative procedure no data points appear in the Voronoi diagram. If in the partition obtained by applying the divided $k$-means algorithm there are data points lying in the Voronoi diagram, it is shown that the obtained result can be improved further.
LA - eng
KW - clustering; data mining; $k$-means; Voronoi diagram; clustering; -means; Voronoi diagram
UR - http://eudml.org/doc/261931
ER -
References
top- Aurenhammer, F., Klein, R., Voronoi Diagrams, Handbook of Computational Geometry J.-R. Sack et al. North-Holland Amsterdam (2000), 201-290. (2000) Zbl0995.65024MR1746678
- Bagirov, A. M., 10.1016/j.patcog.2008.04.004, Pattern Recognition 41 (2008), 3192-3199. (2008) Zbl1147.68669DOI10.1016/j.patcog.2008.04.004
- Bagirov, A. M., Ugon, J., Webb, D., 10.1016/j.patcog.2010.10.018, Pattern Recognition 44 (2011), 866-876. (2011) Zbl1213.68514DOI10.1016/j.patcog.2010.10.018
- Dennis, J. E. jun, Schnabel, R. B., Numerical Methods for Unconstrained Optimization and Nonlinear Equations, Classics in Applied Mathematics 16 SIAM, Philadelphia (1996). (1996) MR1376139
- Finkel, D. E., DIRECT Optimization Algorithm User Guide, Center for Research in Scientific Computation, North Carolina State University (2003), http://www4.ncsu.edu/ definkel/research/index.html.
- Floudas, C. A., Gounaris, C. E., 10.1007/s10898-008-9332-8, J. Glob. Optim. 45 (2009), 3-38. (2009) Zbl1180.90245MR2533740DOI10.1007/s10898-008-9332-8
- Gan, G., Ma, C., Wu, J., Data Clustering: Theory, Algorithms, and Applications, ASA-SIAM Series on Statistics and Applied Probability 20 SIAM, Philadelphia (2007). (2007) Zbl1185.68274MR2331172
- Grbić, R., Nyarko, E. K., Scitovski, R., 10.1007/s10898-012-0020-3, J. Glob. Optim. 57 (2013), 1193-1212. (2013) Zbl1279.65076MR3121789DOI10.1007/s10898-012-0020-3
- Iyigun, C., Probabilistic Distance Clustering, Ph.D. thesis, Graduate School-New Brunswick Rutgers (2007). (2007) MR2429670
- Iyigun, C., Ben-Israel, A., 10.1016/j.orl.2009.11.005, Oper. Res. Lett. 38 (2010), 207-214. (2010) Zbl1188.90148MR2608859DOI10.1016/j.orl.2009.11.005
- Jones, D. R., Perttunen, C. D., Stuckman, B. E., 10.1007/BF00941892, J. Optimization Theory Appl. 79 (1993), 157-181. (1993) Zbl0796.49032MR1246501DOI10.1007/BF00941892
- Kogan, J., Introduction to Clustering Large and High-Dimensional Data, Cambridge University Press Cambridge (2007). (2007) Zbl1183.62106MR2297007
- Kogan, J., Nicholas, C., Wiacek, M., Hybrid clustering of large high dimensional data, Proceedings of the Workshop on Text Mining M. Castellanos et al. SIAM (2007). (2007)
- Kogan, J., Teboulle, M., Scaling clustering algorithms with Bregman distances, Proceedings of the Workshop on Text Mining (M. W. Berry et al., eds.), 2006.
- Leisch, F., 10.1016/j.csda.2005.10.006, Comput. Stat. Data Anal. 51 (2006), 526-544. (2006) Zbl1157.62439MR2297469DOI10.1016/j.csda.2005.10.006
- Likas, A., Vlassis, N., Verbeek, J. J., 10.1016/S0031-3203(02)00060-2, Pattern Recognition 36 (2003), 451-461. (2003) DOI10.1016/S0031-3203(02)00060-2
- Ng, M. K., A note on constrained -means algorithms, Pattern Recognition 33 (2000), 525-519. (2000)
- Pintér, J. D., 10.1007/978-1-4757-2502-5_13, Nonconvex Optimization and Its Applications 6 Kluwer Academic Publishers, Dordrecht (1996). (1996) Zbl0842.90110MR1374104DOI10.1007/978-1-4757-2502-5_13
- Sabo, K., Scitovski, R., Vazler, I., 10.1007/s11590-011-0389-9, Optim. Lett. 7 (2013), 5-22. (2013) Zbl1283.90034MR3017090DOI10.1007/s11590-011-0389-9
- Sabo, K., Scitovski, R., Vazler, I., Zekić-Sušac, M., 10.1016/j.enconman.2010.10.037, Energy Conversion and Management 52 (2011), 1721-1727. (2011) DOI10.1016/j.enconman.2010.10.037
- Späth, H., Cluster-Formation und Analyse. Theorie, FORTRAN-Programme und Beispiele, R. Oldenburg Verlag München (1983). (1983) Zbl0536.62048
- Su, Z., Kogan, J., Second order conditions for -means clustering: Partitions vs. centroids, Text Mining 2008 Workshop (held in conjuction with the 8th SIAM International Conference on Data Mining) Atlanta (2008). (2008)
- Teboulle, M., A unified continuous optimization framework for center-based clustering methods, J. Mach. Learn. Res. 8 (2007), 65-102. (2007) Zbl1222.68318MR2280215
- Volkovich, V., Kogan, J., Nicholas, C., 10.1016/j.ejor.2005.12.045, Eur. J. Oper. Res. 183 (2007), 1097-1105. (2007) Zbl1135.90042MR2343742DOI10.1016/j.ejor.2005.12.045
- Wang, W., Zhang, Y., On fuzzy cluster validity indices, Fuzzy Sets Syst. 158 (2007), 2095-2117. (2007) Zbl1123.62046MR2405299
- Yang, X. S., 10.1007/978-3-642-04944-6_14, O. Watanabe et al. Stochastic Algorithms: Foundations and Applications 5th international symposium, SAGA 2009, Sapporo, Japan. Proceedings. Springer, Berlin. Lecture Notes in Computer Science (2009), 169-178. Zbl1260.90164MR2580252DOI10.1007/978-3-642-04944-6_14
Citations in EuDML Documents
topNotesEmbed ?
topTo embed these notes on your page include the following JavaScript code on your page where you want the notes to appear.