Center-based l₁-clustering method
International Journal of Applied Mathematics and Computer Science (2014)
- Volume: 24, Issue: 1, page 151-163
- ISSN: 1641-876X
Access Full Article
topAbstract
topHow to cite
topKristian Sabo. "Center-based l₁-clustering method." International Journal of Applied Mathematics and Computer Science 24.1 (2014): 151-163. <http://eudml.org/doc/271908>.
@article{KristianSabo2014,
abstract = {In this paper, we consider the l₁-clustering problem for a finite data-point set which should be partitioned into k disjoint nonempty subsets. In that case, the objective function does not have to be either convex or differentiable, and generally it may have many local or global minima. Therefore, it becomes a complex global optimization problem. A method of searching for a locally optimal solution is proposed in the paper, the convergence of the corresponding iterative process is proved and the corresponding algorithm is given. The method is illustrated by and compared with some other clustering methods, especially with the l₂-clustering method, which is also known in the literature as a smooth k-means method, on a few typical situations, such as the presence of outliers among the data and the clustering of incomplete data. Numerical experiments show in this case that the proposed l₁-clustering algorithm is faster and gives significantly better results than the l₂-clustering algorithm.},
author = {Kristian Sabo},
journal = {International Journal of Applied Mathematics and Computer Science},
keywords = {l₁-clustering; data mining; optimization; weighted median problem; clustering},
language = {eng},
number = {1},
pages = {151-163},
title = {Center-based l₁-clustering method},
url = {http://eudml.org/doc/271908},
volume = {24},
year = {2014},
}
TY - JOUR
AU - Kristian Sabo
TI - Center-based l₁-clustering method
JO - International Journal of Applied Mathematics and Computer Science
PY - 2014
VL - 24
IS - 1
SP - 151
EP - 163
AB - In this paper, we consider the l₁-clustering problem for a finite data-point set which should be partitioned into k disjoint nonempty subsets. In that case, the objective function does not have to be either convex or differentiable, and generally it may have many local or global minima. Therefore, it becomes a complex global optimization problem. A method of searching for a locally optimal solution is proposed in the paper, the convergence of the corresponding iterative process is proved and the corresponding algorithm is given. The method is illustrated by and compared with some other clustering methods, especially with the l₂-clustering method, which is also known in the literature as a smooth k-means method, on a few typical situations, such as the presence of outliers among the data and the clustering of incomplete data. Numerical experiments show in this case that the proposed l₁-clustering algorithm is faster and gives significantly better results than the l₂-clustering algorithm.
LA - eng
KW - l₁-clustering; data mining; optimization; weighted median problem; clustering
UR - http://eudml.org/doc/271908
ER -
References
top- Angulo, J. and Serra, J. (2007). Modelling and segmentation of colour images in polar representations, Image and Vision Computing 25(4): 475-495.
- Äyrämö, S. (2006). Knowledge Mining Using Robust Clustering, Ph.D. thesis, University of Jyväskylä, Jyväskylä.
- Bagirov, A.M. and Ugon, J. (2005). An algorithm for minimizing clustering functions, Optimization 54(4-5): 351-368. Zbl1122.90059
- Bagirov, A.M., Ugon, J. and Webb, D. (2011). Fast modified global k-means algorithm for incremental cluster construction, Pattern Recognition 44(4): 886-876. Zbl1213.68514
- Bezdek, J.C. (1981). Pattern Recognition with Fuzzy Objective Function Algorithms, Kluwer Academic Publishers, Norwell, MA. Zbl0503.68069
- Boyd, D.L. and Vandenberghe, L. (2004). Convex Optimization, Cambridge University Press, Cambridge. Zbl1058.90049
- Chaovalitwongse, W.A., Butenko, S. and Pardalos, P.M., (Eds.) (2009). Clustering Challenges in Biological Networks, World Scientific, London.
- Choulakian, V. (2001). Robust q-mode principal component analysis in L₁, Computational Statistics & Data Analysis, 37(2): 135-150. Zbl1030.62050
- Clarke, F. H., (1990). Optimization and Nonsmooth Analysis, SIAM, Philadelphia, PA. Zbl0696.49002
- Cominetti, R. and Michelot, C. (1997). Sufficient conditions for coincidence in l₁-minisum multifacility location problems, Operations Research Letters 20(4): 179-185. Zbl0879.90131
- Cord, A., Ambroise, C. and Cocquerez, J.-P. (2006). Feature selection in robust clustering based on Laplace mixture, Pattern Recognition Letters 27(6): 627-635.
- Cupec, R., Grbić, R., Sabo, K. and Scitovski, R. (2009). Three points method for searching the best least absolute deviations plane, Applied Mathematics and Computation 215(3): 983-994. Zbl1176.65017
- Duda, R., Hart, P. and Stork, D. (2001). Pattern Classification, Wiley, New York, NY. Zbl0968.68140
- Finkel, D.E. and Kelley, C.T. (2006). Additive scaling and the DIRECT algorithm, Journal of Global Optimization 36(4): 597-608. Zbl1142.90488
- Floudas, C.A. and Gounaris, C.E. (2009). A review of recent advances in global optimization, Journal of Global Optimization 45(4): 3-38. Zbl1180.90245
- Frąckiewicz, M. and Palus, H. (2011). KHM clustering techique as a segmentation method for endoscopic colour images, International Journal of Applied Mathematics and Computer Science 21(1): 203-209, DOI: 10.2478/v10006-011-0015-0.
- Gan, G., Ma, C. and Wu, J. (2007). Data Clustering: Theory, Algorithms, and Applications, SIAM, Philadelphia, PA. Zbl1185.68274
- Grbić, R., Nyarko, E.K. and Scitovski, R. (2012). A modification of the direct method for Lipschitz global optimization for a symmetric function, Journal of Global Optimization, 57(4): 1193-1212, DOI: 10.1007/s10898-012-0020-3. Zbl1279.65076
- Grbić, R., Scitovski, K., Sabo, K. and Scitovski, R. (2013). Approximating surfaces by the moving least absolute deviations method, Applied Mathematics and Computation 219(9): 4387-4399. Zbl06447249
- Gurwitz, C. (1990). Weighted median algorithms for l₁ approximation, BIT 30(2): 301-310. Zbl0704.65044
- Hathaway, R.J. and Bezdek, J.C. (2001). Fuzzy c-means clustering of incomplete data, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 31(5): 735-744.
- Hubert, L. and Arabie, P. (1985). Comparing partitions, Journal of Classification 2(1): 193-218. Zbl0587.62128
- Jain, A. (2010). 50 years beyond k-means, Pattern Recognition Letters 31(8): 651-666.
- Jajuga, K. (1987). A clustering method based on the L₁-norm, Computational Statistics & Data Analysis 5(4): 357-371. Zbl0624.62058
- Jajuga, K. (1991). L₁-norm based fuzzy clustering, Fuzzy Sets and Systems 39(1): 43-50. Zbl0714.62052
- Iyigun, C. (2007). Probabilistic Distance Clustering, Ph.D. thesis, Graduate School, Rutgers, New Brunswick, NJ.
- Jones, D.R., Perttunen, C.D. and Stuckman, B.E. (1993). Lipschitzian optimization without the Lipschitz constant, Journal of Optimization Theory and Applications 79(1): 157-181. Zbl0796.49032
- Jörnsten, R. (2004). Clustering and classification based on the L₁ data depth, Journal of Multivariate Analysis 90(1): 67-89. Zbl1047.62064
- Kogan, J. (2007). Introduction to Clustering Large and High-Dimensional Data, Cambridge University Press, Cambridge. Zbl1183.62106
- Leisch, F. (2006). A toolbox for k-centroids cluster analysis, Computational Statistics & Data Analysis 51(2): 526-544. Zbl1157.62439
- Li, X. Hu, W., Wang, H. and Zhang, Z. (2010). Linear discriminant analysis using rotational invariant L₁ norm, Neurocomputing 73(13-15): 2571-2579.
- Scitovski, R. and Scitovski, S. (2013). A fast partitioning algorithm and its application to earthquake investigation, Computers and Geosciences 59(1): 124-131.
- Simiński, K. (2012). Neuro-rough-fuzzy approach for regression modelling from missing data, International Journal of Applied Mathematics and Computer Science 22(2): 461-476, DOI: 10.2478/v10006-012-0035-4. Zbl1283.93165
- Späth, H. (1976). L₁-cluster analysis, Computing 16(4): 379-387. Zbl0322.65008
- Späth, H. (1987). Using the L₁-norm within cluster analysis, in Y. Dodge (Ed.), Proceedings of the First International Conference on Statistical Data Analysis Based on the L₁-Norm and Related Methods, University of Neuchatel/Switzerland, August 31-September 04, 1987, Elsevier, Amsterdam, pp. 427-434.
- Malinen, M.I. and Fränti, P. (2012). Clustering by analytic functions, Information Sciences 217(1): 31-38.
- Meng, D., Zhao, Q and Xu, Z. (2012). Improve robustness of sparse PCA by L₁-norm maximization, Pattern Recognition 45(1): 487-497. Zbl1225.68202
- Pintér, J.D. (1996). Global Optimization in Action (Continuous and Lipschitz Optimization: Algorithms, Implementations and Applications), Kluwer Academic Publishers, Dordrecht. Zbl0842.90110
- Ruszczynski, A (2006). Nonlinear Optimization, Princeton University Press, Princeton/Oxford, NJ. Zbl1108.90001
- Sabo, K. and Scitovski, R. (2008). The best least absolute deviations line-properties and two efficient methods, ANZIAM Journal 50(2): 185-198. Zbl1182.65023
- Sabo, K., Scitovski, R. and Vazler, I. (2011). Searching for a best LAD-solution of an overdetermined system of linear equations motivated by searching for a best LAD-hyperplane on the basis of given data, Journal of Optimization Theory and Applications 149(2): 293-314. Zbl1219.90125
- Sabo, K., Scitovski, R. and Vazler, I. (2012). One-dimensional center-based l₁-clustering method, Optimization Letters 7(1): 5-22 Zbl1283.90034
- Sabo, K., Scitovski, R., Vazler, I. and Zekić-Sušac, M. (2011). Mathematical models of natural gas consumption, Energy Conversion and Management 52(3): 1721-1727.
- Teboulle, M. (2007). A unified continuous optimization framework for center-based clustering methods, Journal of Machine Learning Research 8(1): 65-102. Zbl1222.68318
- Vardi, Y., Zhang, C. H. (2000). The multivariate L₁-median and associated data depth, Proceedings of the National Academy of Sciences, United States of America 97(4): 1423-1426. Zbl1054.62067
- Vazler, I., Sabo, K. and Scitovski, R. (2012). Weighted median of the data in solving least absolute deviations problems, Communications in Statistics-Theory and Methods 41(8): 1455-1465. Zbl1319.62141
- Zhang, J., Peng, L., Zhao, X. and Kuruoglu E.E. (2012). Robust data clustering by learning multi-metric -norm distances, Expert Systems with Applications 39(1): 335-349.
Citations in EuDML Documents
top- Tomasz Zok, Maciej Antczak, Martin Riedel, David Nebel, Thomas Villmann, Piotr Lukasiak, Jacek Blazewicz, Marta Szachniuk, Building the library of RNA 3D nucleotide conformations using the clustering approach
- Michał Muszyński, Stanisław Osowski, Data mining methods for gene selection on the basis of gene expression arrays
NotesEmbed ?
topTo embed these notes on your page include the following JavaScript code on your page where you want the notes to appear.