Improving Categorical Data Clustering Algorithm by Weighting Uncommon Attribute Value Matches
Zengyou He, Xiaofei Xu, Shengchun Deng (2006)
Computer Science and Information Systems
Urszula Boryczka (2010)
Control and Cybernetics
Piotr Kulczycki, Szymon Łukasik (2014)
International Journal of Applied Mathematics and Computer Science
Anna Bartkowiak (1975)
Applicationes Mathematicae
Wenping Zou, Yunlong Zhu, Hanning Chen, Xin Sui (2010)
Discrete Dynamics in Nature and Society
Ohn San, Van-Nam Huynh, Yoshiteru Nakamori (2004)
International Journal of Applied Mathematics and Computer Science
Most of the earlier work on clustering has mainly been focused on numerical data whose inherent geometric properties can be exploited to naturally define distance functions between data points. Recently, the problem of clustering categorical data has started drawing interest. However, the computational cost makes most of the previous algorithms unacceptable for clustering very large databases. The k-means algorithm is well known for its efficiency in this respect. At the same time, working...
Zengyou He, Xiaofei Xu, Joshua Zhexue Huang, Shengchun Deng (2005)
Computer Science and Information Systems
Ireneusz Czarnowski, Piotr Jędrzejowicz (2011)
International Journal of Applied Mathematics and Computer Science
The problem considered concerns data reduction for machine learning. Data reduction aims at deciding which features and instances from the training set should be retained for further use during the learning process. Data reduction results in increased capabilities and generalization properties of the learning model and a shorter time of the learning process. It can also help in scaling up to large data sources. The paper proposes an agent-based data reduction approach with the learning...
Anna Bartkowiak (1988)
Applicationes Mathematicae
Marek Tabedzki, Khalid Saeed, Adam Szczepański (2016)
International Journal of Applied Mathematics and Computer Science
The K3M thinning algorithm is a general method for image data reduction by skeletonization. It has proved its feasibility in most cases as a reliable and robust solution in typical applications of thinning, particularly in preprocessing for optical character recognition. However, the algorithm still had some weak points. Since then K3M has been revised, addressing the best-known drawbacks. This paper presents a modified version of the algorithm. A comparison is made with the original...
Thomas Berry, Somasundaram Ravindran (2002)
Kybernetika
In this paper we present experimental results for string matching algorithms which have a competitive theoretical worst-case running time. A few of these algorithms are already famous for their speed in practice, such as the Boyer–Moore algorithm and its derivatives. We chose to evaluate the algorithms by counting the number of comparisons made and by timing how long they took to complete a given search. Using the experimental results we were able to introduce a new string matching algorithm...