Displaying similar documents to “Detecting a data set structure through the use of nonlinear projections search and optimization”

An alternative extension of the k-means algorithm for clustering categorical data

Ohn San, Van-Nam Huynh, Yoshiteru Nakamori (2004)

International Journal of Applied Mathematics and Computer Science

Most of the earlier work on clustering has mainly been focused on numerical data whose inherent geometric properties can be exploited to naturally define distance functions between data points. Recently, the problem of clustering categorical data has started drawing interest. However, the computational cost makes most of the previous algorithms unacceptable for clustering very large databases. The k-means algorithm is well known for its efficiency in this respect. At the same time, working...

A Comparative Analysis of Predictive Learning Algorithms on High-Dimensional Microarray Cancer Data

Jo Bill, Ernest Fokoue (2014)

Serdica Journal of Computing

This research evaluates pattern recognition techniques on a subclass of big data where the dimensionality of the input space (p) is much larger than the number of observations (n). Specifically, we evaluate massive gene expression microarray cancer data where the ratio κ is less than one. We explore the statistical and computational challenges inherent in these high dimensional low sample size (HDLSS) problems and present statistical machine learning methods used to tackle and circumvent...

An algorithm for reducing the dimension and size of a sample for data exploration procedures

Piotr Kulczycki, Szymon Łukasik (2014)

International Journal of Applied Mathematics and Computer Science

The paper deals with the issue of reducing the dimension and size of a data set (random sample) for exploratory data analysis procedures. The concept of the algorithm investigated here is based on linear transformation to a space of a smaller dimension, while retaining as much as possible the same distances between particular elements. Elements of the transformation matrix are computed using the metaheuristics of parallel fast simulated annealing. Moreover, elimination of or a decrease...

Data mining techniques using decision tree model in materialised projection and selection view

Y. W. Teh (2004)

Mathware and Soft Computing

With the availability of very large data storage today, redundant data structures are no longer a big issue. However, an intelligent way of managing materialised projection and selection views that can lead to fast access of data is the central issue dealt with in this paper. A set of implementation steps for the data warehouse administrators or decision makers to improve the response time of queries is also defined. The study concludes that both attributes and tuples are important...