Data mining methods for gene selection on the basis of gene expression arrays
Michał Muszyński, Stanisław Osowski (2014)
International Journal of Applied Mathematics and Computer Science
Jo Bill, Ernest Fokoue (2014)
Serdica Journal of Computing
This research evaluates pattern recognition techniques on a subclass of big data where the dimensionality of the input space (p) is much larger than the number of observations (n). Specifically, we evaluate massive gene expression microarray cancer data where the ratio κ = n/p is less than one. We explore the statistical and computational challenges inherent in these high-dimensional, low-sample-size (HDLSS) problems and present statistical machine learning methods used to tackle and circumvent...
Roman Świniarski (2001)
International Journal of Applied Mathematics and Computer Science
The paper presents an application of rough sets and statistical methods to feature reduction and pattern recognition. The presented description of rough set theory emphasizes the role of rough set reducts in feature selection and data reduction in pattern recognition. The overview of methods of feature selection emphasizes feature selection criteria, including rough set-based methods. The paper also contains a description of the algorithm for feature selection and reduction based on...
Y. W. Teh (2004)
Mathware and Soft Computing
With the availability of very large data storage today, redundant data structures are no longer a big issue. However, an intelligent way of managing materialised projection and selection views that can lead to fast access to data is the central issue dealt with in this paper. A set of implementation steps for data warehouse administrators or decision makers to improve the response time of queries is also defined. The study concludes that both attributes and tuples are important...
Marek Zaremba (2010)
Control and Cybernetics
Robert P. W. Duin, Dick de Ridder, David M. J. Tax (1998)
Kybernetika
In this paper, the possibilities are discussed for training statistical pattern recognizers based on a distance representation of the objects instead of a feature representation. Distances or similarities are computed between the unknown objects to be classified and a selected subset of the training objects (the support objects). These distances are combined into linear or nonlinear classifiers. In this approach the feature definition problem is replaced by finding good similarity measures....
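The idea in the abstract above can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes Euclidean distance as the similarity measure, a least-squares linear classifier, synthetic two-class data, and an arbitrary choice of four support objects. All function names (`distance_representation`, `fit_linear`, `predict`) are hypothetical.

```python
import numpy as np

def distance_representation(X, support):
    """Map each row of X to its Euclidean distances to the support objects."""
    diff = X[:, None, :] - support[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def fit_linear(D, y):
    """Least-squares linear classifier on distance features (labels in {-1, +1})."""
    A = np.hstack([D, np.ones((D.shape[0], 1))])  # append a bias column
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def predict(D, w):
    """Classify by the sign of the linear function of the distances."""
    A = np.hstack([D, np.ones((D.shape[0], 1))])
    return np.where(A @ w >= 0, 1, -1)

rng = np.random.default_rng(0)

# Synthetic data (assumption): two Gaussian classes in 5 dimensions.
X_train = np.vstack([rng.normal(-2, 1, (20, 5)), rng.normal(2, 1, (20, 5))])
y_train = np.array([-1] * 20 + [1] * 20)

# Assumed selection rule: two training objects per class act as support objects.
support = X_train[[0, 1, 20, 21]]

# Train on distances to the support objects instead of on the raw features.
D_train = distance_representation(X_train, support)
w = fit_linear(D_train, y_train)

# Classify new objects through the same distance representation.
X_new = np.vstack([rng.normal(-2, 1, (5, 5)), rng.normal(2, 1, (5, 5))])
pred = predict(distance_representation(X_new, support), w)
```

Each object is described only by its distances to the support objects, so the classifier never sees the original features directly; swapping in another dissimilarity measure only requires changing `distance_representation`.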
Petr Somol, Jiří Grim, Jana Novovičová, Pavel Pudil (2011)
Kybernetika
The purpose of feature selection in machine learning is at least two-fold: saving measurement acquisition costs and reducing the negative effects of the curse of dimensionality, with the aim of improving the accuracy of the models and the classification rate of classifiers with respect to previously unknown data. Yet it has been shown recently that the process of feature selection itself can be negatively affected by the very same curse of dimensionality: feature selection methods may...
Krzysztof Fujarewicz, Małgorzata Wiench (2003)
International Journal of Applied Mathematics and Computer Science
DNA microarrays provide a new technique for measuring gene expression, which has attracted a lot of research interest in recent years. It has been suggested that gene expression data from microarrays (biochips) can be employed in many biomedical areas, e.g., in cancer classification. Although several methods of classification, both new and existing, have been tested, the selection of a proper (optimal) set of genes whose expressions can serve for classification is still an open problem. Recently...
Ernest Fokoue (2014)
Serdica Journal of Computing
Big data comes in various ways, types, shapes, forms and sizes. Indeed, almost all areas of science, technology, medicine, public health, economics, business, linguistics and social science are bombarded by ever increasing flows of data begging to be analyzed efficiently and effectively. In this paper, we propose a rough idea of a possible taxonomy of big data, along with some of the most commonly used tools for handling each particular category of bigness. The dimensionality p of...