Displaying 1 – 20 of 257

A Bayesian approach to cluster analysis.

José M. Bernardo, F. Javier Girón (1988)

Qüestiió

A general probabilistic model for describing the structure of statistical problems known under the generic name of cluster analysis, based on finite mixtures of distributions, is proposed. We analyse the theoretical and practical implications of this approach, and point out some open questions on both the theoretical problem of determining the reference prior for models based on mixtures, and the practical problem of approximation that mixtures typically entail. Finally, models based on mixtures...

A Bimodality Test in High Dimensions

Palejev, Dean (2012)

Serdica Journal of Computing

We present a test for identifying clusters in high dimensional data based on the k-means algorithm when the null hypothesis is spherical normal. We show that projection techniques used for evaluating validity of clusters may be misleading for such data. In particular, we demonstrate that increasingly well-separated clusters are identified as the dimensionality increases, when no such clusters exist. Furthermore, in a case of true bimodality, increasing the dimensionality makes identifying the correct...

A comparative study of microaggregation methods.

Josep Maria Mateo Sanz, Josep Domingo Ferrer (1998)

Qüestiió

Microaggregation is a statistical disclosure control technique for microdata. Raw microdata (i.e. individual records) are grouped into small aggregates prior to publication. Each aggregate should contain at least k records to prevent disclosure of individual information. Fixed-size microaggregation consists of taking fixed-size microaggregates (size k). Data-oriented microaggregation (with variable group size) was introduced recently. Regardless of the group size, microaggregations on a multidimensional...
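
As a rough illustration of the fixed-size idea described in this abstract, the following minimal sketch sorts univariate records, cuts them into consecutive groups of size k, and publishes each group's mean in place of the individual values. The function name `microaggregate_fixed`, the sort-based grouping, and the rule of merging a short final group into the previous one are assumptions made for this example; they are not taken from the paper, which compares several methods on multidimensional data.

```python
from statistics import mean


def microaggregate_fixed(values, k):
    """Hypothetical sketch: univariate fixed-size microaggregation.

    Records are sorted, split into consecutive groups of k, and each
    record is replaced by the mean of its group.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if len(values) < k:
        raise ValueError("need at least k records")

    # Indices of the records in increasing order of value.
    order = sorted(range(len(values)), key=lambda i: values[i])

    # Consecutive groups of k indices; the last one may be shorter.
    groups = [order[i:i + k] for i in range(0, len(order), k)]

    # Assumption: a short final group is merged into the previous group
    # so that every aggregate still contains at least k records.
    if len(groups) > 1 and len(groups[-1]) < k:
        groups[-2].extend(groups.pop())

    masked = [0.0] * len(values)
    for group in groups:
        group_mean = mean(values[i] for i in group)
        for i in group:
            masked[i] = group_mean
    return masked
```

With k = 3 and seven records, this sketch produces one aggregate of three records and one of four, so no published value corresponds to fewer than three individuals.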

A complete gradient clustering algorithm formed with kernel estimators

Piotr Kulczycki, Małgorzata Charytanowicz (2010)

International Journal of Applied Mathematics and Computer Science

The aim of this paper is to provide a gradient clustering algorithm in its complete form, suitable for direct use without requiring a deeper statistical knowledge. The values of all parameters are effectively calculated using optimizing procedures. Moreover, an illustrative analysis of the meaning of particular parameters is shown, followed by the effects resulting from possible modifications with respect to their primarily assigned optimal values. The proposed algorithm does not demand strict assumptions...

A depth-based modification of the k-nearest neighbour method

Ondřej Vencálek, Daniel Hlubinka (2021)

Kybernetika

We propose a new nonparametric procedure to solve the problem of classifying objects represented by d-dimensional vectors into K ≥ 2 groups. The newly proposed classifier was inspired by the k nearest neighbour (kNN) method. It is based on the idea of a depth-based distributional neighbourhood and is called k nearest depth neighbours (kNDN) classifier. The kNDN classifier has several desirable properties: in contrast to the classical kNN, it can utilize global properties of the considered distributions...

A dynamic model of classifier competence based on the local fuzzy confusion matrix and the random reference classifier

Pawel Trajdos, Marek Kurzynski (2016)

International Journal of Applied Mathematics and Computer Science

Nowadays, multiclassifier systems (MCSs) are being widely applied in various machine learning problems and in many different domains. Over the last two decades, a variety of ensemble systems have been developed, but there is still room for improvement. This paper focuses on developing competence and interclass cross-competence measures which can be applied as a method for classifiers combination. The cross-competence measure allows an ensemble to harness pieces of information obtained from incompetent...

A Global Approach to the Comparison of Clustering Results

Osvaldo Silva, Helena Bacelar-Nicolau, Fernando C. Nicolau (2012)

Biometrical Letters

The discovery of knowledge in the case of Hierarchical Cluster Analysis (HCA) depends on many factors, such as the clustering algorithms applied and the strategies developed in the initial stage of Cluster Analysis. We present a global approach for evaluating the quality of clustering results and making a comparison among different clustering algorithms using the relevant information available (e.g. the stability, isolation and homogeneity of the clusters). In addition, we present a visual method...

A graph-based estimator of the number of clusters

Gérard Biau, Benoît Cadre, Bruno Pelletier (2007)

ESAIM: Probability and Statistics

Assessing the number of clusters of a statistical population is one of the essential issues of unsupervised learning. Given n independent observations X1,...,Xn drawn from an unknown multivariate probability density f, we propose a new approach to estimate the number of connected components, or clusters, of the t-level set L(t) = {x : f(x) ≥ t}. The basic idea is to form a rough skeleton of the set L(t) using any preliminary estimator of f, and to count the number of connected components of the resulting graph. Under...
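
The basic idea in this abstract can be sketched in a few lines, under several assumptions that are not from the paper: a Gaussian kernel density estimate stands in for the preliminary estimator of f, the graph joins any two retained observations closer than a radius r, and the names `kde` and `count_level_set_clusters` as well as the parameters h and r are hypothetical. The sketch keeps the observations whose estimated density reaches the level t and counts the connected components of the resulting graph with a union-find structure.

```python
import math


def kde(sample, x, h):
    """Isotropic Gaussian kernel density estimate at x with bandwidth h."""
    d = len(x)
    norm = len(sample) * (h * math.sqrt(2 * math.pi)) ** d
    total = sum(math.exp(-math.dist(x, p) ** 2 / (2 * h * h)) for p in sample)
    return total / norm


def count_level_set_clusters(sample, t, h, r):
    """Count graph components among observations with estimated density >= t."""
    # Observations that fall in the estimated level set.
    kept = [p for p in sample if kde(sample, p, h) >= t]

    # Union-find over the retained observations.
    parent = list(range(len(kept)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Assumption: two retained observations are joined when within distance r.
    for i in range(len(kept)):
        for j in range(i + 1, len(kept)):
            if math.dist(kept[i], kept[j]) <= r:
                parent[find(i)] = find(j)

    return len({find(i) for i in range(len(kept))})
```

In this sketch the answer depends on the choice of t, h and r: raising t discards low-density observations that would otherwise bridge or add components, while a large r merges everything into a single component.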

A learning algorithm combining functional discriminant coordinates and functional principal components

Tomasz Górecki, Mirosław Krzyśko (2014)

Discussiones Mathematicae Probability and Statistics

A new type of discriminant space for functional data is presented, combining the advantages of a functional discriminant coordinate space and a functional principal component space. In order to provide a comprehensive comparison, we conducted a set of experiments, testing effectiveness on 35 functional data sets (time series). The experiments show that the constructed combined space yields a higher classification quality for the LDA method than the component spaces do.

A multi-agent brokerage platform for media content recommendation

Bruno Veloso, Benedita Malheiro, Juan Carlos Burguillo (2015)

International Journal of Applied Mathematics and Computer Science

Near real time media content personalisation is nowadays a major challenge involving media content sources, distributors and viewers. This paper describes an approach to seamless recommendation, negotiation and transaction of personalised media content. It adopts an integrated view of the problem by proposing, on the business-to-business (B2B) side, a brokerage platform to negotiate the media items on behalf of the media content distributors and sources, providing viewers, on the business-to-consumer...

A non asymptotic penalized criterion for Gaussian mixture model selection

Cathy Maugis, Bertrand Michel (2011)

ESAIM: Probability and Statistics

Specific Gaussian mixtures are considered to solve simultaneously variable selection and clustering problems. A non asymptotic penalized criterion is proposed to choose the number of mixture components and the relevant variable subset. Because of the non linearity of the associated Kullback-Leibler contrast on Gaussian mixtures, a general model selection theorem for maximum likelihood estimation proposed by [Massart Concentration inequalities and model selection Springer, Berlin (2007). Lectures...

A non asymptotic penalized criterion for Gaussian mixture model selection

Cathy Maugis, Bertrand Michel (2012)

ESAIM: Probability and Statistics

Specific Gaussian mixtures are considered to solve simultaneously variable selection and clustering problems. A non asymptotic penalized criterion is proposed to choose the number of mixture components and the relevant variable subset. Because of the non linearity of the associated Kullback-Leibler contrast on Gaussian mixtures, a general model selection theorem for maximum likelihood estimation proposed by [Massart Concentration inequalities and model selection Springer, Berlin (2007). Lectures...

A note on the computational complexity of hierarchical overlapping clustering

Mirko Křivánek (1985)

Aplikace matematiky

In this paper the computational complexity of the problem of the approximation of a given dissimilarity measure on a finite set X by a k-ultrametric on X and by a Robinson dissimilarity measure on X is investigated. It is shown that the underlying decision problems are NP-complete.

A practical application of kernel-based fuzzy discriminant analysis

Jian-Qiang Gao, Li-Ya Fan, Li Li, Li-Zhong Xu (2013)

International Journal of Applied Mathematics and Computer Science

A novel method for feature extraction and recognition called Kernel Fuzzy Discriminant Analysis (KFDA) is proposed in this paper to deal with recognition problems, e.g., for images. The KFDA method is obtained by combining the advantages of fuzzy methods and a kernel trick. Based on the orthogonal-triangular decomposition of a matrix and Singular Value Decomposition (SVD), two different variants, KFDA/QR and KFDA/SVD, of KFDA are obtained. In the proposed method, the membership degree is incorporated...
