A general probabilistic model for describing the structure of statistical problems known under the generic name of cluster analysis, based on finite mixtures of distributions, is proposed. We analyse the theoretical and practical implications of this approach, and point out some open questions on both the theoretical problem of determining the reference prior for models based on mixtures, and the practical problem of approximation that mixtures typically entail. Finally, models based on mixtures...
We study the integration of a copula with respect to the probability measure generated by another copula. To this end, we consider the map [. , .] : C × C → R given by [...] where C denotes the collection of all d-dimensional copulas and QD denotes the probability measure associated with the copula D. Specifically, this is of interest since several measures of concordance such as Kendall’s tau, Spearman’s rho and Gini’s gamma can be expressed in terms of the map [. , .]. Quite generally, the map...
We present a test for identifying clusters in high-dimensional data based on the k-means algorithm when the null hypothesis is spherical normal. We show that projection techniques used for evaluating validity of clusters may be misleading for such data. In particular, we demonstrate that increasingly well-separated clusters are identified as the dimensionality increases, when no such clusters exist. Furthermore, in a case of true bimodality, increasing the dimensionality makes identifying the correct...
In this note, we propose a general definition of shape which is both compatible with the one proposed in phenomenology (gestaltism) and with a computer vision implementation. We reverse the usual order in Computer Vision. We do not define “shape recognition” as a task which requires a “model” pattern which is searched in all images of a certain kind. We give instead a “blind” definition of shapes relying only on invariance and repetition arguments. Given a set of images , we call shape of this...
Microaggregation is a statistical disclosure control technique for microdata. Raw microdata (i.e. individual records) are grouped into small aggregates prior to publication. Each aggregate should contain at least k records to prevent disclosure of individual information. Fixed-size microaggregation consists of taking fixed-size microaggregates (size k). Data-oriented microaggregation (with variable group size) was introduced recently. Regardless of the group size, microaggregations on a multidimensional...
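The fixed-size variant described in this abstract can be illustrated with a minimal sketch. It assumes the simplest possible setting, a single numeric attribute, and is not the method of the paper: records are sorted, cut into consecutive groups of k, and each value is replaced by its group mean. The function name `microaggregate` and the rule of merging a short final group into the preceding one are assumptions made for the example.

```python
from statistics import mean


def microaggregate(values, k):
    """Fixed-size univariate microaggregation (illustrative sketch only)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if len(values) < k:
        raise ValueError("need at least k records")

    # Sort record indices by value so that each group holds similar records.
    order = sorted(range(len(values)), key=lambda i: values[i])

    # Cut the sorted order into consecutive groups of size k.
    groups = [order[i:i + k] for i in range(0, len(order), k)]

    # Assumption: a short trailing group is merged into the previous one,
    # so every aggregate contains at least k records.
    if len(groups) > 1 and len(groups[-1]) < k:
        last = groups.pop()
        groups[-1].extend(last)

    # Replace every record by the mean of its group, in the original order.
    result = [None] * len(values)
    for group in groups:
        group_mean = mean(values[i] for i in group)
        for i in group:
            result[i] = group_mean
    return result


if __name__ == "__main__":
    print(microaggregate([10.0, 1.0, 3.0, 2.0, 11.0, 12.0, 20.0], 3))
```

The sketch does not address the multidimensional case that the abstract goes on to discuss, where no natural sort order exists.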
Several counterparts of Bayesian networks based on different paradigms have been proposed in evidence theory. Nevertheless, none of them is completely satisfactory. In this paper we will present a new one, based on a recently introduced concept of conditional independence. We define a conditioning rule for variables, and the relationship between conditional independence and irrelevance is studied with the aim of constructing a Bayesian-network-like model. Then, through a simple example, we will...
The aim of this paper is to provide a gradient clustering algorithm in its complete form, suitable for direct use without requiring deeper statistical knowledge. The values of all parameters are effectively calculated using optimizing procedures. Moreover, an illustrative analysis of the meaning of particular parameters is shown, followed by the effects resulting from possible modifications with respect to their primarily assigned optimal values. The proposed algorithm does not demand strict assumptions...
We introduce and discuss the test space problem as a part of the whole copula fitting process. In particular, we explain how an efficient copula test space can be constructed by taking into account information about the existing dependence, and we present a complete overview of bivariate test spaces for all possible situations. The practical use will be illustrated by means of a numerical application based on an illustrative portfolio containing the S&P 500 Composite Index, the JP Morgan Government...
We propose a new nonparametric procedure to solve the problem of classifying objects represented by -dimensional vectors into groups. The newly proposed classifier was inspired by the k nearest neighbour (kNN) method. It is based on the idea of a depth-based distributional neighbourhood and is called the k nearest depth neighbours (kNDN) classifier. The kNDN classifier has several desirable properties: in contrast to the classical kNN, it can utilize global properties of the considered distributions...
Nowadays, multiclassifier systems (MCSs) are being widely applied in various machine learning problems and in many different domains. Over the last two decades, a variety of ensemble systems have been developed, but there is still room for improvement. This paper focuses on developing competence and interclass cross-competence measures which can be applied as a method for classifier combination. The cross-competence measure allows an ensemble to harness pieces of information obtained from incompetent...
In a multivariate normal distribution, let the inverse of the covariance matrix be a band matrix. The distribution of the sufficient statistic for the covariance matrix is derived for this case. It is a generalization of the Wishart distribution. The distribution may be used for unbiased density estimation and construction of classification rules.