Mathematical programming approaches to classification problems.
In this work we define a multidimensional centralization measure for random vectors as the value of the parameter at which the minimum of the integrals of certain functions is attained. We study its relation to other known multidimensional centralization measures. We conclude by proving the Strong Law of Large Numbers, both for the centralization measure defined and for the associated dispersion measure.
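A familiar instance of a minimizer of this kind is the spatial (geometric) median, which minimizes the expected Euclidean distance to the data; the sketch below uses the classical Weiszfeld iteration as one illustrative example of such a measure (the function name, iteration count, and tolerance are choices for the example, not from the paper):

```python
import numpy as np

def spatial_median(X, iters=200, eps=1e-9):
    """Weiszfeld iteration for the geometric median:
    argmin_theta sum_i ||x_i - theta||_2."""
    theta = X.mean(axis=0)                      # start from the sample mean
    for _ in range(iters):
        d = np.linalg.norm(X - theta, axis=1)
        d = np.maximum(d, eps)                  # guard against division by zero
        w = 1.0 / d
        theta_new = (w[:, None] * X).sum(axis=0) / w.sum()
        if np.linalg.norm(theta_new - theta) < eps:
            break
        theta = theta_new
    return theta
```

The associated dispersion measure would then be the attained minimum, i.e., the average distance from the sample points to the returned center.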
In this paper, the problem of inference with misclassified multinomial data is addressed. In recent years there has been a significant upsurge of interest in the development of Bayesian methods for making inferences with misclassified data. The wide range of applications across several sampling schemes and the importance of incorporating prior information make Bayesian analysis an essential tool in this context. A review of the existing literature followed by a methodological discussion is...
Point estimators based on minimizing information-theoretic divergences between the empirical and hypothetical distributions pose a problem when working with continuous families that are measure-theoretically orthogonal to the family of empirical distributions. In this case, the φ-divergence is always equal to its upper bound, and the minimum φ-divergence estimates are trivial. Broniatowski and Vajda [3] proposed several modifications of the minimum divergence rule to provide a solution to the...
A framework for multi-label classification extended by Error Correcting Output Codes (ECOCs) is introduced and empirically examined in the article. The solution treats the base multi-label classifiers as a noisy channel and applies ECOCs in order to correct the classification errors made by individual classifiers. The framework was examined through exhaustive studies over combinations of three distinct classification algorithms and four ECOC methods employed in the multi-label classification...
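The decoding step such a framework relies on can be sketched as minimum-Hamming-distance decoding against a class code matrix; the code below is a generic ECOC illustration, not the article's implementation (the function name and the example matrix are invented):

```python
import numpy as np

def ecoc_decode(code_matrix, bit_predictions):
    """ECOC decoding: assign the class whose codeword (row of the
    code matrix) has minimum Hamming distance to the bits predicted
    by the individual binary classifiers."""
    cm = np.asarray(code_matrix)
    dists = (cm != np.asarray(bit_predictions)).sum(axis=1)
    return int(np.argmin(dists))
```

Here the noisy bit predictions play the role of a corrupted channel output, and minimum-distance decoding recovers the intended class as long as fewer than half of the minimum inter-codeword Hamming distance bits are flipped.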
The paper presents a new system for ECG (electrocardiography) signal recognition that uses different neural classifiers together with a binary decision tree, which provides an additional processing stage yielding the final recognition result. Three classical neural models serve as the base classifiers: the MLP (multilayer perceptron), a modified TSK (Takagi-Sugeno-Kang) network, and the SVM (support vector machine). The coefficients of the ECG signal decomposition in Hermite basis functions and the peak-to-peak...
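The Hermite-basis features mentioned above can be sketched as projections of a sampled beat onto the orthonormal Hermite functions; this is a generic illustration of such a decomposition (the grid, the number of coefficients, and the function name are choices for the example, not taken from the paper):

```python
import math
import numpy as np

def hermite_features(signal, t, n_coeffs=5):
    """Project a sampled signal onto the first n_coeffs orthonormal
    Hermite functions h_n(t) = H_n(t) exp(-t^2/2) / sqrt(2^n n! sqrt(pi)),
    a common compact feature set for ECG beats."""
    dt = t[1] - t[0]
    coeffs = []
    for n in range(n_coeffs):
        Hn = np.polynomial.hermite.Hermite.basis(n)(t)   # physicists' H_n
        norm = math.sqrt(2.0**n * math.factorial(n) * math.sqrt(math.pi))
        hn = Hn * np.exp(-t**2 / 2) / norm               # orthonormal h_n
        coeffs.append(np.sum(signal * hn) * dt)          # Riemann approximation
    return np.array(coeffs)
```

A handful of such coefficients summarizes the beat shape and can feed any of the base classifiers.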
Let X be a random element in a metric space (F,d), and let Y be a random variable with values 0 or 1, called the class, or the label, of X. Let (Xi,Yi), 1 ≤ i ≤ n, be an observed i.i.d. sample with the same law as (X,Y). The problem of classification is to predict the label of a new random element X. The k-nearest neighbor classifier is the following simple rule: look at the k nearest neighbors of X in the training sample and choose 0 or 1 for its label according to the majority vote. When k → ∞ and k/n → 0, Stone...
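The rule itself is easy to state in code; here is a minimal sketch for Euclidean data with labels in {0, 1} (the metric and the function name are illustrative choices):

```python
import numpy as np

def knn_classify(X_train, y_train, x, k=3):
    """k-nearest-neighbor rule: majority vote among the k training
    points closest to x; labels are assumed to be 0 or 1."""
    d = np.linalg.norm(X_train - x, axis=1)   # distances to all training points
    nearest = np.argsort(d)[:k]               # indices of the k closest
    votes = y_train[nearest].sum()            # number of 1-labels among them
    return int(2 * votes > k)                 # majority vote
```

In a general metric space (F,d), only the distance computation changes; the majority vote is identical.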
Perceptron approximations based on sufficient-statistic inputs are considered for general Bayes decision rules. Particular attention is paid to Bayes discrimination and classification. In the case of exponentially distributed data with a known model, it is shown that a perceptron with one hidden layer is sufficient, and learning is restricted to the synaptic weights of the output neuron. If only the dimension of the exponential model is known, then the number of hidden layers will increase...
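The fact underlying results of this kind is that, for exponential-family data, the Bayes log-likelihood ratio is linear in the sufficient statistics. A minimal sketch for the simplest case, two univariate Gaussians with common variance, where the Bayes rule reduces to a single linear "neuron" on the sufficient statistic x (all names and parameter values are illustrative):

```python
import numpy as np

def gaussian_bayes_perceptron(mu0, mu1, sigma2, prior0=0.5):
    """Two univariate Gaussians with common variance sigma2: the Bayes
    rule 'decide class 1 iff w*x + b >= 0' has weights obtained from the
    log-likelihood ratio, which is linear in x."""
    w = (mu1 - mu0) / sigma2
    b = (mu0**2 - mu1**2) / (2 * sigma2) + np.log((1 - prior0) / prior0)
    return w, b

def decide(x, w, b):
    """The single output 'neuron': a thresholded linear unit."""
    return int(w * x + b >= 0)
```

For richer exponential models the same linearity holds in the full vector of sufficient statistics, which is what makes shallow perceptron architectures adequate.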
We summarize the main results on probabilistic neural networks recently published in a series of papers. Within the framework of statistical pattern recognition, we approximate class-conditional distributions by finite mixtures of product components. The probabilistic neurons correspond to mixture components and can be interpreted in neurophysiological terms. In this way we can find a possible theoretical background for the functional properties of neurons. For example, the general...
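The class-conditional model referred to above, a finite mixture of product components, can be sketched for Gaussian product components as follows (the parametrization and names are illustrative, not taken from the papers):

```python
import numpy as np

def product_mixture_pdf(x, weights, means, stds):
    """Density of a finite mixture whose components are products of
    univariate Gaussians, one per feature:
    p(x) = sum_m w_m * prod_d N(x_d; mu_md, s_md).
    means and stds have shape (components, features)."""
    x = np.asarray(x)
    comp = np.prod(
        np.exp(-0.5 * ((x - means) / stds) ** 2) / (stds * np.sqrt(2 * np.pi)),
        axis=1,                       # product over features within a component
    )
    return float(np.dot(weights, comp))
```

Each component's product over features corresponds to one "probabilistic neuron", and the mixture weights combine the neuron outputs into the class-conditional density.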
The problem of missing data is particularly present in archaeological research where, because of the fragmentariness of the finds, only a part of the characteristics of the whole object can be observed. The performance of various dissimilarity indices that weight missing values differently is studied on archaeological data via a simulation. An alternative solution, consisting of randomly substituting missing values with character sets, is also examined. Gower's dissimilarity coefficient seems to be...
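Gower's coefficient handles missing values by dropping the affected variable pairs from the average; a minimal numeric-only sketch (the NaN convention for missingness and the names are illustrative):

```python
import numpy as np

def gower(a, b, ranges):
    """Gower dissimilarity for numeric features: range-scaled absolute
    differences, averaged over the features observed in BOTH objects;
    pairs where either value is missing (NaN) are excluded."""
    a, b, ranges = map(np.asarray, (a, b, ranges))
    valid = ~(np.isnan(a) | np.isnan(b))     # features observed in both
    if not valid.any():
        return np.nan                        # nothing comparable at all
    diffs = np.abs(a[valid] - b[valid]) / ranges[valid]
    return float(diffs.mean())
```

Because the denominator shrinks with the number of comparable features, fragmentary finds are compared only on what was actually observed, which is precisely why this coefficient is attractive for archaeological data.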
The paper gives an overview of feature selection techniques in statistical pattern recognition, with particular emphasis on methods developed within the Institute of Information Theory and Automation research team in recent years. Besides discussing the advances in methodology since the time of Perez’s pioneering work, the paper attempts to put the methods into a taxonomical framework. The methods discussed include the latest variants of the optimal algorithms, enhanced sub-optimal techniques...
Numerical taxonomy, which uses numerical methods to classify and relate items whose properties are non-numerical, is suggested as both an advantageous tool to support case-based reasoning and a means for agents to exploit knowledge that is best expressed in cases. The basic features of numerical taxonomy are explained and discussed in application to a problem in which human agents with differing views obtain solutions by negotiation and by reference to knowledge that is essentially case-like: allocation...
In this paper we study the main properties of a distance introduced by C.M. Cuadras (1974). This distance is a generalization of the well-known Mahalanobis distance between populations to a distance between parametric estimable functions within the multivariate analysis of variance model. Dimension-reduction properties, invariance properties under linear automorphisms, estimation of the distance, its distribution under normality, as well as its interpretation as a geodesic distance are studied and...
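For reference, the classical Mahalanobis distance between two populations that the paper's distance generalizes can be sketched as follows (names are illustrative):

```python
import numpy as np

def mahalanobis_between(mu1, mu2, sigma):
    """Mahalanobis distance between two populations with mean vectors
    mu1, mu2 and common covariance matrix sigma:
    sqrt((mu1 - mu2)' sigma^{-1} (mu1 - mu2))."""
    diff = np.asarray(mu1) - np.asarray(mu2)
    return float(np.sqrt(diff @ np.linalg.solve(sigma, diff)))
```

Rescaling by the covariance makes the distance invariant under nonsingular linear transformations of the data, the kind of invariance property studied in the paper.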