On numerical evaluation of maximum-likelihood estimates for finite mixtures of distributions
Recently a new and interesting neural network architecture called the "mixture of experts" has been proposed as a tool for multivariate function approximation or prediction. We show that the underlying problem is closely related to approximating the joint probability density of the involved variables by a finite mixture. In particular, assuming normal mixtures, we can explicitly write the conditional expectation formula, which can be interpreted as a mixture-of-experts network. In this way the related optimization...
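To illustrate the connection, the following is a sketch of the standard identity for normal mixtures; the notation is assumed here for illustration and is not taken from the abstract. If the joint density of (x, y) is a Gaussian mixture, the conditional expectation has a gated-expert form:

    p(x,y) = \sum_m w_m \, \mathcal{N}\big((x,y);\, \mu_m, \Sigma_m\big),
    E[y \mid x] = \sum_m g_m(x) \, \big[ \mu_m^{y} + \Sigma_m^{yx} (\Sigma_m^{xx})^{-1} (x - \mu_m^{x}) \big],
    g_m(x) = \frac{w_m \, \mathcal{N}(x;\, \mu_m^{x}, \Sigma_m^{xx})}{\sum_j w_j \, \mathcal{N}(x;\, \mu_j^{x}, \Sigma_j^{xx})}.

Each bracketed term plays the role of a local linear expert, while g_m(x) acts as the gating network that weights the experts according to the posterior probabilities of the mixture components.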
We summarize the main results on probabilistic neural networks recently published in a series of papers. Within the framework of statistical pattern recognition we assume approximation of the class-conditional distributions by finite mixtures of product components. The probabilistic neurons correspond to mixture components and can be interpreted in neurophysiological terms. In this way we can identify a possible theoretical background for the functional properties of neurons. For example, the general...
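A minimal sketch of how class posteriors follow from class-conditional mixtures of product components; the function names, the dictionary layout, and the choice of Bernoulli components are assumptions made for illustration only, not details taken from the papers.

    import numpy as np

    def logsumexp(a):
        # numerically stable log(sum(exp(a)))
        a = np.asarray(a, dtype=float)
        m = a.max()
        return m + np.log(np.exp(a - m).sum())

    def log_product_component(x, p):
        # log density of a product of independent Bernoulli variables,
        # i.e. log prod_i p_i^{x_i} (1 - p_i)^{1 - x_i}; one "probabilistic neuron"
        return np.sum(x * np.log(p) + (1 - x) * np.log(1 - p))

    def class_posteriors(x, classes):
        # classes: list of dicts with class prior, component weights,
        # and one parameter vector per product component
        log_joint = []
        for c in classes:
            comp = [np.log(w) + log_product_component(x, p)
                    for w, p in zip(c["weights"], c["params"])]
            log_joint.append(np.log(c["prior"]) + logsumexp(comp))
        log_joint = np.array(log_joint)
        log_joint -= logsumexp(log_joint)   # normalize to obtain P(class | x)
        return np.exp(log_joint)

    x = np.array([1, 0, 1, 1])
    classes = [
        {"prior": 0.5, "weights": [0.6, 0.4],
         "params": [np.array([0.9, 0.1, 0.8, 0.7]), np.array([0.2, 0.3, 0.6, 0.5])]},
        {"prior": 0.5, "weights": [1.0],
         "params": [np.array([0.3, 0.6, 0.4, 0.2])]},
    ]
    print(class_posteriors(x, classes))

In the log domain each product component reduces to a weighted sum of its inputs followed by a nonlinearity, which is what supports the neuron-like interpretation mentioned above.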
Neural networks with radial basis functions are considered, together with the Shannon information that their output carries about the input. The role of information-preserving input transformations is discussed when the network is specified by the maximum information principle and by the maximum likelihood principle. A transformation is found which simplifies the input structure in the sense that it minimizes the entropy within the class of all information-preserving transformations. Such a transformation need not be unique...
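For reference only, a sketch of the network architecture discussed above, using the common Gaussian radial-basis-function convention; the layer form and parameter names are assumptions for illustration and the information-theoretic analysis itself is not reproduced here.

    import numpy as np

    def rbf_layer(x, centers, widths, weights):
        # Gaussian RBF units: phi_k(x) = exp(-||x - c_k||^2 / (2 s_k^2))
        d2 = np.sum((centers - x) ** 2, axis=1)
        phi = np.exp(-d2 / (2.0 * widths ** 2))
        # linear output layer combining the unit activations
        return weights @ phi

    centers = np.array([[0.0, 0.0], [1.0, 1.0]])
    widths = np.array([0.5, 0.8])
    weights = np.array([[1.0, -1.0]])
    print(rbf_layer(np.array([0.2, 0.1]), centers, widths, weights))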
During the last decade we have introduced probabilistic mixture models into the area of image modelling, which presents highly atypical and extremely demanding applications for these models. The difficulty arises from the necessity to model tens of thousands of correlated data values simultaneously and to reliably learn such unusually complex mixture models. The presented paper surveys these novel generative colour image models based on multivariate discrete, Gaussian, or Bernoulli mixtures, respectively, and demonstrates...
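As a generic point of reference for how such mixtures are typically learned, the sketch below shows EM for a Bernoulli mixture on binary data; it is a standard textbook procedure assumed here for illustration, not the authors' specific learning scheme for image models.

    import numpy as np

    def em_bernoulli_mixture(X, n_components, n_iter=50, seed=0, eps=1e-9):
        # Generic EM for a Bernoulli mixture on binary data X of shape (n_samples, n_dims).
        rng = np.random.default_rng(seed)
        n, d = X.shape
        w = np.full(n_components, 1.0 / n_components)          # mixing weights
        p = rng.uniform(0.25, 0.75, size=(n_components, d))    # Bernoulli parameters
        for _ in range(n_iter):
            # E-step: responsibilities r[i, m] proportional to w_m * p(x_i | component m)
            log_like = X @ np.log(p.T + eps) + (1 - X) @ np.log(1 - p.T + eps)
            log_r = np.log(w + eps) + log_like
            log_r -= log_r.max(axis=1, keepdims=True)
            r = np.exp(log_r)
            r /= r.sum(axis=1, keepdims=True)
            # M-step: re-estimate weights and component parameters
            nk = r.sum(axis=0)
            w = nk / n
            p = (r.T @ X) / (nk[:, None] + eps)
        return w, p

    # toy usage: random binary "patches" standing in for image data
    X = (np.random.default_rng(1).random((200, 16)) < 0.5).astype(float)
    w, p = em_bernoulli_mixture(X, n_components=3)
    print(w)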
The purpose of feature selection in machine learning is at least two-fold: saving measurement acquisition costs, and reducing the negative effects of the curse of dimensionality with the aim of improving the accuracy of models and the classification rate of classifiers on previously unseen data. Yet it has been shown recently that the process of feature selection itself can be negatively affected by the very same curse of dimensionality: feature selection methods may easily over-fit...
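A common way to demonstrate this over-fitting effect, sketched here with scikit-learn; the data set and the particular selector and classifier are assumptions for illustration, not the paper's experiment.

    import numpy as np
    from sklearn.feature_selection import SelectKBest, f_classif
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline

    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 2000))        # pure noise: no real class structure
    y = rng.integers(0, 2, size=50)

    # Biased protocol: select features on the full data set, then cross-validate.
    X_sel = SelectKBest(f_classif, k=20).fit_transform(X, y)
    print(cross_val_score(LogisticRegression(max_iter=1000), X_sel, y, cv=5).mean())

    # Unbiased protocol: nest the selection inside each training fold.
    pipe = make_pipeline(SelectKBest(f_classif, k=20), LogisticRegression(max_iter=1000))
    print(cross_val_score(pipe, X, y, cv=5).mean())

On such noise data the first estimate is typically far above chance level while the nested estimate stays near 0.5, which is the over-fitting of the selection step that the abstract refers to.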