About the maximum information and maximum likelihood principles

Igor Vajda; Jiří Grim

Kybernetika (1998)

  • Volume: 34, Issue: 4, page [485]-494
  • ISSN: 0023-5954

Abstract

Neural networks with radial basis functions are considered, together with the Shannon information in their output about the input. The role of information-preserving input transformations is discussed when the network is specified by the maximum information principle and by the maximum likelihood principle. A transformation is found which simplifies the input structure in the sense that it minimizes the entropy in the class of all information-preserving transformations. Such a transformation need not be unique: under some assumptions it may be any minimal sufficient statistic.
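The following toy sketch (in Python, with illustrative numbers chosen here and not taken from the paper) may help to fix the idea behind the abstract: input values with identical class-conditional distributions are merged by a transformation T, which keeps the mutual information with the output unchanged while lowering the input entropy, in the spirit of the minimal sufficient statistic mentioned above. The helper functions entropy and mutual_information are defined only for this illustration.

import numpy as np

def entropy(p):
    """Shannon entropy (in bits) of a probability vector."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def mutual_information(joint):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for a joint probability table."""
    return (entropy(joint.sum(axis=1)) + entropy(joint.sum(axis=0))
            - entropy(joint.ravel()))

# Joint distribution P(X, Y); rows are x in {0,1,2,3}, columns are y in {0,1}.
# Rows 0 and 1 are proportional, so P(Y|X=0) = P(Y|X=1) and merging them
# loses no information about Y.
joint_xy = np.array([[0.10, 0.10],
                     [0.20, 0.20],
                     [0.25, 0.05],
                     [0.02, 0.08]])

# T merges x=0 and x=1; T(X) plays the role of a sufficient statistic here.
T = {0: 0, 1: 0, 2: 1, 3: 2}
joint_ty = np.zeros((3, 2))
for x, t in T.items():
    joint_ty[t] += joint_xy[x]

print("I(X;Y)    =", mutual_information(joint_xy))   # information is preserved ...
print("I(T(X);Y) =", mutual_information(joint_ty))
print("H(X)      =", entropy(joint_xy.sum(axis=1)))  # ... while the entropy decreases
print("H(T(X))   =", entropy(joint_ty.sum(axis=1)))

Running the sketch shows I(X;Y) = I(T(X);Y) while H(T(X)) < H(X), i.e. T is information-preserving and entropy-reducing in this simple discrete setting.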

How to cite


Vajda, Igor, and Grim, Jiří. "About the maximum information and maximum likelihood principles." Kybernetika 34.4 (1998): [485]-494. <http://eudml.org/doc/33382>.

@article{Vajda1998,
abstract = {Neural networks with radial basis functions are considered, together with the Shannon information in their output about the input. The role of information-preserving input transformations is discussed when the network is specified by the maximum information principle and by the maximum likelihood principle. A transformation is found which simplifies the input structure in the sense that it minimizes the entropy in the class of all information-preserving transformations. Such a transformation need not be unique: under some assumptions it may be any minimal sufficient statistic.},
author = {Vajda, Igor and Grim, Jiří},
journal = {Kybernetika},
keywords = {neural networks; radial basis functions; entropy minimization},
language = {eng},
number = {4},
pages = {[485]-494},
publisher = {Institute of Information Theory and Automation AS CR},
title = {About the maximum information and maximum likelihood principles},
url = {http://eudml.org/doc/33382},
volume = {34},
year = {1998},
}

TY - JOUR
AU - Vajda, Igor
AU - Grim, Jiří
TI - About the maximum information and maximum likelihood principles
JO - Kybernetika
PY - 1998
PB - Institute of Information Theory and Automation AS CR
VL - 34
IS - 4
SP - [485]
EP - 494
AB - Neural networks with radial basis functions are considered, together with the Shannon information in their output about the input. The role of information-preserving input transformations is discussed when the network is specified by the maximum information principle and by the maximum likelihood principle. A transformation is found which simplifies the input structure in the sense that it minimizes the entropy in the class of all information-preserving transformations. Such a transformation need not be unique: under some assumptions it may be any minimal sufficient statistic.
LA - eng
KW - neural networks; radial basis functions; entropy minimization
UR - http://eudml.org/doc/33382
ER -

References

  1. Atick J. J., Redlich A. N., 10.1162/neco.1990.2.3.308, Neural Computation 2 (1990), 308–320 (1990) DOI10.1162/neco.1990.2.3.308
  2. Attneave F., 10.1037/h0054663, Psychological Review 61 (1954), 183–193 (1954) DOI10.1037/h0054663
  3. Becker S., Hinton G. E., 10.1038/355161a0, Nature (London) 355 (1992), 161–163 (1992) DOI10.1038/355161a0
  4. Broomhead D. S., Lowe D., Multivariate functional interpolation and adaptive networks, Complex Systems 2 (1988), 321–355 (1988) MR0955557
  5. Casdagli M., Nonlinear prediction of chaotic time–series, Physica 35D (1989), 335–356 (1989) Zbl0671.62099MR1004201
  6. Cover T. M., Thomas J. A., Elements of Information Theory, Wiley, New York 1991 Zbl1140.94001MR1122806
  7. Dempster A. P., Laird N. M., Rubin D. B., Maximum likelihood from incomplete data via the EM algorithm, J. Roy. Statist. Soc. Ser. B 39 (1977), 1–38 (1977) Zbl0364.62022MR0501537
  8. Devroye L., Győrfi L., Nonparametric Density Estimation: The L1 View, John Wiley, New York 1985 MR0780746
  9. Devroye L., Győrfi L., Lugosi G., A Probabilistic Theory of Pattern Recognition, Springer, New York 1996 MR1383093
  10. Haykin S., Neural Networks: A Comprehensive Foundation, MacMillan, New York 1994 Zbl0934.68076
  11. Hertz J., Krogh A., Palmer R. G., Introduction to the Theory of Neural Computation, Addison–Wesley, New York, Menlo Park CA, Amsterdam 1991 MR1096298
  12. Jacobs R. A., Jordan M. I., A competitive modular connectionist architecture, In: Advances in Neural Information Processing Systems (R. P. Lippmann, J. E. Moody and D. J. Touretzky, eds.), Morgan Kaufmann, San Mateo CA 1991, Vol. 3, pp. 767–773 (1991) 
  13. Kay J., Feature discovery under contextual supervision using mutual information, In: International Joint Conference on Neural Networks, Baltimore MD 1992, Vol. 4, pp. 79–84 (1992) 
  14. Liese F., Vajda I., Convex Statistical Distances, Teubner Verlag, Leipzig 1987 Zbl0656.62004MR0926905
  15. Linsker R., 10.1109/2.36, Computer 21 (1988), 105–117 (1988) DOI10.1109/2.36
  16. Linsker R., 10.1146/annurev.ne.13.030190.001353, Annual Review of Neuroscience 13 (1990), 257–281 (1990) DOI10.1146/annurev.ne.13.030190.001353
  17. Lowe D., Adaptive radial basis function nonlinearities, and the problem of generalization, In: First IEE International Conference on Artificial Neural Networks, 1989, pp. 95–99 (1989) 
  18. Moody J., Darken C., 10.1162/neco.1989.1.2.281, Neural Computation 1 (1989), 281–294 (1989) DOI10.1162/neco.1989.1.2.281
  19. Palm H. CH., A new method for generating statistical classifiers assuming linear mixtures of Gaussian densities, In: Proceedings of the 12th IAPR Int. Conference on Pattern Recognition, IEEE Computer Society Press, Jerusalem 1994, Vol. II, pp. 483–486 (1994) 
  20. Plumbley M. D., A Hebbian/anti–Hebbian network which optimizes information capacity by orthonormalizing the principal subspace, In: IEE Artificial Neural Networks Conference, ANN-93, Brighton 1992, pp. 86–90 (1992) 
  21. Plumbley M. D., Fallside F., An information–theoretic approach to unsupervised connectionist models, In: Proceedings of the 1988 Connectionist Models Summer School, (D. Touretzky, G. Hinton and T. Sejnowski, eds.), Morgan Kaufmann, San Mateo 1988, pp. 239–245 (1988) 
  22. Poggio T., Girosi F., 10.1126/science.247.4945.978, Science 247 (1990), 978–982 (1990) MR1038271DOI10.1126/science.247.4945.978
  23. Rissanen J., Stochastic Complexity in Statistical Inquiry, World Scientific, New Jersey 1989 Zbl0800.68508MR1082556
  24. Specht D. F., Probabilistic neural networks for classification, mapping or associative memory, In: Proc. of the IEEE Int. Conference on Neural Networks, 1988, Vol. I., pp. 525–532 (1988) 
  25. Shannon C. E., 10.1002/j.1538-7305.1948.tb01338.x, Bell System Technical Journal 27 (1948), 379–423, 623–656 (1948) Zbl1154.94303MR0026286DOI10.1002/j.1538-7305.1948.tb01338.x
  26. Streit R. L., Luginbuhl T. E., 10.1109/72.317728, IEEE Trans. Neural Networks 5 (1994), 5, 764–783 (1994) DOI10.1109/72.317728
  27. Vajda I., Grim J., Bayesian optimality of decisions is achievable by RBF neural networks, IEEE Trans. Neural Networks, submitted 
  28. Ukrainec A., Haykin S., 10.1016/0893-6080(95)00062-3, Neural Networks 9 (1996), 141–168 (1996) DOI10.1016/0893-6080(95)00062-3
  29. Uttley A. M., The transmission of information and the effect of local feedback in theoretical and neural networks, Brain Research 102 (1976), 23–35 (1976) 
  30. Watanabe S., Fukumizu K., 10.1109/72.377974, IEEE Trans. Neural Networks 6 (1995), 3, 691–702 (1995) DOI10.1109/72.377974
  31. Xu L., Jordan M. I., EM learning on a generalized finite mixture model for combining multiple classifiers, In: World Congress on Neural Networks, 1993, Vol. 4, pp. 227–230 (1993) 
  32. Xu L., Krzyżak A., Oja E., 10.1109/72.238318, IEEE Trans. Neural Networks 4 (1993), 636–649 (1993) DOI10.1109/72.238318
