Evolutionary learning of rich neural networks in the Bayesian model selection framework

Matteo Matteucci; Dario Spadoni

International Journal of Applied Mathematics and Computer Science (2004)

  • Volume: 14, Issue: 3, pages 423-440
  • ISSN: 1641-876X

Abstract

In this paper we focus on the problem of using a genetic algorithm for model selection within a Bayesian framework. We propose to reduce the model selection problem to a search problem solved using evolutionary computation to explore a posterior distribution over the model space. As a case study, we introduce ELeaRNT (Evolutionary Learning of Rich Neural Network Topologies), a genetic algorithm which evolves a particular class of models, namely, Rich Neural Networks (RNN), in order to find an optimal domain-specific non-linear function approximator with good generalization capability. To evolve this kind of neural network, ELeaRNT uses a Bayesian fitness function. The experimental results show that ELeaRNT finds, in a completely automated way, networks well matched to the analysed problem, with acceptable complexity.
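The idea, in a nutshell: Bayes' rule turns the data D into a posterior over candidate models, P(M | D) ∝ P(D | M) P(M), and the genetic algorithm treats an approximation of this posterior as its fitness landscape. The sketch below is not the authors' ELeaRNT implementation; it illustrates the scheme on a deliberately simple model class (polynomial regressors instead of rich neural networks), using BIC as a crude stand-in for the Laplace-approximated log evidence log P(D | M). All identifiers and the toy data are hypothetical stand-ins for exposition.

# Illustrative sketch only (not ELeaRNT): a genetic algorithm whose
# fitness is an approximate log model evidence. The model class (polynomial
# degree), the toy data, and all names below are our own assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: a noisy cubic.
X = np.linspace(-1.0, 1.0, 60)
y = 0.5 * X**3 - X + rng.normal(0.0, 0.1, X.size)

def log_evidence(degree):
    """Approximate log P(D | M_degree) by -BIC/2 for a least-squares fit."""
    phi = np.vander(X, int(degree) + 1)          # polynomial design matrix
    w, *_ = np.linalg.lstsq(phi, y, rcond=None)  # maximum-likelihood weights
    resid = y - phi @ w
    n, k = X.size, int(degree) + 1
    sigma2 = max(float(resid @ resid) / n, 1e-12)
    log_lik = -0.5 * n * (np.log(2.0 * np.pi * sigma2) + 1.0)
    return log_lik - 0.5 * k * np.log(n)         # Occam penalty on complexity

def mutate(degree):
    """Perturb the model structure (here just the polynomial degree)."""
    return max(1, int(degree) + int(rng.choice([-1, 1])))

# Plain generational GA over model structures; crossover omitted for brevity.
population = [int(d) for d in rng.integers(1, 10, size=20)]
for generation in range(30):
    fitness = [log_evidence(d) for d in population]
    parents = []
    for _ in range(len(population)):
        i, j = rng.integers(0, len(population), size=2)   # binary tournament
        parents.append(population[i] if fitness[i] >= fitness[j] else population[j])
    population = [mutate(d) for d in parents]

best = max(set(population), key=log_evidence)
print("Selected model complexity (polynomial degree):", best)

With a fitness of this form, parsimonious models that still fit the data well come to dominate the population, which is the Occam's-razor behaviour a Bayesian fitness function is meant to enforce.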

How to cite


Matteucci, Matteo, and Dario Spadoni. "Evolutionary learning of rich neural networks in the Bayesian model selection framework." International Journal of Applied Mathematics and Computer Science 14.3 (2004): 423-440. <http://eudml.org/doc/207708>.

@article{Matteucci2004,
abstract = {In this paper we focus on the problem of using a genetic algorithm for model selection within a Bayesian framework. We propose to reduce the model selection problem to a search problem solved using evolutionary computation to explore a posterior distribution over the model space. As a case study, we introduce ELeaRNT (Evolutionary Learning of Rich Neural Network Topologies), a genetic algorithm which evolves a particular class of models, namely, Rich Neural Networks (RNN), in order to find an optimal domain-specific non-linear function approximator with good generalization capability. To evolve this kind of neural network, ELeaRNT uses a Bayesian fitness function. The experimental results show that ELeaRNT finds, in a completely automated way, networks well matched to the analysed problem, with acceptable complexity.},
author = {Matteucci, Matteo and Spadoni, Dario},
journal = {International Journal of Applied Mathematics and Computer Science},
keywords = {Bayesian fitness; Bayesian model selection; genetic algorithms; Rich Neural Networks},
language = {eng},
number = {3},
pages = {423-440},
title = {Evolutionary learning of rich neural networks in the Bayesian model selection framework},
url = {http://eudml.org/doc/207708},
volume = {14},
year = {2004},
}

TY - JOUR
AU - Matteucci, Matteo
AU - Spadoni, Dario
TI - Evolutionary learning of rich neural networks in the Bayesian model selection framework
JO - International Journal of Applied Mathematics and Computer Science
PY - 2004
VL - 14
IS - 3
SP - 423
EP - 440
AB - In this paper we focus on the problem of using a genetic algorithm for model selection within a Bayesian framework. We propose to reduce the model selection problem to a search problem solved using evolutionary computation to explore a posterior distribution over the model space. As a case study, we introduce ELeaRNT (Evolutionary Learning of Rich Neural Network Topologies), a genetic algorithm which evolves a particular class of models, namely, Rich Neural Networks (RNN), in order to find an optimal domain-specific non-linear function approximator with good generalization capability. To evolve this kind of neural network, ELeaRNT uses a Bayesian fitness function. The experimental results show that ELeaRNT finds, in a completely automated way, networks well matched to the analysed problem, with acceptable complexity.
LA - eng
KW - Bayesian fitness; Bayesian model selection; genetic algorithms; Rich Neural Networks
UR - http://eudml.org/doc/207708
ER -

References

  1. Angeline P.J. (1994): Genetic Programming and Emergent Intelligence, In: Advances in Genetic Programming (K.E. Kinnear, Jr., Ed.). - Cambridge, MA: MIT Press, pp. 75-98.
  2. Bebis G., Georgiopoulos M. and Kasparis T. (1997): Coupling weight elimination with genetic algorithms to reduce network size and preserve generalization. - Neurocomput., Vol. 17, No. 3-4, pp. 167-194. 
  3. Bernardo J.M. and Smith A.F.M. (1994): Bayesian Theory. - New York: Wiley. 
  4. Bishop C.M. (1995): Neural Networks for Pattern Recognition. - Oxford: Oxford University Press. Zbl0868.68096
  5. Castellano G., Fanelli A.M. and Pelillo M. (1997): An iterative pruning algorithm for feedforward neural networks. - IEEE Trans. Neural Netw., Vol. 8, No. 3, pp. 519-531. 
  6. Chib S. and Greenberg E. (1995): Understanding the Metropolis-Hastings algorithm. - Amer. Stat., Vol. 49, No. 4, pp. 327-335.
  7. Denison D.G.T., Holmes C.C., Mallick B.K. and Smith A.F.M. (2002): Bayesian Methods for Nonlinear Classification and Regression. - New York: Wiley. Zbl0994.62019
  8. Dudzinski M.L. and Mykytowycz R. (1961): The eye lens as an indicator of age in the wild rabbit in Australia. - CSIRO Wildlife Res., Vol. 6, No. 1, pp. 156-159. 
  9. Flake G.W. (1993): Nonmonotonic activation functions in multilayer perceptrons. - Ph.D. thesis, Dept. Comput. Sci., University of Maryland, College Park, MD. 
  10. Fletcher R. (1987): Practical Methods of Optimization. - New York: Wiley. Zbl0905.65002
  11. Goldberg D.E. (1989): Genetic Algorithms in Search, Optimization, and Machine Learning. - Reading, MA: Addison-Wesley. Zbl0721.68056
  12. Gull S.F. (1989): Developments in maximum entropy data analysis, In: Maximum Entropy and Bayesian Methods, Cambridge 1988 (J. Skilling, Ed.). - Dordrecht: Kluwer, pp. 53-71. Zbl0701.62015
  13. Hancock P.J.B. (1992): Genetic algorithms and permutation problems: A comparison of recombination operators for neural net structure specification. - Proc. COGANN Workshop, Int. Joint Conf. Neural Networks, Piscataway, NJ, IEEE Computer Society Press, pp. 108-122.
  14. Hashem S. (1997): Optimal linear combinations of neural networks. - Neural Netw., Vol. 10, No. 4, pp. 599-614.
  15. Hassibi B. and Stork D.G. (1992): Second order derivatives for network pruning: Optimal Brain Surgeon, In: Advances in Neural Information Processing Systems (S.J. Hanson, J.D. Cowan and C. Lee Giles, Eds.). - San Mateo, CA: Morgan Kaufmann, Vol. 5, pp. 164-171.
  16. Hastings W.K. (1970): Monte Carlo sampling methods using Markov chains and their applications. - Biometrika, Vol. 57, pp. 97-109. Zbl0219.65008
  17. Haykin S. (1999): Neural Networks. A Comprehensive Foundation (2nd Edition). - New Jersey: Prentice Hall. Zbl0934.68076
  18. Hoeting J., Madigan D., Raftery A. and Volinsky C. (1998): Bayesian model averaging. - Tech. Rep. No. 9814, Department of Statistics, Colorado State University. Zbl1059.62525
  19. Hornik K.M., Stinchcombe M. and White H. (1989): Multilayer feedforward networks are universal approximators. - Neural Netw., Vol. 2, No. 5, pp. 359-366.
  20. Liu Y. and Yao X. (1996): A population-based learning algorithm which learns both architectures and weights of neural networks. - Chinese J. Adv. Softw. Res., Vol. 3, No. 1, pp. 54-65.
  21. Lovell D. and Tsoi A. (1992): The performance of the neocognitron with various s-cell and c-cell transfer functions. - Tech. Rep., Intelligent Machines Laboratory, Department of Electrical Engineering, University of Queensland. 
  22. MacKay D.J.C. (1992): A practical Bayesian framework for backpropagation networks. - Neural Comput., Vol. 4, No. 3, pp. 448-472.
  23. MacKay D.J.C. (1995): Probable networks and plausible predictions - a review of practical Bayesian methods for supervised neural networks. - Netw. Comput. Neural Syst., Vol. 6, No. 3, pp. 469-505. Zbl0834.68098
  24. MacKay D.J.C. (1999): Comparison of approximate methods for handling hyperparameters. - Neural Comput., Vol. 11, No. 5, pp. 1035-1068. 
  25. Mani G. (1990): Learning by gradient descent in function space. - Tech. Rep. No. WI 52703, Computer Sciences Department, University of Wisconsin, Madison, WI.
  26. Matteucci M. (2002a): ELeaRNT: Evolutionary learning of rich neural network topologies. - Tech. Rep. No. CMU-CALD-02-103, Carnegie Mellon University, Pittsburgh, PA. 
  27. Matteucci M. (2002b): Evolutionary learning of adaptive models within a Bayesian framework. - Ph.D. thesis, Dipartimento di Elettronica e Informazione, Politecnico di Milano. 
  28. Montana D.J. and Davis L. (1989): Training feedforward neural networks using genetic algorithms. - Proc. 3rd Int. Conf. Genetic Algorithms, San Francisco, CA, USA, pp. 762-767. Zbl0709.68060
  29. Pearlmutter B.A. (1994): Fast exact multiplication by the Hessian. - Neural Comput., Vol. 6, No. 1, pp. 147-160. 
  30. Press W.H., Teukolsky S.A., Vetterling W.T. and Flannery B.P. (1992): Numerical Recipes in C: The Art of Scientific Computing. - Cambridge, UK: Cambridge University Press. Zbl0845.65001
  31. Ronald E. and Schoenauer M. (1994): Genetic lander: An experiment in accurate neuro-genetic control. - Proc. 3rd Conf. Parallel Problem Solving from Nature, Berlin, Germany, pp. 452-461. 
  32. Rumelhart D.E., Hinton G.E. and Williams R.J. (1986): Learning representations by back-propagating errors. - Nature, Vol. 323, pp. 533-536. 
  33. Stone M. (1974): Cross-validatory choice and assessment of statistical predictions. - J. Royal Stat. Soc., Series B, Vol. 36, pp. 111-147. Zbl0308.62063
  34. Tierney L. and Kadane J.B. (1986): Accurate approximations for posterior moments and marginal densities. - J. Amer. Stat. Assoc., Vol. 81, pp. 82-86. Zbl0587.62067
  35. Tikhonov A.N. (1963): Solution of incorrectly formulated problems and the regularization method. - Soviet Math. Dokl., Vol. 4, pp. 1035-1038. Zbl0141.11001
  36. Wasserman L. (1999): Bayesian model selection and model averaging. - J. Math. Psych., Vol. 44, No. 1, pp. 92-107. Zbl0946.62032
  37. Weigend A.S., Rumelhart D.E. and Huberman B.A. (1991): Generalization by weight elimination with application to forecasting, In: Advances in Neural Information Processing Systems, Vol. 3 (R. Lippmann, J. Moody and D. Touretzky, Eds.). - San Francisco, CA: Morgan Kaufmann, pp. 875-882.
  38. Williams P.M. (1995): Bayesian regularization and pruning using a Laplace prior. - Neural Comput., Vol. 7, No. 1, pp. 117-143. 
