Notes on the evolution of feature selection methodology

Petr Somol; Jana Novovičová; Pavel Pudil

Kybernetika (2007)

  • Volume: 43, Issue: 5, page 713-730
  • ISSN: 0023-5954

Abstract

The paper gives an overview of feature selection techniques in statistical pattern recognition, with particular emphasis on methods developed within the Institute of Information Theory and Automation research team in recent years. Besides discussing the advances in methodology since the time of Perez’s pioneering work, the paper attempts to put the methods into a taxonomical framework. The methods discussed include the latest variants of the optimal algorithms, enhanced sub-optimal techniques, and the simultaneous semi-parametric probability density function modelling and feature space selection method. Some related issues are illustrated on real data by means of the Feature Selection Toolbox software.

How to cite

Somol, Petr, Novovičová, Jana, and Pudil, Pavel. "Notes on the evolution of feature selection methodology." Kybernetika 43.5 (2007): 713-730. <http://eudml.org/doc/33890>.

@article{Somol2007,
abstract = {The paper gives an overview of feature selection techniques in statistical pattern recognition with particular emphasis on methods developed within the Institute of Information Theory and Automation research team throughout recent years. Besides discussing the advances in methodology since times of Perez’s pioneering work the paper attempts to put the methods into a taxonomical framework. The methods discussed include the latest variants of the optimal algorithms, enhanced sub-optimal techniques and the simultaneous semi-parametric probability density function modelling and feature space selection method. Some related issues are illustrated on real data by means of the Feature Selection Toolbox software.},
author = {Somol, Petr and Novovičová, Jana and Pudil, Pavel},
journal = {Kybernetika},
keywords = {feature selection; branch & bound; sequential search; mixture model},
language = {eng},
number = {5},
pages = {713-730},
publisher = {Institute of Information Theory and Automation AS CR},
title = {Notes on the evolution of feature selection methodology},
url = {http://eudml.org/doc/33890},
volume = {43},
year = {2007},
}

TY - JOUR
AU - Somol, Petr
AU - Novovičová, Jana
AU - Pudil, Pavel
TI - Notes on the evolution of feature selection methodology
JO - Kybernetika
PY - 2007
PB - Institute of Information Theory and Automation AS CR
VL - 43
IS - 5
SP - 713
EP - 730
AB - The paper gives an overview of feature selection techniques in statistical pattern recognition with particular emphasis on methods developed within the Institute of Information Theory and Automation research team throughout recent years. Besides discussing the advances in methodology since times of Perez’s pioneering work the paper attempts to put the methods into a taxonomical framework. The methods discussed include the latest variants of the optimal algorithms, enhanced sub-optimal techniques and the simultaneous semi-parametric probability density function modelling and feature space selection method. Some related issues are illustrated on real data by means of the Feature Selection Toolbox software.
LA - eng
KW - feature selection; branch & bound; sequential search; mixture model
UR - http://eudml.org/doc/33890
ER -
