Improving feature selection process resistance to failures caused by curse-of-dimensionality effects
Petr Somol; Jiří Grim; Jana Novovičová; Pavel Pudil
Kybernetika (2011)
- Volume: 47, Issue: 3, page 401-425
- ISSN: 0023-5954
Abstract
The purpose of feature selection in machine learning is at least twofold: saving measurement acquisition costs and reducing the negative effects of the curse of dimensionality, with the aim of improving the accuracy of models and the classification rate of classifiers on previously unknown data. Yet it has been shown recently that the process of feature selection itself can be negatively affected by the very same curse of dimensionality: feature selection methods may easily over-fit or perform unstably. Such an outcome is unlikely to generalize well, and the resulting recognition system may fail to deliver the expected performance. In many tasks it is therefore crucial to employ additional mechanisms that make the feature selection process more stable and more resistant to curse-of-dimensionality effects. In this paper we discuss three different approaches to reducing this problem. We present an algorithmic extension, applicable to various feature selection methods, that is capable of reducing excessive feature subset dependency not only on the specific training data but also on the specific properties of the criterion function. Further, we discuss the concept of criteria ensembles, in which various criteria vote on feature inclusion or removal, and we provide a general definition of feature selection hybridization aimed at combining the advantages of classifier-dependent and classifier-independent criteria. The presented ideas are illustrated through examples, and summarizing recommendations are given.
How to cite
Somol, Petr, et al. "Improving feature selection process resistance to failures caused by curse-of-dimensionality effects." Kybernetika 47.3 (2011): 401-425. <http://eudml.org/doc/196450>.
@article{Somol2011,
abstract = {The purpose of feature selection in machine learning is at least twofold: saving measurement acquisition costs and reducing the negative effects of the curse of dimensionality, with the aim of improving the accuracy of models and the classification rate of classifiers on previously unknown data. Yet it has been shown recently that the process of feature selection itself can be negatively affected by the very same curse of dimensionality: feature selection methods may easily over-fit or perform unstably. Such an outcome is unlikely to generalize well, and the resulting recognition system may fail to deliver the expected performance. In many tasks it is therefore crucial to employ additional mechanisms that make the feature selection process more stable and more resistant to curse-of-dimensionality effects. In this paper we discuss three different approaches to reducing this problem. We present an algorithmic extension, applicable to various feature selection methods, that is capable of reducing excessive feature subset dependency not only on the specific training data but also on the specific properties of the criterion function. Further, we discuss the concept of criteria ensembles, in which various criteria vote on feature inclusion or removal, and we provide a general definition of feature selection hybridization aimed at combining the advantages of classifier-dependent and classifier-independent criteria. The presented ideas are illustrated through examples, and summarizing recommendations are given.},
author = {Somol, Petr and Grim, Jiří and Novovičová, Jana and Pudil, Pavel},
journal = {Kybernetika},
keywords = {feature selection; curse of dimensionality; over-fitting; stability; machine learning; dimensionality reduction},
language = {eng},
number = {3},
pages = {401-425},
publisher = {Institute of Information Theory and Automation AS CR},
title = {Improving feature selection process resistance to failures caused by curse-of-dimensionality effects},
url = {http://eudml.org/doc/196450},
volume = {47},
year = {2011},
}
TY - JOUR
AU - Somol, Petr
AU - Grim, Jiří
AU - Novovičová, Jana
AU - Pudil, Pavel
TI - Improving feature selection process resistance to failures caused by curse-of-dimensionality effects
JO - Kybernetika
PY - 2011
PB - Institute of Information Theory and Automation AS CR
VL - 47
IS - 3
SP - 401
EP - 425
AB - The purpose of feature selection in machine learning is at least twofold: saving measurement acquisition costs and reducing the negative effects of the curse of dimensionality, with the aim of improving the accuracy of models and the classification rate of classifiers on previously unknown data. Yet it has been shown recently that the process of feature selection itself can be negatively affected by the very same curse of dimensionality: feature selection methods may easily over-fit or perform unstably. Such an outcome is unlikely to generalize well, and the resulting recognition system may fail to deliver the expected performance. In many tasks it is therefore crucial to employ additional mechanisms that make the feature selection process more stable and more resistant to curse-of-dimensionality effects. In this paper we discuss three different approaches to reducing this problem. We present an algorithmic extension, applicable to various feature selection methods, that is capable of reducing excessive feature subset dependency not only on the specific training data but also on the specific properties of the criterion function. Further, we discuss the concept of criteria ensembles, in which various criteria vote on feature inclusion or removal, and we provide a general definition of feature selection hybridization aimed at combining the advantages of classifier-dependent and classifier-independent criteria. The presented ideas are illustrated through examples, and summarizing recommendations are given.
LA - eng
KW - feature selection; curse of dimensionality; over-fitting; stability; machine learning; dimensionality reduction
UR - http://eudml.org/doc/196450
ER -
References
- Brown, G., A new perspective for information theoretic feature selection, In: Proc. AISTATS ’09, JMLR: W&CP 5 (2009), pp. 49–56. (2009)
- Chang, Ch.-Ch., Lin, Ch.-J., LIBSVM: A Library for Support Vector Machines, 2001, http://www.csie.ntu.edu.tw/~cjlin/libsvm. (2001)
- Das, S., Filters, wrappers and a boosting-based hybrid for feature selection, In: Proc. 18th Int. Conf. on Machine Learning (ICML ’01), Morgan Kaufmann Publishers Inc. 2001, pp. 74–81. (2001)
- Dash, M., Choi, K., Scheuermann, P., Liu, H., Feature selection for clustering – a filter solution, In: Proc. 2002 IEEE Int. Conf. on Data Mining (ICDM ’02), Vol. 00, IEEE Comp. Soc. 2002, p. 115. (2002)
- Devijver, P. A., Kittler, J., Pattern Recognition: A Statistical Approach, Prentice Hall 1982. (1982) Zbl0542.68071MR0692767
- Dutta, D., Guha, R., Wild, D., Chen, T., Ensemble feature selection: Consistent descriptor subsets for multiple QSAR models, J. Chem. Inf. Model. 43 (2007), 3, pp. 989–997. (2007)
- Emmanouilidis, C., Multiple-criteria genetic algorithms for feature selection in neuro-fuzzy modeling, In: Internat. Conf. on Neural Networks, Vol. 6, 1999, pp. 4387–4392. (1999)
- Frank, A., Asuncion, A., UCI Machine Learning Repository, 2010. (2010)
- Gheyas, I. A., Smith, L. S., Feature subset selection in large dimensionality domains, Pattern Recognition 43 (2010), 1, 5–13. (2010) DOI10.1016/j.patcog.2009.06.009
- Glover, F. W., Kochenberger, G. A., eds., Handbook of Metaheuristics, Internat. Ser. Operat. Research & Management Science 5, Springer 2003. (2003) Zbl1058.90002MR1975894
- Günter, S., Bunke, H., An evaluation of ensemble methods in handwritten word recognition based on feature selection, In: Proc. ICPR ’04, IEEE Comp. Soc. 2004, pp. 388–392. (2004)
- Guyon, I., Elisseeff, A., An introduction to variable and feature selection, J. Mach. Learn. Res. 3 (2003), 1157–1182. (2003) Zbl1102.68556
- Guyon, I., Gunn, S., Nikravesh, M., Zadeh, L. A., eds., Feature Extraction – Foundations and Applications, Studies in Fuzziness and Soft Comp. 207, Physica-Verlag, Springer 2006. (2006) Zbl1114.68059
- Ho, Tin Kam, The random subspace method for constructing decision forests, IEEE Trans. PAMI 20 (1998), 8, 832–844. (1998) DOI10.1109/34.709601
- Hussein, F., Ward, R., Kharma, N., Genetic algorithms for feature selection and weighting, a review and study, In: Proc. 6th ICDAR, Vol. 00, IEEE Comp. Soc. 2001, pp. 1240–1244. (2001)
- Jensen, R., Performing feature selection with ACO, Studies Comput. Intelligence 34, Springer 2006, pp. 45–73. (2006)
- Special issue on variable and feature selection, J. Machine Learning Research, http://www.jmlr.org/papers/special/feature.html, 2003. (2003)
- Kalousis, A., Prados, J., Hilario, M., Stability of feature selection algorithms: a study on high-dimensional spaces, Knowledge Inform. Systems 12 (2007), 1, 95–116. (2007) DOI10.1007/s10115-006-0040-8
- Kittler, J., Hatef, M., Duin, R. P. W., Matas, J., On combining classifiers, IEEE Trans. PAMI 20 (1998), 3, 226–239. (1998) DOI10.1109/34.667881
- Kohavi, R., John, G. H., Wrappers for feature subset selection, Artificial Intelligence 97 (1997), 1–2, 273–324. (1997) Zbl0904.68143DOI10.1016/S0004-3702(97)00043-X
- Kononenko, I., Estimating attributes: Analysis and extensions of RELIEF, In: Proc. ECML-94, Springer 1994, pp. 171–182. (1994)
- Kuncheva, L. I., A stability index for feature selection, In: Proc. 25th IASTED Internat. Mul.-Conf. AIAP’07, ACTA Pr. 2007, pp. 390–395. (2007)
- Lai, C., Reinders, M. J. T., Wessels, L., Random subspace method for multivariate feature selection, Pattern Recognition Lett. 27 (2006), 10, 1067–1076. (2006) DOI10.1016/j.patrec.2005.12.018
- Liu, H., Motoda, H., Feature Selection for Knowledge Discovery and Data Mining, Kluwer Academic Publishers 1998. (1998) Zbl0908.68127
- Liu, H., Yu, L., Toward integrating feature selection algorithms for classification and clustering, IEEE Trans. Knowledge Data Engrg. 17 (2005), 4, 491–502. (2005)
- Nakariyakul, S., Casasent, D. P., Adaptive branch and bound algorithm for selecting optimal features, Pattern Recognition Lett. 28 (2007), 12, 1415–1427. (2007) DOI10.1016/j.patrec.2007.02.015
- Nakariyakul, S., Casasent, D. P., An improvement on floating search algorithms for feature subset selection, Pattern Recognition 42 (2009), 9, 1932–1940. (2009) Zbl1178.68503DOI10.1016/j.patcog.2008.11.018
- Novovičová, J., Pudil, P., Kittler, J., Divergence based feature selection for multimodal class densities, IEEE Trans. PAMI 18 (1996), 2, 218–223. (1996) DOI10.1109/34.481557
- Pudil, P., Novovičová, J., Choakjarernwanit, N., Kittler, J., Feature selection based on the approximation of class densities by finite mixtures of special type, Pattern Recognition 28 (1995), 9, 1389–1398. (1995) DOI10.1016/0031-3203(94)00009-B
- Pudil, P., Novovičová, J., Kittler, J., Floating search methods in feature selection, Pattern Recognition Lett. 15 (1994), 11, 1119–1125. (1994) DOI10.1016/0167-8655(94)90127-9
- Raudys, Š. J., Feature over-selection, In: Proc. S+SSPR, Lecture Notes in Comput. Sci. 4109, Springer 2006, pp. 622–631. (2006)
- Raykar, V. C., et al., Bayesian multiple instance learning: automatic feature selection and inductive transfer, In: Proc. ICML ’08, ACM 2008, pp. 808–815. (2008)
- Reunanen, J., A pitfall in determining the optimal feature subset size, In: Proc. 4th Internat. Workshop on Pat. Rec. in Inf. Systs (PRIS 2004), pp. 176–185. (2004)
- Reunanen, J., Less biased measurement of feature selection benefits, In: Stat. and Optimiz. Perspectives Workshop, SLSFS, Lecture Notes in Comput. Sci. 3940, Springer 2006, pp. 198–208. (2006)
- Saeys, Y., Inza, I., Larrañaga, P., A review of feature selection techniques in bioinformatics, Bioinformatics 23 (2007), 19, 2507–2517. (2007) DOI10.1093/bioinformatics/btm344
- Salappa, A., Doumpos, M., Zopounidis, C., Feature selection algorithms in classification problems: an experimental evaluation, Optimiz. Methods Software 22 (2007), 1, 199–212. (2007) Zbl1116.62069MR2272838DOI10.1080/10556780600881910
- Sebastiani, F., Machine learning in automated text categorization, ACM Comput. Surveys 34 (2002), 1, 1–47. (2002) DOI10.1145/505282.505283
- Sebban, M., Nock, R., A hybrid filter/wrapper approach of feature selection using information theory, Pattern Recognition 35 (2002), 835–846. (2002) Zbl0997.68115DOI10.1016/S0031-3203(01)00084-X
- Somol, P., Grim, J., Pudil, P., Criteria ensembles in feature selection, In: Proc. MCS, Lecture Notes in Comput. Sci. 5519, Springer 2009, pp. 304–313. (2009)
- Somol, P., Grim, J., Pudil, P., The problem of fragile feature subset preference in feature selection methods and a proposal of algorithmic workaround, In: ICPR 2010. IEEE Comp. Soc. 2010. (2010)
- Somol, P., Novovičová, J., Pudil, P., Flexible-hybrid sequential floating search in statistical feature selection, In: Proc. S+SSPR, Lecture Notes in Comput. Sci. 4109, Springer 2006, pp. 632–639. (2006)
- Somol, P., Novovičová, J., Evaluating the stability of feature selectors that optimize feature subset cardinality, In: Proc. S+SSPR, Lecture Notes in Comput. Sci. 5342, Springer 2008, pp. 956–966. (2008)
- Somol, P., Novovičová, J., Grim, J., Pudil, P., Dynamic oscillating search algorithms for feature selection, In: ICPR 2008. IEEE Comp. Soc. 2008. (2008)
- Somol, P., Novovičová, J., Pudil, P., Improving sequential feature selection methods performance by means of hybridization, In: Proc. 6th IASTED Int. Conf. on Advances in Computer Science and Engrg. ACTA Press 2010. (2010)
- Somol, P., Pudil, P., Oscillating search algorithms for feature selection, In: ICPR 2000, Vol. 2, IEEE Comp. Soc. 2000, pp. 406–409. (2000)
- Somol, P., Pudil, P., Kittler, J., Fast branch & bound algorithms for optimal feature selection, IEEE Trans. PAMI 26 (2004), 7, 900–912. (2004) DOI10.1109/TPAMI.2004.28
- Sun, Y., Iterative RELIEF for feature weighting: algorithms, theories, and applications, IEEE Trans. PAMI 29 (2007), 6, 1035–1051. (2007) DOI10.1109/TPAMI.2007.1093
- Tahir, M.-A., et al., Simultaneous feature selection and feature weighting using Hybrid Tabu Search/K-nearest neighbor classifier, Pattern Recognition Lett. 28 (2007), 4, 438–446. (2007) DOI10.1016/j.patrec.2006.08.016
- Whitney, A. W., A direct method of nonparametric measurement selection, IEEE Trans. Comput. 20 (1971), 9, 1100–1103. (1971) Zbl0227.68047
- Yang, Y., Pedersen, J. O., A comparative study on feature selection in text categorization, In: Proc. 14th Internat. Conf. on Machine Learning (ICML ’97), Morgan Kaufmann 1997, pp. 412–420. (1997)
- Yu, L., Liu, H., Feature selection for high-dimensional data: A fast correlation-based filter solution, In: Proc. 20th Internat. Conf. on Machine Learning (ICML-03), Vol. 20, Morgan Kaufmann 2003, pp. 856–863. (2003)
- Zhu, Z., Ong, Y. S., Dash, M., Wrapper-filter feature selection algorithm using a memetic framework, IEEE Trans. Systems Man Cybernet., Part B 37 (2007), 1, 70. (2007) DOI10.1109/TSMCB.2006.883267