Rough set-based dimensionality reduction for supervised and unsupervised learning

Qiang Shen; Alexios Chouchoulas

International Journal of Applied Mathematics and Computer Science (2001)

  • Volume: 11, Issue: 3, pages 583-601
  • ISSN: 1641-876X

Abstract

The curse of dimensionality is a damning factor for numerous potentially powerful machine learning techniques. Widely accepted and otherwise elegant methodologies, used for tasks ranging from classification to function approximation, exhibit relatively high computational complexity with respect to dimensionality. This severely limits the applicability of such techniques to real-world problems. Rough set theory is a formal methodology that can be employed to reduce the dimensionality of datasets as a preprocessing step before training a learning system on the data. This paper investigates the utility of the Rough Set Attribute Reduction (RSAR) technique for both supervised and unsupervised learning in an effort to probe RSAR's generality. FuREAP, a Fuzzy-Rough Estimator of Algae Populations, which is an existing integration of RSAR and a fuzzy Rule Induction Algorithm (RIA), is used as an example of a supervised learning system with dimensionality reduction capabilities. A similar framework integrating the Multivariate Adaptive Regression Splines (MARS) approach and RSAR is taken to represent unsupervised learning systems. The paper describes the three techniques in question, discusses how RSAR can be employed with a supervised or an unsupervised system, and uses experimental results to draw conclusions on the relative success of the two integration efforts.
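To make the idea of rough set attribute reduction concrete, the following is a minimal sketch and not the paper's implementation. It assumes the standard Pawlak dependency degree (the fraction of objects whose condition-attribute equivalence class falls entirely inside one decision class) and a simple greedy forward selection. The function names, the attribute names `a`, `b`, `c`, `d`, and the toy data are all hypothetical, and the RSAR procedure in the paper may differ in detail.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(set)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].add(i)
    return list(blocks.values())


def dependency(rows, cond_attrs, decision):
    """Dependency degree: share of rows lying in the positive region."""
    decision_blocks = partition(rows, [decision])
    positive = 0
    for block in partition(rows, cond_attrs):
        # A block counts only if it sits wholly inside one decision class.
        if any(block <= d for d in decision_blocks):
            positive += len(block)
    return positive / len(rows)


def greedy_reduct(rows, cond_attrs, decision):
    """Add attributes one at a time until the full-set dependency is reached."""
    target = dependency(rows, cond_attrs, decision)
    chosen = []
    while dependency(rows, chosen, decision) < target:
        best = max(
            (a for a in cond_attrs if a not in chosen),
            key=lambda a: dependency(rows, chosen + [a], decision),
        )
        chosen.append(best)
    return chosen


# Hypothetical toy data: "d" is the decision attribute and depends only on "b".
toy_rows = [
    {"a": 0, "b": 0, "c": 1, "d": "low"},
    {"a": 0, "b": 1, "c": 1, "d": "high"},
    {"a": 1, "b": 0, "c": 0, "d": "low"},
    {"a": 1, "b": 1, "c": 0, "d": "high"},
]

print(greedy_reduct(toy_rows, ["a", "b", "c"], "d"))
```

In this toy table the attributes `a` and `c` carry no information about `d`, so the greedy search keeps only `b`. A greedy search of this kind is not guaranteed to find a minimal subset in general; it is shown here only because it is short.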

How to cite

Shen, Qiang, and Chouchoulas, Alexios. "Rough set-based dimensionality reduction for supervised and unsupervised learning." International Journal of Applied Mathematics and Computer Science 11.3 (2001): 583-601. <http://eudml.org/doc/207521>.

@article{Shen2001,
author = {Shen, Qiang and Chouchoulas, Alexios},
journal = {International Journal of Applied Mathematics and Computer Science},
keywords = {fuzzy rule induction; knowledge acquisition; knowledge-based systems; rough dimensionality reduction; machine learning},
language = {eng},
number = {3},
pages = {583-601},
title = {Rough set-based dimensionality reduction for supervised and unsupervised learning},
url = {http://eudml.org/doc/207521},
volume = {11},
year = {2001},
}

TY - JOUR
AU - Shen, Qiang
AU - Chouchoulas, Alexios
TI - Rough set-based dimensionality reduction for supervised and unsupervised learning
JO - International Journal of Applied Mathematics and Computer Science
PY - 2001
VL - 11
IS - 3
SP - 583
EP - 601
LA - eng
KW - fuzzy rule induction; knowledge acquisition; knowledge-based systems; rough dimensionality reduction; machine learning
UR - http://eudml.org/doc/207521
ER -
