A rough set-based knowledge discovery process

Ning Zhong; Andrzej Skowron

International Journal of Applied Mathematics and Computer Science (2001)

  • Volume: 11, Issue: 3, pages 603-619
  • ISSN: 1641-876X

Abstract

Knowledge discovery from real-life databases is a multi-phase process consisting of numerous steps, including attribute selection, discretization of real-valued attributes, and rule induction. In this paper, we discuss a rule discovery process based on rough set theory. The core of the process is a soft hybrid induction system called the Generalization Distribution Table and Rough Set System (GDT-RS), which discovers classification rules from databases with uncertain and incomplete data. The system combines the Generalization Distribution Table (GDT) and rough set methodologies. In preprocessing, two modules, Rough Sets with Heuristics (RSH) and Rough Sets with Boolean Reasoning (RSBR), are used for attribute selection and discretization of real-valued attributes, respectively. We use a slope-collapse database as an example to show how rules can be discovered from a large, real-life database.

How to cite

Zhong, Ning, and Skowron, Andrzej. "A rough set-based knowledge discovery process." International Journal of Applied Mathematics and Computer Science 11.3 (2001): 603-619. <http://eudml.org/doc/207522>.

@article{Zhong2001,
abstract = {Knowledge discovery from real-life databases is a multi-phase process consisting of numerous steps, including attribute selection, discretization of real-valued attributes, and rule induction. In this paper, we discuss a rule discovery process based on rough set theory. The core of the process is a soft hybrid induction system called the Generalization Distribution Table and Rough Set System (GDT-RS), which discovers classification rules from databases with uncertain and incomplete data. The system combines the Generalization Distribution Table (GDT) and rough set methodologies. In preprocessing, two modules, Rough Sets with Heuristics (RSH) and Rough Sets with Boolean Reasoning (RSBR), are used for attribute selection and discretization of real-valued attributes, respectively. We use a slope-collapse database as an example to show how rules can be discovered from a large, real-life database.},
author = {Zhong, Ning and Skowron, Andrzej},
journal = {International Journal of Applied Mathematics and Computer Science},
keywords = {KDD process; rough sets; hybrid systems; knowledge discovery; real-life databases},
language = {eng},
number = {3},
pages = {603--619},
title = {A rough set-based knowledge discovery process},
url = {http://eudml.org/doc/207522},
volume = {11},
year = {2001},
}

TY - JOUR
AU - Zhong, Ning
AU - Skowron, Andrzej
TI - A rough set-based knowledge discovery process
JO - International Journal of Applied Mathematics and Computer Science
PY - 2001
VL - 11
IS - 3
SP - 603
EP - 619
AB - Knowledge discovery from real-life databases is a multi-phase process consisting of numerous steps, including attribute selection, discretization of real-valued attributes, and rule induction. In this paper, we discuss a rule discovery process based on rough set theory. The core of the process is a soft hybrid induction system called the Generalization Distribution Table and Rough Set System (GDT-RS), which discovers classification rules from databases with uncertain and incomplete data. The system combines the Generalization Distribution Table (GDT) and rough set methodologies. In preprocessing, two modules, Rough Sets with Heuristics (RSH) and Rough Sets with Boolean Reasoning (RSBR), are used for attribute selection and discretization of real-valued attributes, respectively. We use a slope-collapse database as an example to show how rules can be discovered from a large, real-life database.
LA - eng
KW - KDD process; rough sets; hybrid systems; knowledge discovery; real-life databases
UR - http://eudml.org/doc/207522
ER -

References

  1. Agrawal R., Mannila H., Srikant R., Toivonen H. and Verkano A. (1996): Fast discovery of association rules, In: Advances in Knowledge Discovery and Data Mining (U.M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, R. Uthurusamy, Eds.). — Cambridge, Massachusetts: MIT Press, pp.307–328. 
  2. Bazan J.G. (1998): A comparison of dynamic and non-dynamic rough set methods for extracting laws from decision system, In: Rough Sets in Knowledge Discovery 1: Methodology and Applications (L. Polkowski, A. Skowron, Eds.). — Heidelberg: Physica-Verlag, pp.321–365. Zbl1067.68711
  3. Bazan J.G. and Szczuka M. (2000): RSES and RSESlib—A collection of tools for rough set computations. — Proc. 2nd Int. Conf. Rough Sets and Current Trends in Computing (RSCTC-2000), Banff, pp.74–81. Zbl1014.68825
  4. Chmielewski M.R. and Grzymała-Busse J.W. (1994): Global discretization of attributes as preprocessing for machine learning. — Proc. 3rd Int. Workshop Rough Sets and Soft Computing, San Jose, pp.294–301. Zbl0949.68560
  5. Dong J.Z., Zhong N. and Ohsuga S. (1999a): Probabilistic rough induction: The GDT-RS methodology and algorithms, In: Foundations of Intelligent Systems (Z.W. Ras and A. Skowron, Eds.). — Berlin: Springer, pp.621–629. 
  6. Dong J.Z., Zhong N. and Ohsuga S. (1999b): Using rough sets with heuristics to feature selection, In: New Directions in Rough Sets, Data Mining, Granular-Soft Computing (N. Zhong, A. Skowron, S. Ohsuga, Eds.). — Berlin: Springer, pp.178–187. 
  7. Dougherty J., Kohavi R. and Sahami M. (1995): Supervised and unsupervised discretization of real features. — Proc. 12th Int. Conf. Machine Learning, pp.194–202. 
  8. Fayyad U.M. and Irani K.B. (1992): On the handling of real-valued attributes in decision tree generation. — Machine Learning, Vol.8, pp.87–102. Zbl0767.68084
  9. Fayyad U.M., Piatetsky-Shapiro G. and Smyth P. (1996): From data mining to knowledge discovery: An overview, In: Advances in Knowledge Discovery and Data Mining (U. Fayyad, G. Piatetsky-Shapiro, Eds.). — Cambridge, Massachusetts: MIT Press, pp.1–36. 
  10. Grzymała-Busse J.W. (1998): Applications of rule induction system LERS, In: Rough Sets in Knowledge Discovery 1: Methodology and Applications (L. Polkowski, A. Skowron, Eds.). — Heidelberg: Physica-Verlag, pp.366–375. Zbl0940.68137
  11. Komorowski J., Pawlak Z., Polkowski L. and Skowron A. (1999): Rough sets: A tutorial, In: Rough Fuzzy Hybridization: A New Trend in Decision Making (S.K. Pal and A. Skowron, Eds.). — Singapore: Springer, pp.3–98. 
  12. Lin T.Y. and Cercone N. (Eds.) (1997): Rough Sets and Data Mining: Analysis of Imprecise Data. — Boston: Kluwer. 
  13. Mitchell T.M. (1997): Machine Learning. — Boston: McGraw-Hill. Zbl0913.68167
  14. Nguyen H. Son and Skowron A. (1995): Quantization of real value attributes. — Proc. Int. Workshop Rough Sets and Soft Computing at 2nd Joint Conf. Information Sciences (JCIS’95), Durham, NC, pp.34–37. 
  15. Nguyen H. Son and Skowron A. (1997): Boolean reasoning for feature extraction problems, In: Foundations of Intelligent Systems (Z.W. Ras, A. Skowron, Eds.). — Berlin: Springer, pp.117–126. 
  16. Nguyen H. Son and Nguyen S. Hoa (1998): Discretization methods in data mining, In: Rough Sets in Knowledge Discovery (L. Polkowski, A. Skowron, Eds.). — Heidelberg: Physica-Verlag, pp.451–482. Zbl0940.68139
  17. Nguyen S.H., Nguyen H.S. and Skowron A. (1999): Decomposition of task specification problems, In: Foundations of Intelligent Systems (Z.W. Ras and A. Skowron, Eds.). — Berlin: Springer, pp.310–318. 
  18. Pal S.K. and Skowron A. (Eds.) (1999): Rough Fuzzy Hybridization. — Singapore: Springer. 
  19. Pawlak Z. (1982): Rough sets. — Int. J. Comp. Inf. Sci., Vol.11, pp.341–356. Zbl0501.68053
  20. Pawlak Z. (1991): Rough Sets, Theoretical Aspects of Reasoning about Data. — Boston: Kluwer. Zbl0758.68054
  21. Pawlak Z. and Skowron A. (1993): A rough set approach for decision rules generation. — Proc. Workshop W12: The Management of Uncertainty in AI at 13th IJCAI, see also: Institute of Computer Science, Warsaw University of Technology, ICS Res. Rep., 23/93, pp.1–19. 
  22. Polkowski L. and Skowron A. (1996): Rough mereology: A new paradigm for approximate reasoning. — Int. J. Approx. Reasoning, Vol.15, No.4, pp.333–365. Zbl0938.68860
  23. Polkowski L. and Skowron A. (1999): Towards adaptive calculus of granules, In: Computing with Words in Information/Intelligent Systems 1: Foundations (L.A. Zadeh and J. Kacprzyk, Eds.). — Heidelberg: Physica-Verlag, pp.201–228. Zbl0949.68143
  24. Skowron A. and Rauszer C. (1992): The discernibility matrices and functions in information systems, In: Intelligent Decision Support (R. Slowinski, Ed.). — Boston: Kluwer, pp.331–362. 
  25. Yao Y.Y. and Zhong N. (1999): Potential Applications of Granular Computing in Knowledge Discovery and Data Mining. — Proc. 5th Int. Conf. Information Systems Analysis and Synthesis (IASA’99), Orlando, pp.573–580. 
  26. Zhong N. and Ohsuga S. (1995): Toward a multi-strategy and cooperative discovery system. — Proc. 1st Int. Conf. Knowledge Discovery and Data Mining (KDD-95), Montreal, pp.337–342. 
  27. Zhong N., Liu C. and Ohsuga S. (1997): A way of increasing both autonomy and versatility of a KDD system, In: Foundations of Intelligent Systems (Z.W. Ras and A. Skowron, Eds.). — Berlin: Springer, pp.94–105. 
  28. Zhong N., Dong J.Z. and Ohsuga S. (1998): Data mining: A probabilistic rough set approach, In: Rough Sets in Knowledge Discovery, Vol.2 (L. Polkowski and A. Skowron, Eds.). — Heidelberg: Physica-Verlag, pp.127–146. 
  29. Zhong N., Skowron A. and Ohsuga S. (Eds.) (1999): New Directions in Rough Sets, Data Mining, and Granular-Soft Computing. — Berlin: Springer. 
  30. Zhong N., Dong J.Z. and Ohsuga S. (2000): Using background knowledge as a bias to control the rule discovery process, In: Principles of Data Mining and Knowledge Discovery (D.A. Zighed, J. Komorowski and J. Zytkow, Eds.). — Berlin: Springer, pp.691–698. 
