Learning the naive Bayes classifier with optimization models

Sona Taheri; Musa Mammadov

International Journal of Applied Mathematics and Computer Science (2013)

  • Volume: 23, Issue: 4, page 787-795
  • ISSN: 1641-876X

Abstract

Naive Bayes is among the simplest probabilistic classifiers. It often performs surprisingly well in many real-world applications, despite the strong assumption that all features are conditionally independent given the class. In the learning process of this classifier with the known structure, class probabilities and conditional probabilities are calculated using training data, and then values of these probabilities are used to classify new observations. In this paper, we introduce three novel optimization models for the naive Bayes classifier where both class probabilities and conditional probabilities are considered as variables. The values of these variables are found by solving the corresponding optimization problems. Numerical experiments are conducted on several real-world binary classification data sets, where continuous features are discretized by applying three different methods. The performance of these models is compared with that of the naive Bayes classifier, tree-augmented naive Bayes, the SVM, C4.5 and the nearest neighbor classifier. The obtained results demonstrate that the proposed models can significantly improve the performance of the naive Bayes classifier, yet at the same time maintain its simple structure.
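
For readers less familiar with the baseline that the paper improves upon, the following minimal Python sketch illustrates the standard counting-based learning step described above: continuous features are discretized, class priors and conditional probabilities are estimated from training counts, and a new observation is assigned the class with the highest posterior. The equal-width binning, Laplace smoothing and all names here are illustrative assumptions; the sketch is not the authors' proposed optimization models, in which these probabilities become decision variables of optimization problems.

import numpy as np

def discretize_equal_width(X, n_bins=5):
    # Equal-width binning of continuous features; the paper applies three
    # different discretization methods, this is just one simple choice.
    edges = [np.linspace(col.min(), col.max(), n_bins + 1)[1:-1] for col in X.T]
    X_disc = np.column_stack([np.digitize(col, e) for col, e in zip(X.T, edges)])
    return X_disc, edges

def fit_naive_bayes(X_disc, y, n_bins=5, alpha=1.0):
    # Standard learning step: estimate class priors P(c) and conditionals
    # P(x_j = v | c) from training counts (Laplace smoothing with alpha).
    classes = np.unique(y)
    priors = {c: float(np.mean(y == c)) for c in classes}
    cond = {}
    for c in classes:
        Xc = X_disc[y == c]
        cond[c] = [(np.bincount(Xc[:, j], minlength=n_bins) + alpha)
                   / (len(Xc) + alpha * n_bins) for j in range(X_disc.shape[1])]
    return priors, cond

def predict(x_disc, priors, cond):
    # Classify a new (discretized) observation: pick the class maximizing
    # log P(c) + sum_j log P(x_j | c), i.e. the naive Bayes posterior.
    scores = {c: np.log(priors[c]) +
                 sum(np.log(cond[c][j][v]) for j, v in enumerate(x_disc))
              for c in priors}
    return max(scores, key=scores.get)

# Illustrative usage on synthetic binary data.
X = np.random.rand(200, 2)
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)
X_disc, _ = discretize_equal_width(X)
priors, cond = fit_naive_bayes(X_disc, y)
print(predict(X_disc[0], priors, cond))

In the paper, these fixed frequency estimates are replaced by variables whose values are obtained by solving the proposed optimization models, while the classification rule itself keeps the same simple form.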

How to cite


Sona Taheri and Musa Mammadov. "Learning the naive Bayes classifier with optimization models." International Journal of Applied Mathematics and Computer Science 23.4 (2013): 787-795. <http://eudml.org/doc/262425>.

@article{SonaTaheri2013,
abstract = {Naive Bayes is among the simplest probabilistic classifiers. It often performs surprisingly well in many real-world applications, despite the strong assumption that all features are conditionally independent given the class. In the learning process of this classifier with the known structure, class probabilities and conditional probabilities are calculated using training data, and then values of these probabilities are used to classify new observations. In this paper, we introduce three novel optimization models for the naive Bayes classifier where both class probabilities and conditional probabilities are considered as variables. The values of these variables are found by solving the corresponding optimization problems. Numerical experiments are conducted on several real-world binary classification data sets, where continuous features are discretized by applying three different methods. The performance of these models is compared with that of the naive Bayes classifier, tree-augmented naive Bayes, the SVM, C4.5 and the nearest neighbor classifier. The obtained results demonstrate that the proposed models can significantly improve the performance of the naive Bayes classifier, yet at the same time maintain its simple structure.},
author = {Sona Taheri and Musa Mammadov},
journal = {International Journal of Applied Mathematics and Computer Science},
keywords = {Bayesian networks; naive Bayes classifier; optimization; discretization},
language = {eng},
number = {4},
pages = {787-795},
title = {Learning the naive Bayes classifier with optimization models},
url = {http://eudml.org/doc/262425},
volume = {23},
year = {2013},
}

TY - JOUR
AU - Sona Taheri
AU - Musa Mammadov
TI - Learning the naive Bayes classifier with optimization models
JO - International Journal of Applied Mathematics and Computer Science
PY - 2013
VL - 23
IS - 4
SP - 787
EP - 795
AB - Naive Bayes is among the simplest probabilistic classifiers. It often performs surprisingly well in many real-world applications, despite the strong assumption that all features are conditionally independent given the class. In the learning process of this classifier with the known structure, class probabilities and conditional probabilities are calculated using training data, and then values of these probabilities are used to classify new observations. In this paper, we introduce three novel optimization models for the naive Bayes classifier where both class probabilities and conditional probabilities are considered as variables. The values of these variables are found by solving the corresponding optimization problems. Numerical experiments are conducted on several real-world binary classification data sets, where continuous features are discretized by applying three different methods. The performance of these models is compared with that of the naive Bayes classifier, tree-augmented naive Bayes, the SVM, C4.5 and the nearest neighbor classifier. The obtained results demonstrate that the proposed models can significantly improve the performance of the naive Bayes classifier, yet at the same time maintain its simple structure.
LA - eng
KW - Bayesian networks; naive Bayes classifier; optimization; discretization
UR - http://eudml.org/doc/262425
ER -

References

  1. Asuncion, A. and Newman, D. (2007). UCI machine learning repository, http://www.ics.uci.edu/mlearn/mlrepository. 
  2. de Campos, L.M., Fernandez-Luna, J.M., Gamez, J.A. and Puerta, J.M. (2002). Ant colony optimization for learning Bayesian networks, International Journal of Approximate Reasoning 31(3): 291-311. Zbl1033.68091
  3. Chang, C. and Lin, C. (2001). LIBSVM: A library for support vector machines, http://www.csie.ntu.edu.tw/cjlin/libsvm. 
  4. Chickering, D.M. (1996). Learning Bayesian networks is NP-complete, in D. Fisher and H. Lenz (Eds.), Artificial Intelligence and Statistics, Springer-Verlag, Berlin/Heidelberg, pp. 121-130. 
  5. Crawford, E., Kay, J. and Eric, M. (2002). The intelligent email sorter, Proceedings of the 19th International Conference on Machine Learning, Sydney, Australia, pp. 83-90. 
  6. Domingos, P. and Pazzani, M. (1996). Beyond independence: Conditions for the optimality of the simple Bayesian classifier, Proceedings of the 13th International Conference on Machine Learning, Bari, Italy, pp. 105-112. 
  7. Domingos, P. and Pazzani, M. (1997). On the optimality of the simple Bayesian classifier under zero-one loss, Machine Learning 29: 103-130. Zbl0892.68076
  8. Dougherty, J., Kohavi, R. and Sahami, M. (1995). Supervised and unsupervised discretization of continuous features, Proceedings of the 12th International Conference on Machine Learning, San Francisco, CA, USA, pp. 194-202. 
  9. Fayyad, U.M. and Irani, K. (1993). On the handling of continuous-valued attributes in decision tree generation, Machine Learning 8: 87-102. Zbl0767.68084
  10. Friedman, N., Geiger, D. and Goldszmidt, M. (1997). Bayesian network classifiers, Machine Learning 29(2): 131-163. Zbl0892.68077
  11. Heckerman, D., Chickering, D. and Meek, C. (2004). Large sample learning of Bayesian networks is NP-hard, Journal of Machine Learning Research 5: 1287-1330. Zbl1222.68169
  12. Kononenko, I. (2001). Machine learning for medical diagnosis: History, state of the art and perspective, Artificial Intelligence in Medicine 23: 89-109. 
  13. Langley, P., Iba, W. and Thompson, K. (1992). An analysis of Bayesian classifiers, 10th National Conference on Artificial Intelligence (AAAI-92), San Jose, CA, USA, pp. 223-228. 
  14. Miyahara, K. and Pazzani, M.J. (2000). Collaborative filtering with the simple Bayesian classifier, Proceedings of the 6th Pacific Rim International Conference on Artificial Intelligence, Melbourne, Australia, pp. 679-689. 
  15. Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, San Francisco, CA. Zbl0746.68089
  16. Polanska, J., Borys, D. and Polanski, A. (2006). Node assignment problem in Bayesian networks, International Journal of Applied Mathematics and Computer Science 16(2): 233-240. Zbl1147.62389
  17. Taheri, S. and Mammadov, M. (2012). Structure learning of Bayesian networks using a new unrestricted dependency algorithm, IMMM 2012: The 2nd International Conference on Advances in Information Mining and Management, Venice, Italy, pp. 54-59. 
  18. Taheri, S., Mammadov, M. and Bagirov, A. (2011). Improving naive Bayes classifier using conditional probabilities, 9th Australasian Data Mining Conference, Ballarat, Australia, pp. 63-68. 
  19. Taheri, S., Mammadov, M. and Seifollahi, S. (2012). Globally convergent algorithms for solving unconstrained optimization problems, Optimization: 1-15. Zbl1311.90182
  20. Tóth, L., Kocsor, A. and Csirik, J. (2005). On naive Bayes in speech recognition, International Journal of Applied Mathematics and Computer Science 15(2): 287-294. Zbl1085.68667
  21. Wu, X., Kumar, V., Quinlan, J.R., Ghosh, J., Yang, Q., Motoda, H., McLachlan, G.J., Ng, A., Liu, B., Yu, P.S., Zhou, Z.-H., Steinbach, M., Hand, D.J. and Steinberg, D. (2008). Top 10 algorithms in data mining, Knowledge and Information Systems 14: 1-37. 
  22. Yatsko, A., Bagirov, A.M. and Stranieri, A. (2011). On the discretization of continuous features for classification, Proceedings of the 9th Australasian Data Mining Conference (AusDM 2011), Ballarat, Australia, Vol. 125. 
  23. Zaidi, A., Ould Bouamama, B. and Tagina, M. (2012). Bayesian reliability models of Weibull systems: State of the art, International Journal of Applied Mathematics and Computer Science 22(3): 585-600, DOI: 10.2478/v10006-012-0045-2. Zbl1333.60192
  24. Zupan, B., Demsar, J., Kattan, M.W., Ohori, M., Graefen, M., Bohanec, M. and Beck, J.R. (2001). Orange and decisions-at-hand: Bridging predictive data mining and decision support, Proceedings of the ECML/PKDD Workshop on Integrating Aspects of Data Mining, Decision Support and Meta-Learning, Freiburg, Germany, pp. 151-162. 
