Employing different loss functions for the classification of images via supervised learning

Radu Boţ; André Heinrich; Gert Wanka

Open Mathematics (2014)

  • Volume: 12, Issue: 2, page 381-394
  • ISSN: 2391-5455

Abstract

Supervised learning methods are powerful techniques to learn a function from a given set of labeled data, the so-called training data. In this paper the support vector machines approach is applied to an image classification task. Starting with the corresponding Tikhonov regularization problem, reformulated as a convex optimization problem, we introduce a conjugate dual problem to it and prove that, whenever strong duality holds, the function to be learned can be expressed via the dual optimal solutions. Corresponding dual problems are then derived for different loss functions. The theoretical results are applied by numerically solving a classification task using high dimensional real-world data in order to obtain optimal classifiers. The results demonstrate the excellent performance of support vector classification for this particular problem.
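
The paper derives kernel classifiers from the dual optimal solutions and reports numerical experiments in MATLAB (cf. reference [32]). As a purely illustrative sketch of the comparison described in the abstract, namely one classification task solved with several loss functions, the following Python snippet fits linear support vector classifiers with the hinge, squared hinge and least-squares losses. The digits data set, the scikit-learn solvers and the parameter values are assumptions made for this example and do not reproduce the authors' setup.

# Illustrative sketch only (not the authors' implementation): one image
# classification task solved with three different loss functions.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import RidgeClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

# Binary image classification task assumed for illustration: digits 3 vs. 8.
X, y = load_digits(return_X_y=True)
mask = (y == 3) | (y == 8)
X, y = X[mask], np.where(y[mask] == 3, 1, -1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Regularized classifiers with three loss functions: hinge, squared hinge,
# and least squares (the latter via a ridge fit on the +/-1 labels).
classifiers = {
    "hinge loss": LinearSVC(loss="hinge", C=1.0, max_iter=10000),
    "squared hinge loss": LinearSVC(loss="squared_hinge", C=1.0, max_iter=10000),
    "least-squares loss": RidgeClassifier(alpha=1.0),
}

for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    accuracy = accuracy_score(y_test, clf.predict(X_test))
    print(f"{name}: test accuracy = {accuracy:.3f}")

The hinge loss typically produces sparse dual solutions, while the squared hinge and least-squares losses are differentiable, which can simplify the corresponding dual problems; this trade-off is the kind of consideration that motivates comparing loss functions on a fixed classification task.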

How to cite


Radu Boţ, André Heinrich, and Gert Wanka. "Employing different loss functions for the classification of images via supervised learning." Open Mathematics 12.2 (2014): 381-394. <http://eudml.org/doc/269240>.

@article{RaduBoţ2014,
abstract = {Supervised learning methods are powerful techniques to learn a function from a given set of labeled data, the so-called training data. In this paper the support vector machines approach is applied to an image classification task. Starting with the corresponding Tikhonov regularization problem, reformulated as a convex optimization problem, we introduce a conjugate dual problem to it and prove that, whenever strong duality holds, the function to be learned can be expressed via the dual optimal solutions. Corresponding dual problems are then derived for different loss functions. The theoretical results are applied by numerically solving a classification task using high dimensional real-world data in order to obtain optimal classifiers. The results demonstrate the excellent performance of support vector classification for this particular problem.},
author = {Radu Boţ and André Heinrich and Gert Wanka},
journal = {Open Mathematics},
keywords = {Machine learning; Tikhonov regularization; Conjugate duality; Image classification},
language = {eng},
number = {2},
pages = {381-394},
title = {Employing different loss functions for the classification of images via supervised learning},
url = {http://eudml.org/doc/269240},
volume = {12},
year = {2014},
}

TY - JOUR
AU - Radu Boţ
AU - André Heinrich
AU - Gert Wanka
TI - Employing different loss functions for the classification of images via supervised learning
JO - Open Mathematics
PY - 2014
VL - 12
IS - 2
SP - 381
EP - 394
AB - Supervised learning methods are powerful techniques to learn a function from a given set of labeled data, the so-called training data. In this paper the support vector machines approach is applied to an image classification task. Starting with the corresponding Tikhonov regularization problem, reformulated as a convex optimization problem, we introduce a conjugate dual problem to it and prove that, whenever strong duality holds, the function to be learned can be expressed via the dual optimal solutions. Corresponding dual problems are then derived for different loss functions. The theoretical results are applied by numerically solving a classification task using high dimensional real-world data in order to obtain optimal classifiers. The results demonstrate the excellent performance of support vector classification for this particular problem.
LA - eng
KW - Machine learning; Tikhonov regularization; Conjugate duality; Image classification
UR - http://eudml.org/doc/269240
ER -

References

  1. [1] Aronszajn N., Theory of reproducing kernels, Trans. Amer. Math. Soc., 1950, 68, 337–404 http://dx.doi.org/10.1090/S0002-9947-1950-0051437-7 Zbl0037.20701
  2. [2] Boţ R.I., Conjugate Duality in Convex Optimization, Lecture Notes in Econom. and Math. Systems, 637, Springer, Berlin-Heidelberg, 2010 Zbl1190.90002
  3. [3] Boţ R.I., Csetnek E.R., Heinrich A., On the convergence rate improvement of a primal-dual splitting algorithm for solving monotone inclusion problems, SIAM J. Optim., 2013, 23(4), 2011–2036 http://dx.doi.org/10.1137/12088255X Zbl1314.47102
  4. [4] Boţ R.I., Lorenz N., Optimization problems in statistical learning: Duality and optimality conditions, European J. Oper. Res., 2011, 213(2), 395–404 http://dx.doi.org/10.1016/j.ejor.2011.03.021 Zbl1222.90079
  5. [5] Boyd S., Parikh N., Chu E., Peleato B., Eckstein J., Distributed optimization and statistical learning via the alternating direction method of multipliers, Foundations and Trends® in Machine Learning, 2010, 3(1), 1–122 http://dx.doi.org/10.1561/2200000016 Zbl1229.90122
  6. [6] Chapelle O., Haffner P., Vapnik V.N., Support vector machines for histogram-based image classification, IEEE Trans. Neural Networks, 1999, 10(5), 1055–1064 http://dx.doi.org/10.1109/72.788646
  7. [7] Geiger C., Kanzow C., Theorie und Numerik Restringierter Optimierungsaufgaben, Springer-Lehrbuch Masterclass, Springer, Berlin-New York, 2002 http://dx.doi.org/10.1007/978-3-642-56004-0 
  8. [8] Joachims T., Learning to Classify Text Using Support Vector Machines, Kluwer Internat. Ser. Engrg. Comput. Sci., 668, Kluwer, Boston, 2002 http://dx.doi.org/10.1007/978-1-4615-0907-3 
  9. [9] Kim K., Financial time series forecasting using support vector machines, Neurocomputing, 2003, 55(1–2), 307–319 http://dx.doi.org/10.1016/S0925-2312(03)00372-2 
  10. [10] Lal T.N., Chapelle O., Schölkopf B., Combining a filter method with SVMs, In: Feature Extraction, Studies in Fuzziness and Soft Computing, 207, Springer, Berlin, 2006, 439–445 http://dx.doi.org/10.1007/978-3-540-35488-8_21 
  11. [11] Mota J.F.C., Xavier J.M.F., Aguiar P.M.Q., Püschel M., D-ADMM: A communication-efficient distributed algorithm for separable optimization, preprint available at http://arxiv.org/abs/1202.2805 
  12. [12] Noble W.S., Support vector machine application in computational biology, In: Kernel Methods in Computational Biology, MIT Press, 2004, 71–92 
  13. [13] Pedroso J.P., Murata N., Support vector machines with different norms: motivation, formulations and results, Pattern Recognition Lett., 2001, 22(12), 1263–1272 http://dx.doi.org/10.1016/S0167-8655(01)00071-X Zbl0990.68117
  14. [14] Rifkin R.M., Lippert R.A., Value regularization and Fenchel duality, J. Mach. Learn. Res., 2007, 8(3), 441–479 Zbl1222.49052
  15. [15] Rockafellar R.T., Convex Analysis, Princeton Math. Ser., 28, Princeton University Press, Princeton, 1970 Zbl0193.18401
  16. [16] Ruschhaupt M., Huber W., Poustka A., Mansmann U., A compendium to ensure computational reproducibility in high-dimensional classification tasks, Stat. Appl. Genet. Mol. Biol., 2004, 3, #37 Zbl1082.92025
  17. [17] Shawe-Taylor J., Cristianini N., Kernel Methods for Pattern Analysis, Cambridge University Press, Cambridge, 2004 http://dx.doi.org/10.1017/CBO9780511809682 Zbl0994.68074
  18. [18] Sra S., Nowozin S., Wright S.J., Optimization for Machine Learning, MIT Press, Cambridge-London, 2011 
  19. [19] Steinwart I., How to compare different loss functions and their risks, Constr. Approx., 2007, 26(2), 225–287 http://dx.doi.org/10.1007/s00365-006-0662-3 Zbl1127.68089
  20. [20] Stoppiglia H., Dreyfus G., Dubois R., Oussar Y., Ranking a random feature for variable and feature selection, J. Mach. Learn. Res., 2003, 3, 1399–1414 Zbl1102.68598
  21. [21] Suykens J.A.K., Vandewalle J., Least squares support vector machine classifiers, Neural Processing Letters, 1999, 9(3), 293–300 http://dx.doi.org/10.1023/A:1018628609742 
  22. [22] Tibshirani R., Regression shrinkage and selection via the lasso, J. Roy. Statist. Soc. Ser. B, 1996, 58(1), 267–288 Zbl0850.62538
  23. [23] Tibshirani R., Saunders M., Rosset S., Zhu J., Knight K., Sparsity and smoothness via the fused lasso, J. R. Stat. Soc. Ser. B Stat. Methodol., 2005, 67(1), 91–108 http://dx.doi.org/10.1111/j.1467-9868.2005.00490.x Zbl1060.62049
  24. [24] Tikhonov A.N., Arsenin V.Ya., Solutions of Ill-posed Problems, Scripta Series in Mathematics, Winston & Sons, Washington DC, 1977 Zbl0354.65028
  25. [25] Van Gestel T., Baesens B., Garcia J., Van Dijcke P., A support vector machine approach to credit scoring, Banken Financiewezen, 2003, 2, 73–82 
  26. [26] Vapnik V.N., The Nature of Statistical Learning Theory, Springer, New York, 1995 http://dx.doi.org/10.1007/978-1-4757-2440-0 Zbl0833.62008
  27. [27] Vapnik V.N., Statistical Learning Theory, Adapt. Learn. Syst. Signal Process. Commun. Control, John Wiley & Sons, New York, 1998 Zbl0935.62007
  28. [28] Varma S., Simon R., Bias in error estimation when using cross-validation for model selection, BMC Bioinformatics, 2006, 7, #91 
  29. [29] Wahba G., Spline Models for Observational Data, CBMS-NSF Regional Conf. Ser. in Appl. Math., 59, SIAM, Philadelphia, 1990 http://dx.doi.org/10.1137/1.9781611970128 
  30. [30] Xiang D.-H., Zhou D.-X., Classification with Gaussians and convex loss, J. Mach. Learn. Res., 2009, 10, 1447–1468 Zbl1235.68207
  31. [31] Ying Y., Huang K., Campbell C., Sparse metric learning via smooth optimization, In: Advances in Neural Information Processing Systems, 22, NIPS Foundation, 2009, 2205–2213 
  32. [32] MATLAB 8.0, The MathWorks, 2012, Natick, Massachusetts 
