Generalized minimizers of convex integral functionals, Bregman distance, Pythagorean identities

Imre Csiszár; František Matúš

Kybernetika (2012)

  • Volume: 48, Issue: 4, page 637-689
  • ISSN: 0023-5954

Abstract

top
Integral functionals based on convex normal integrands are minimized subject to finitely many moment constraints. The integrands are finite on the positive and infinite on the negative numbers, strictly convex but not necessarily differentiable. The minimization is viewed as a primal problem and studied together with a dual one in the framework of convex duality. The effective domain of the value function is described by a conic core, a modification of the earlier concept of convex core. Minimizers and generalized minimizers are explicitly constructed from solutions of modified dual problems, not assuming the primal constraint qualification. A generalized Pythagorean identity is presented using Bregman distance and a correction term for lack of essential smoothness in integrands. Results are applied to minimization of Bregman distances. Existence of a generalized dual solution is established whenever the dual value is finite, assuming the dual constraint qualification. Examples of ‘irregular’ situations are included, pointing to the limitations of generality of certain key results.

How to cite

top

Csiszár, Imre, and Matúš, František. "Generalized minimizers of convex integral functionals, Bregman distance, Pythagorean identities." Kybernetika 48.4 (2012): 637-689. <http://eudml.org/doc/247058>.

@article{Csiszár2012,
abstract = {Integral functionals based on convex normal integrands are minimized subject to finitely many moment constraints. The integrands are finite on the positive and infinite on the negative numbers, strictly convex but not necessarily differentiable. The minimization is viewed as a primal problem and studied together with a dual one in the framework of convex duality. The effective domain of the value function is described by a conic core, a modification of the earlier concept of convex core. Minimizers and generalized minimizers are explicitly constructed from solutions of modified dual problems, not assuming the primal constraint qualification. A generalized Pythagorean identity is presented using Bregman distance and a correction term for lack of essential smoothness in integrands. Results are applied to minimization of Bregman distances. Existence of a generalized dual solution is established whenever the dual value is finite, assuming the dual constraint qualification. Examples of ‘irregular’ situations are included, pointing to the limitations of generality of certain key results.},
author = {Csiszár, Imre, Matúš, František},
journal = {Kybernetika},
keywords = {maximum entropy; moment constraint; generalized primal/dual solutions; normal integrand; minimizing sequence; convex duality; Bregman projection; conic core; generalized exponential family; inference principles; maximum entropy; moment constraint; generalized primal/dual solutions; normal integrand; minimizing sequence; convex duality; Bregman projection; conic core; generalized exponential family; inference principles},
language = {eng},
number = {4},
pages = {637-689},
publisher = {Institute of Information Theory and Automation AS CR},
title = {Generalized minimizers of convex integral functionals, Bregman distance, Pythagorean identities},
url = {http://eudml.org/doc/247058},
volume = {48},
year = {2012},
}

TY - JOUR
AU - Csiszár, Imre
AU - Matúš, František
TI - Generalized minimizers of convex integral functionals, Bregman distance, Pythagorean identities
JO - Kybernetika
PY - 2012
PB - Institute of Information Theory and Automation AS CR
VL - 48
IS - 4
SP - 637
EP - 689
AB - Integral functionals based on convex normal integrands are minimized subject to finitely many moment constraints. The integrands are finite on the positive and infinite on the negative numbers, strictly convex but not necessarily differentiable. The minimization is viewed as a primal problem and studied together with a dual one in the framework of convex duality. The effective domain of the value function is described by a conic core, a modification of the earlier concept of convex core. Minimizers and generalized minimizers are explicitly constructed from solutions of modified dual problems, not assuming the primal constraint qualification. A generalized Pythagorean identity is presented using Bregman distance and a correction term for lack of essential smoothness in integrands. Results are applied to minimization of Bregman distances. Existence of a generalized dual solution is established whenever the dual value is finite, assuming the dual constraint qualification. Examples of ‘irregular’ situations are included, pointing to the limitations of generality of certain key results.
LA - eng
KW - maximum entropy; moment constraint; generalized primal/dual solutions; normal integrand; minimizing sequence; convex duality; Bregman projection; conic core; generalized exponential family; inference principles; maximum entropy; moment constraint; generalized primal/dual solutions; normal integrand; minimizing sequence; convex duality; Bregman projection; conic core; generalized exponential family; inference principles
UR - http://eudml.org/doc/247058
ER -

References

top
  1. S. M. Ali, S. D. Silvey, A general class of coefficients of divergence of one distribution from another., J. Roy. Statist. Soc. Ser. B 28 (1966) 131-142. Zbl0203.19902MR0196777
  2. S. Amari, H. Nagaoka, Methods of Information Geometry., Transl. Math. Monographs 191, Oxford Univ. Press, 2000. Zbl1146.62001MR1800071
  3. S. Amari, A. Cichocki, Information geometry of divergence functions., Bull. Polish Acad. Sci. 58 (2010) 183-194. 
  4. O. Barndorff-Nielsen, Information and Exponential Families in Statistical Theory., Wiley, 1978. Zbl0387.62011MR0489333
  5. H. H. Bauschke, J. M. Borwein, Legendre functions and the method of random Bregman projections., J. Convex Anal. 4 (1997), 27-67. Zbl0894.49019MR1459881
  6. H. H. Bauschke, J. M. Borwein, P. L. Combettes, 10.1142/S0219199701000524, Comm. Contemp. Math. 3 (2001), 615-647. Zbl1032.49025MR1869107DOI10.1142/S0219199701000524
  7. A. Ben-Tal, A. Charnes, A dual optimization framework for some problems of information theory and statistics., Problems Control Inform. Theory 8 (1979), 387-401. Zbl0437.90078MR0553884
  8. J. M. Borwein, A. S. Lewis, 10.1137/0329017, SIAM J. Control Optim. 29 (1991), 325-338. Zbl0797.49030MR1092730DOI10.1137/0329017
  9. J. M. Borwein, A. S. Lewis, 10.1137/0801014, SIAM J. Optim. 1 (1991), 191-205. Zbl0756.41037MR1098426DOI10.1137/0801014
  10. J. M. Borwein, A. S. Lewis, 10.1137/0803012, SIAM J. Optim. 3 (1993), 248-267. MR1215444DOI10.1137/0803012
  11. J. M. Borwein, A. S. Lewis, D. Noll, 10.1287/moor.21.2.442, Math. Oper. Res. 21 (1996), 442-468. MR1397223DOI10.1287/moor.21.2.442
  12. L. M. Bregman, 10.1016/0041-5553(67)90040-7, USSR Comput. Math. and Math. Phys. 7 (1967), 200-217. Zbl0186.23807MR0215617DOI10.1016/0041-5553(67)90040-7
  13. M. Broniatowski, A. Keziou, Minimization of φ -divergences on sets of signed measures., Studia Sci. Math. Hungar. 43 (2006), 403-442. Zbl1121.28004MR2273419
  14. J. P. Burg, Maximum entropy spectral analysis., Paper presented at 37th Meeting of Soc. Explor. Geophysicists, Oklahoma City 1967. 
  15. J. P. Burg, Maximum entropy spectral analysis., Ph.D. Thesis, Dept. Geophysics, Stanford Univ., Stanford 1975. 
  16. Y. Censor, S. A. Zenios, Parallel Optimization., Oxford University Press, New York 1997. Zbl0945.90064MR1486040
  17. N. N. Chentsov, Statistical Decision Rules and Optimal Inference., Transl. Math. Monographs 53, American Math. Soc., Providence 1982. Russian original: Nauka, Moscow 1972. Zbl0484.62008MR0645898
  18. I. Csiszár, Eine informationstheoretische Ungleichung und ihre Anwendung auf den Beweis der Ergodizität von Markoffschen Ketten., Publ. Math. Inst. Hungar. Acad. Sci. 8 (1963), 85-108. Zbl0124.08703MR0164374
  19. I. Csiszár, Information-type measures of difference of probability distributions and indirect observations., Studia Sci. Math. Hungar. 2 (1967), 299-318. Zbl0157.25802MR0219345
  20. I. Csiszár, 10.1214/aop/1176996454, Ann. Probab. 3 (1975), 146-158. MR0365798DOI10.1214/aop/1176996454
  21. I. Csiszár, 10.1214/aop/1176993227, Ann. Probab. 12 (1984), 768-793. Zbl0544.60011MR0744233DOI10.1214/aop/1176993227
  22. I. Csiszár, 10.1214/aos/1176348385, Ann. Statist. 19 (1991), 2031-2066. Zbl0753.62003MR1135163DOI10.1214/aos/1176348385
  23. I. Csiszár, 10.1007/BF01874442, Acta Math. Hungar. 68 (1995), 1-2, 161-185. Zbl0837.62006MR1320794DOI10.1007/BF01874442
  24. I. Csiszár, F. Gamboa, E. Gassiat, 10.1109/18.796367, IEEE Trans. Inform. Theory 45 (1999), 2253-2270. Zbl0958.94002MR1725114DOI10.1109/18.796367
  25. I. Csiszár, F. Matúš, Convex cores of measures on d ., Studia Sci. Math. Hungar. 38 (2001), 177-190. MR1877777
  26. I. Csiszár, F. Matúš, 10.1109/TIT.2003.810633, IEEE Trans. Inform. Theory 49 (2003), 1474-1490. Zbl1063.94016MR1984936DOI10.1109/TIT.2003.810633
  27. I. Csiszár, F. Matúš, Generalized maximum likelihood estimates for infinite dimensional exponential families., In: Proc. Prague Stochastics'06, Prague 2006, pp. 288-297. 
  28. I. Csiszár, F. Matúš, 10.1007/s00440-007-0084-z, Probab. Theory Related Fields 141 (2008), 213-246. Zbl1133.62039MR2372970DOI10.1007/s00440-007-0084-z
  29. I. Csiszár, F. Matúš, On minimization of entropy functionals under moment constraints., In: Proc. ISIT 2008, Toronto, pp. 2101-2105. 
  30. I. Csiszár, F. Matúš, On minimization of multivariate entropy functionals., In: Proc. ITW 2009, Volos, Greece, pp. 96-100. 
  31. I. Csiszár, F. Matúš, Minimization of entropy functionals revisited., In: Proc. ISIT 2012, Cambridge, MA, pp. 150-154. 
  32. D. Dacunha-Castelle, F. Gamboa, Maximum d'entropie et problème des moments., Ann. Inst. H. Poincaré Probab. Statist. 26 (1990), 567-596. Zbl0788.62007MR1080586
  33. A. P. Dawid, P. D. Grünwald, 10.1214/009053604000000553, Ann. Statist. 32 (2004), 1367-1433. Zbl1048.62008MR2089128DOI10.1214/009053604000000553
  34. S. Eguchi, Information geometry and statistical pattern recognition., Sugaku Expositions, Amer. Math. Soc. 19 (2006), 197-216. MR2279777
  35. B. A. Frigyik, S. Srivastava, M. R. Gupta, 10.1109/TIT.2008.929943, IEEE Trans. Inform. Theory 54 (2008), 5130-5139. MR2589887DOI10.1109/TIT.2008.929943
  36. F. Gamboa, E. Gassiat, 10.1214/aos/1034276632, Ann. Statist. 25 (1997), 1, 328-350. Zbl0871.62010MR1429928DOI10.1214/aos/1034276632
  37. E. T. Jaynes, Information theory and statistical mechanics., Physical Review Ser. II 106 (1957), 620-630. Zbl0084.43701MR0087305
  38. L. Jones, C. Byrne, 10.1109/18.50370, IEEE Trans. Inform. Theory 36 (1990), 23-30. MR1043277DOI10.1109/18.50370
  39. S. Kullback, Information Theory and Statistics., John Wiley and Sons, New York 1959. Zbl0897.62003MR0103557
  40. S. Kullback, R. A. Leibler, 10.1214/aoms/1177729694, Ann. Math. Statist. 22 (1951), 79-86. Zbl0042.38403MR0039968DOI10.1214/aoms/1177729694
  41. C. Léonard, 10.1023/A:1017919422086, Acta Math. Hungar. 93 (2001), 281-325. Zbl1052.49017MR1925356DOI10.1023/A:1017919422086
  42. C. Léonard, Minimizers of energy functionals under not very integrable constraints., J. Convex Anal. 10 (2003), 63-68. MR1999902
  43. C. Léonard, 10.1016/j.jmaa.2008.04.048, J. Math. Anal. Appl. 346 (2008), 183-204. Zbl1152.49039MR2428283DOI10.1016/j.jmaa.2008.04.048
  44. C. Léonard, 10.1051/ps/2009003, ESAIM: Probability and Statistics 14 (2010), 343-381. Zbl1220.60018MR2795471DOI10.1051/ps/2009003
  45. F. Liese, I. Vajda, Convex Statistical Distances., Teubner Texte zur Mathematik 95, Teubner Verlag, Leipzig 1986. Zbl0656.62004MR0926905
  46. N. Murata, T. Takenouchi, T. Kanamori, S. Eguchi, 10.1162/089976604323057452, Neural Computation 16 (2004), 1437-1481. Zbl1102.68489DOI10.1162/089976604323057452
  47. R. T. Rockafellar, 10.2140/pjm.1968.24.525, Pacific J. Math. 24 (1968), 525-539. Zbl0324.90061MR0236689DOI10.2140/pjm.1968.24.525
  48. R. T. Rockafellar, Convex integral functionals and duality., In: Contributions to Nonlinear Functional Analysis (E. H. Zarantonello, ed.), Academic Press, New York 1971, pp. 215-236. Zbl0326.49008MR0390870
  49. R. T. Rockafellar, Convex Analysis., Princeton University Press, Princeton 1970. Zbl1011.49013MR0274683
  50. R. T. Rockafellar, R. J.-B. Wets, Variational Analysis., Springer Verlag, Berlin - Heidelberg - New York 2004. Zbl0888.49001MR1491362
  51. M. Teboulle, I. Vajda, 10.1109/18.179378, IEEE Trans. Inform. Theory 39 (1993), 297-301. Zbl0765.94001MR1211512DOI10.1109/18.179378
  52. F. Topsoe, Information-theoretical optimization techniques., Kybernetika 15 (1979), 8-27. MR0529888
  53. I. Vajda, Theory of Statistical Inference and Information., Kluwer Academic Puplishers, Dordrecht 1989. Zbl0711.62002

NotesEmbed ?

top

You must be logged in to post comments.

To embed these notes on your page include the following JavaScript code on your page where you want the notes to appear.

Only the controls for the widget will be shown in your chosen language. Notes will be shown in their authored language.

Tells the widget how many notes to show per page. You can cycle through additional notes using the next and previous controls.

    
                

Note: Best practice suggests putting the JavaScript code just before the closing </body> tag.