Factorized mutual information maximization

Thomas Merkh; Guido F. Montúfar

Kybernetika (2020)

  • Volume: 56, Issue: 5, pages 948-978
  • ISSN: 0023-5954

Abstract

We investigate the sets of joint probability distributions that maximize the average multi-information over a collection of margins. These functionals serve as proxies for maximizing the multi-information of a set of variables or the mutual information of two subsets of variables, at a lower computation and estimation complexity. We describe the maximizers and their relations to the maximizers of the multi-information and the mutual information.
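For illustration only, the quantities named in the abstract can be computed on small examples. The sketch below assumes the standard definition of multi-information (the sum of the single-variable entropies minus the joint entropy) and an unweighted average over a chosen collection of margins; the paper's exact functional and any weighting are not stated on this page, so those choices, and all function names, are hypothetical.

```python
import math
from itertools import combinations

def entropy(p):
    """Shannon entropy in bits of a distribution given as {outcome: prob}."""
    return -sum(q * math.log2(q) for q in p.values() if q > 0)

def marginal(joint, idx):
    """Marginalize a joint {tuple: prob} onto the coordinates listed in idx."""
    out = {}
    for x, q in joint.items():
        key = tuple(x[i] for i in idx)
        out[key] = out.get(key, 0.0) + q
    return out

def multi_information(joint, idx):
    """Sum of single-variable entropies minus the joint entropy of X_idx."""
    singles = sum(entropy(marginal(joint, (i,))) for i in idx)
    return singles - entropy(marginal(joint, idx))

def average_margin_multi_information(joint, margins):
    """Unweighted average of the multi-information over the given margins
    (an assumed form of the factorized objective, for illustration)."""
    return sum(multi_information(joint, m) for m in margins) / len(margins)

pairs = list(combinations(range(3), 2))  # all two-variable margins of 3 bits

# Three perfectly correlated bits: mass 1/2 on 000 and 1/2 on 111.
copies = {(0, 0, 0): 0.5, (1, 1, 1): 0.5}
copies_full = multi_information(copies, (0, 1, 2))
copies_avg = average_margin_multi_information(copies, pairs)

# Parity distribution: uniform on the bit strings with an even number of ones.
parity = {(0, 0, 0): 0.25, (0, 1, 1): 0.25, (1, 0, 1): 0.25, (1, 1, 0): 0.25}
parity_full = multi_information(parity, (0, 1, 2))
parity_avg = average_margin_multi_information(parity, pairs)

print(copies_full, copies_avg)
print(parity_full, parity_avg)
```

In this toy setting the correlated bits have positive multi-information both jointly and on every pair margin, while the parity distribution has positive joint multi-information but independent pairs, so an average over pair margins sees only part of the dependence structure.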

How to cite

Merkh, Thomas, and Guido F. Montúfar. "Factorized mutual information maximization." Kybernetika 56.5 (2020): 948-978. <http://eudml.org/doc/296942>.

@article{Merkh2020,
abstract = {We investigate the sets of joint probability distributions that maximize the average multi-information over a collection of margins. These functionals serve as proxies for maximizing the multi-information of a set of variables or the mutual information of two subsets of variables, at a lower computation and estimation complexity. We describe the maximizers and their relations to the maximizers of the multi-information and the mutual information.},
author = {Merkh, Thomas and Montúfar, Guido F.},
journal = {Kybernetika},
keywords = {multi-information; mutual information; divergence maximization; marginal specification problem; transportation polytope},
language = {eng},
number = {5},
pages = {948--978},
publisher = {Institute of Information Theory and Automation AS CR},
title = {Factorized mutual information maximization},
url = {http://eudml.org/doc/296942},
volume = {56},
year = {2020},
}

TY - JOUR
AU - Merkh, Thomas
AU - Montúfar, Guido F.
TI - Factorized mutual information maximization
JO - Kybernetika
PY - 2020
PB - Institute of Information Theory and Automation AS CR
VL - 56
IS - 5
SP - 948
EP - 978
AB - We investigate the sets of joint probability distributions that maximize the average multi-information over a collection of margins. These functionals serve as proxies for maximizing the multi-information of a set of variables or the mutual information of two subsets of variables, at a lower computation and estimation complexity. We describe the maximizers and their relations to the maximizers of the multi-information and the mutual information.
LA - eng
KW - multi-information
KW - mutual information
KW - divergence maximization
KW - marginal specification problem
KW - transportation polytope
UR - http://eudml.org/doc/296942
ER -

