On the Jensen-Shannon divergence and the variation distance for categorical probability distributions

Jukka Corander; Ulpu Remes; Timo Koski

Kybernetika (2021)

  • Volume: 57, Issue: 6, pages 879-907
  • ISSN: 0023-5954

Abstract

We establish a decomposition of the Jensen-Shannon divergence into a linear combination of a scaled Jeffreys' divergence and a reversed Jensen-Shannon divergence. Upper and lower bounds for the Jensen-Shannon divergence are then found in terms of the squared (total) variation distance. The derivations rely upon the Pinsker inequality and the reverse Pinsker inequality. We use these bounds to prove the asymptotic equivalence of the maximum likelihood estimate and minimum Jensen-Shannon divergence estimate as well as the asymptotic consistency of the minimum Jensen-Shannon divergence estimate. These are key properties for likelihood-free simulator-based inference.
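For readers who want to experiment with the quantities named in the abstract, the short Python sketch below implements the textbook definitions of the Kullback-Leibler divergence, the Jensen-Shannon divergence, Jeffreys' divergence, and the total variation distance for categorical distributions, and checks the classical Pinsker inequality KL(P||Q) >= 2*TV(P,Q)^2 (natural logarithms) on a toy example. It is not code from the article: the function names and the example distributions are illustrative assumptions, and the paper's own bounds relating the Jensen-Shannon divergence to the squared variation distance are not reproduced here.

# Minimal sketch (not from the article): standard divergence definitions for
# categorical distributions, plus a numerical check of Pinsker's inequality.

import numpy as np

def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p||q) in nats for categorical distributions."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0  # terms with p_i = 0 contribute 0 by convention
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def jensen_shannon(p, q):
    """Jensen-Shannon divergence: average KL to the midpoint mixture m = (p+q)/2."""
    m = 0.5 * (np.asarray(p, dtype=float) + np.asarray(q, dtype=float))
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def jeffreys(p, q):
    """Jeffreys' divergence: symmetrised KL, KL(p||q) + KL(q||p)."""
    return kl_divergence(p, q) + kl_divergence(q, p)

def total_variation(p, q):
    """Total variation distance: half the L1 distance between probability vectors."""
    return 0.5 * float(np.sum(np.abs(np.asarray(p, dtype=float) - np.asarray(q, dtype=float))))

if __name__ == "__main__":
    # Illustrative example distributions (hypothetical, not taken from the paper).
    p = np.array([0.5, 0.3, 0.2])
    q = np.array([0.2, 0.5, 0.3])
    tv = total_variation(p, q)
    print("JSD(p,q)      =", jensen_shannon(p, q))
    print("Jeffreys(p,q) =", jeffreys(p, q))
    print("TV(p,q)       =", tv)
    # Pinsker's inequality for the KL divergence (natural log): KL(p||q) >= 2*TV^2.
    assert kl_divergence(p, q) >= 2 * tv ** 2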

How to cite


Corander, Jukka, Remes, Ulpu, and Koski, Timo. "On the Jensen-Shannon divergence and the variation distance for categorical probability distributions." Kybernetika 57.6 (2021): 879-907. <http://eudml.org/doc/297892>.

@article{Corander2021,
abstract = {We establish a decomposition of the Jensen-Shannon divergence into a linear combination of a scaled Jeffreys' divergence and a reversed Jensen-Shannon divergence. Upper and lower bounds for the Jensen-Shannon divergence are then found in terms of the squared (total) variation distance. The derivations rely upon the Pinsker inequality and the reverse Pinsker inequality. We use these bounds to prove the asymptotic equivalence of the maximum likelihood estimate and minimum Jensen-Shannon divergence estimate as well as the asymptotic consistency of the minimum Jensen-Shannon divergence estimate. These are key properties for likelihood-free simulator-based inference.},
author = {Corander, Jukka and Remes, Ulpu and Koski, Timo},
journal = {Kybernetika},
keywords = {blended divergences; Chan-Darwiche metric; likelihood-free inference; implicit maximum likelihood; reverse Pinsker inequality; simulator-based inference},
language = {eng},
number = {6},
pages = {879-907},
publisher = {Institute of Information Theory and Automation AS CR},
title = {On the Jensen-Shannon divergence and the variation distance for categorical probability distributions},
url = {http://eudml.org/doc/297892},
volume = {57},
year = {2021},
}

TY - JOUR
AU - Corander, Jukka
AU - Remes, Ulpu
AU - Koski, Timo
TI - On the Jensen-Shannon divergence and the variation distance for categorical probability distributions
JO - Kybernetika
PY - 2021
PB - Institute of Information Theory and Automation AS CR
VL - 57
IS - 6
SP - 879
EP - 907
AB - We establish a decomposition of the Jensen-Shannon divergence into a linear combination of a scaled Jeffreys' divergence and a reversed Jensen-Shannon divergence. Upper and lower bounds for the Jensen-Shannon divergence are then found in terms of the squared (total) variation distance. The derivations rely upon the Pinsker inequality and the reverse Pinsker inequality. We use these bounds to prove the asymptotic equivalence of the maximum likelihood estimate and minimum Jensen-Shannon divergence estimate as well as the asymptotic consistency of the minimum Jensen-Shannon divergence estimate. These are key properties for likelihood-free simulator-based inference.
LA - eng
KW - blended divergences; Chan-Darwiche metric; likelihood-free inference; implicit maximum likelihood; reverse Pinsker inequality; simulator-based inference
UR - http://eudml.org/doc/297892
ER -
