Tree and local computations in a cross–entropy minimization problem with marginal constraints

Kybernetika (2010)

• Volume: 46, Issue: 4, page 621-654
• ISSN: 0023-5954

top

Abstract

top
In probability theory, Bayesian statistics, artificial intelligence and database theory the minimum cross-entropy principle is often used to estimate a distribution with a given set $P$ of marginal distributions under the proportionality assumption with respect to a given “prior” distribution $q$. Such an estimation problem admits a solution if and only if there exists an extension of $P$ that is dominated by $q$. In this paper we consider the case that $q$ is not given explicitly, but is specified as the maximum-entropy extension of an auxiliary set $Q$ of distributions. There are three problems that naturally arise: (1) the existence of an extension of a distribution set (such as $P$ and $Q$), (2) the existence of an extension of $P$ that is dominated by the maximum entropy extension of $Q$, (3) the numeric computation of the minimum cross-entropy extension of $P$ with respect to the maximum entropy extension of $Q$. In the spirit of a divide-and-conquer approach, we prove that, for each of the three above-mentioned problems, the global solution can be easily obtained by combining the solutions to subproblems defined at node level of a suitable tree.

How to cite

top

Malvestuto, Francesco M.. "Tree and local computations in a cross–entropy minimization problem with marginal constraints." Kybernetika 46.4 (2010): 621-654. <http://eudml.org/doc/196453>.

@article{Malvestuto2010,
abstract = {In probability theory, Bayesian statistics, artificial intelligence and database theory the minimum cross-entropy principle is often used to estimate a distribution with a given set $P$ of marginal distributions under the proportionality assumption with respect to a given “prior” distribution $q$. Such an estimation problem admits a solution if and only if there exists an extension of $P$ that is dominated by $q$. In this paper we consider the case that $q$ is not given explicitly, but is specified as the maximum-entropy extension of an auxiliary set $Q$ of distributions. There are three problems that naturally arise: (1) the existence of an extension of a distribution set (such as $P$ and $Q$), (2) the existence of an extension of $P$ that is dominated by the maximum entropy extension of $Q$, (3) the numeric computation of the minimum cross-entropy extension of $P$ with respect to the maximum entropy extension of $Q$. In the spirit of a divide-and-conquer approach, we prove that, for each of the three above-mentioned problems, the global solution can be easily obtained by combining the solutions to subproblems defined at node level of a suitable tree.},
author = {Malvestuto, Francesco M.},
journal = {Kybernetika},
keywords = {cross-entropy; acyclic hypergraph; connection tree; junction tree; probabilistic database; relational database; cross-entropy; acyclic hypergraph; connection tree; junction tree; probabilistic database; relational database},
language = {eng},
number = {4},
pages = {621-654},
publisher = {Institute of Information Theory and Automation AS CR},
title = {Tree and local computations in a cross–entropy minimization problem with marginal constraints},
url = {http://eudml.org/doc/196453},
volume = {46},
year = {2010},
}

TY - JOUR
AU - Malvestuto, Francesco M.
TI - Tree and local computations in a cross–entropy minimization problem with marginal constraints
JO - Kybernetika
PY - 2010
PB - Institute of Information Theory and Automation AS CR
VL - 46
IS - 4
SP - 621
EP - 654
AB - In probability theory, Bayesian statistics, artificial intelligence and database theory the minimum cross-entropy principle is often used to estimate a distribution with a given set $P$ of marginal distributions under the proportionality assumption with respect to a given “prior” distribution $q$. Such an estimation problem admits a solution if and only if there exists an extension of $P$ that is dominated by $q$. In this paper we consider the case that $q$ is not given explicitly, but is specified as the maximum-entropy extension of an auxiliary set $Q$ of distributions. There are three problems that naturally arise: (1) the existence of an extension of a distribution set (such as $P$ and $Q$), (2) the existence of an extension of $P$ that is dominated by the maximum entropy extension of $Q$, (3) the numeric computation of the minimum cross-entropy extension of $P$ with respect to the maximum entropy extension of $Q$. In the spirit of a divide-and-conquer approach, we prove that, for each of the three above-mentioned problems, the global solution can be easily obtained by combining the solutions to subproblems defined at node level of a suitable tree.
LA - eng
KW - cross-entropy; acyclic hypergraph; connection tree; junction tree; probabilistic database; relational database; cross-entropy; acyclic hypergraph; connection tree; junction tree; probabilistic database; relational database
UR - http://eudml.org/doc/196453
ER -

References

top
1. Asmussen, S., Edwards, D., 10.1093/biomet/70.3.567, Biometrika 70 (1983), 367–378. (1983) Zbl0549.62041MR0725370DOI10.1093/biomet/70.3.567
2. Bacharach, M., Biproportional Matrices and Input-Output Change, Cambridge University Press, Cambridge 1970. (1970) Zbl0195.49705MR0263409
3. Badsberg, J.-H., Malvestuto, F. M., 10.1016/S0167-9473(01)00013-5, Comput. Statist. Data Analysis 37 (2001), 297–322. (2001) Zbl1061.65500MR1856676DOI10.1016/S0167-9473(01)00013-5
4. Beeri, C., Vardi, M., On the Properties of Full Join Dependencies, Adv. Database Theory I, Plenum Press, New York 1981. (1981)
5. Beeri, C., Fagin, R., Maier, D., Yannakakis, M., 10.1145/2402.322389, J. Assoc. Comput. Mach. 30 (1983), 479–513. (1983) Zbl0624.68087MR0709830DOI10.1145/2402.322389
6. Berge, C., Hypergraphs, North-Holland, Amsterdam 1989. (1989) Zbl0674.05001MR1013569
7. Berge, C., Discrete Multivariate Analysis, MIT Press, Cambridge 1975. (1975)
8. Csiszár, I., 10.1214/aop/1176996454, Ann. Probab. 3 (1975), 146–158. (1975) MR0365798DOI10.1214/aop/1176996454
9. Csiszár, I., Maxent, mathematics, and information theory, In: Proc. Internat. Workshop on “Maximum entropy and Bayesian methods", 1995, pp. 35–50. (1995) MR1446714
10. Dall’Aglio, G., Kotz, K., Salinetti, G., Advances in Probability Distributions with Given Marginals, Kluwer Academic Pub., Dordrecht, Boston, London 1991. (1991) MR1215942
11. Darroch, J. N., Ratcliff, D., 10.1214/aoms/1177692379, Ann. Math. Statist. 43 (1972), 1470–1480. (1972) Zbl0251.62020MR0345337DOI10.1214/aoms/1177692379
12. Deming, W. E., Statistical Adjustment of Data, Dover Pub., New York 1943. (1943) Zbl0060.31504MR0009819
13. Deming, W. E., Stephan, F. F., 10.1214/aoms/1177731829, Ann. Math. Statist. 11 (1940), 427–444. (1940) Zbl0024.05502MR0003527DOI10.1214/aoms/1177731829
14. Endo, Y., Takemura, A. I., 10.1016/j.csda.2008.11.013, Comput. Statist. Data Analysis 53 (2009), 966–978. (2009) MR2657062DOI10.1016/j.csda.2008.11.013
15. Fienberg, S. E., 10.1214/aoms/1177696968, Ann. Math. Statist. 41 (1970), 907–917. (1970) Zbl0198.23401MR0266394DOI10.1214/aoms/1177696968
16. Fienberg, S. E., Meyer, M. M., Iterative proportional fitting, In: Encyclopedia of Statistical Sciences (S. Kotz, N. L. Johnson, and C. B. Read, eds.), 4, John Wiley and Sons, New York, pp.  275–279.
17. Haberman, S. J., Log-linear Models for Contingency Tables, University of Chicago Press, Chicago 1974. (1974)
18. Ireland, C. T., Kullback, S., 10.1093/biomet/55.1.179, Biometrika 55 (1968), 179–188. (1968) Zbl0155.26701MR0229329DOI10.1093/biomet/55.1.179
19. Jiroušek, R., Composition of probability measures on finite spaces, In: Proc. XIII Internat. Conf. Uncertainty in Artificial Intelligence 1997, pp. 274–281. (1997)
20. Jiroušek, R., Přeučil, S., 10.1016/0167-9473(93)E0055-9, Comput. Statist. Data Analysis 19 (1995), 177–189. (1995) DOI10.1016/0167-9473(93)E0055-9
21. Johnson, R. W., 10.1109/TIT.1979.1056113, IEEE Trans. Inform. Theory 25 (1979), 709–716. (1979) Zbl0422.60016MR0551270DOI10.1109/TIT.1979.1056113
22. Kellerer, H. G., 10.1007/BF00534912, Zeitschrift Wahrscheinlichkeitstheorie und Verw. Gebiete 3 (1964), 247–270. (1964) Zbl0126.34003MR0175158DOI10.1007/BF00534912
23. Kellerer, H. G., 10.1007/BF01360315, Math. Annalen 153 (1964), 168–198. (1964) Zbl0118.05003MR0161956DOI10.1007/BF01360315
24. Kern-Isberner, G., 10.1016/S0004-3702(97)00068-4, Artificial Intelligence 98 (1998), 169–208. (1998) Zbl0903.68181MR1614388DOI10.1016/S0004-3702(97)00068-4
25. Ku, H. H., Kullback, S., Interaction in multidimensional contingency tables: an information-theoretic approach, J. Res. Nat. Bur. Standards - Math. Sci. 72 B (1968), 159–199. (1968) Zbl0274.62036MR0258223
26. Lauritzen, S. L., Graphical Models, Oxford Science Pub., Clarendom Press, Oxford 1996. (1996) MR1419991
27. Lauritzen, S. L., Speed, M. P., Vijayan, K., 10.1017/S1446788700027300, J. Austral. Math. Soc. Ser. A 36 (1984), 12–29. (1984) Zbl0533.05046MR0719998DOI10.1017/S1446788700027300
28. Lauritzen, S. L., Spiegelhalter, D. J., Local computations with probabilities on graphical structures and their application to expert systems, J. Roy. Stat. Soc. Ser. B 50 (1988), 157–224. (1988) Zbl0684.68106MR0964177
29. Leimer, G., 10.1016/0012-365X(93)90510-Z, Discrete Math. 113 (1993), 99–123. (1993) Zbl0793.05128MR1212872DOI10.1016/0012-365X(93)90510-Z
30. Leontief, W. W., The Structure of American Economy 1919–1929, Oxford University Press, New York 1941. (1941)
31. Leontief, W. W., Strout, A., Multiregional input-output analysis, In: Structural Interdependence and Economic Development, 1963, pp. 119–169. (1963)
32. Madigan, D., Mosurski, K., 10.1093/biomet/77.2.315, Biometrika 77 (1990), 315–319. (1990) Zbl0731.62113MR1064803DOI10.1093/biomet/77.2.315
33. Madigan, D., Mosurski, K., Errata: An extension of the results of Asmussen and Edwards on collapsibility in contingency tables, Biometrika 86 (1999) 973. (1999) MR1741994
34. Maier, D., The Theory of Relational Databases, Computer Science Press, 1983. (http://web.cecs.pdx.edu/ maier/TheoryBook/TRD.html) (1983) Zbl0519.68082MR0691493
35. Maier, D., Ullman, J. D., 10.1016/0304-3975(84)90030-6, Theoret. Comput. Sci. 32 (1984), 185–199. (1984) Zbl0557.05054MR0761167DOI10.1016/0304-3975(84)90030-6
36. Malvestuto, F. M., Answering queries in categorical data bases, In: Proc. VI ACM Symp. Principles of Database Systems 1987, pp. 87–96. (1987)
37. Malvestuto, F. M., 10.1016/0012-365X(88)90178-1, Discrete Math. 69 (1988), 61–77. (1988) Zbl0637.60021MR0935028DOI10.1016/0012-365X(88)90178-1
38. Malvestuto, F. M., 10.1016/0167-9473(89)90046-7, Computat. Statist. Data Anal. 8 (1989), 299–311. (1989) Zbl0726.62012MR1028405DOI10.1016/0167-9473(89)90046-7
39. Malvestuto, F. M., 10.1007/BF00162528, Statist. Computing 6 (1996), 169–176. (1996) DOI10.1007/BF00162528
40. Malvestuto, F. M., 10.1023/A:1008979300007, Statist. Computing 11 (2001), 155–169. (2001) MR1837135DOI10.1023/A:1008979300007
41. Malvestuto, F. M., 10.1023/A:1014551721406, Ann. Math. Artificial Intelligence 35 (2002), 253–285. (2002) Zbl1001.68033MR1899954DOI10.1023/A:1014551721406
42. Malvestuto, F. M., 10.1016/j.disc.2009.01.003, Discrete Math. 309 (2009), 4287–4298. (2009) Zbl1211.05093MR2519164DOI10.1016/j.disc.2009.01.003
43. Malvestuto, F. M., Moscarini, M., 10.1006/jcss.1998.1570, J. Comput. System Sci. 56 (1998), 299–309. (1998) Zbl0913.68060MR1633981DOI10.1006/jcss.1998.1570
44. Malvestuto, F. M., Moscarini, M., 10.1016/S0304-3975(98)00128-5, Theoret. Comput. Sci. 237 (2000), 57–79. (2000) Zbl0939.68089MR1756201DOI10.1016/S0304-3975(98)00128-5
45. Malvestuto, F. M., Pourabbas, E., Customized answers to summary queries via aggregate views, In: Proc. XVI Intl. Conf. Scientific & Statistical Database Management 2004, pp. 193–202. (2004)
46. Malvestuto, F. M., Pourabbas, E., Local computation of answers to table queries on summary databases, In: Proc. XVII Intl. Conf. Scientific & Statistical Database Management 2005, pp. 263–272. (2005)
47. Matúš, F., Discrete marginal problem for complex measures, Kybernetika 24 (1988), 36–46. (1988) MR0936552
48. Matúš, F., On the maximum-entropy extensions of probability measures over undirected graphs, In: Proc. III Workshop Uncertainty Processing in Expert Systems 1994, pp. 181–198. (1994)
49. Matúš, F., Flusser, J., 10.1109/34.254058, IEEE Trans. Pattern Analysis and Machine Intelligence 15 (1993), 996–1006. (1993) DOI10.1109/34.254058
50. Purcell, N. J., Kish, L., 10.2307/2530340, Biometrics 35 (1979), 365–384. (1979) Zbl0419.62092MR0535774DOI10.2307/2530340
51. Purcell, N. J., Kish, L., 10.2307/1402400, Internat. Statist. Rev. 48 (1980), 3–18. (1980) Zbl0433.62080DOI10.2307/1402400
52. Rüschendorf, L., 10.1214/aos/1176324703, Ann. Statist. 23 (1995), 1160–1174. (1995) MR1353500DOI10.1214/aos/1176324703
53. Shore, J. E., Johnson, R. W., 10.1109/TIT.1981.1056373, IEEE Trans. Inform. Theory 27 (1981), 472–482. (1981) Zbl0459.94008MR0635526DOI10.1109/TIT.1981.1056373
54. Stephan, F. F., 10.1214/aoms/1177731604, Ann. Math. Statist. 13 (1942), 166–178. (1942) MR0006674DOI10.1214/aoms/1177731604
55. Stone, R., Brown, A., A Computable Model for Economic Growth: A Programme for Growth, No. 1, Chapman Hall, London 1962. (1962)
56. Tarjan, R. E., Yannakakis, M., 10.1137/0213035, SIAM J. Comput. 13 (1984), 566–579. (1984) MR0749707DOI10.1137/0213035
57. Vomlel, J., 10.3166/jancl.14.367-386, J. Appl. Non-Classical Logics 14 (2004), 365–386. (2004) Zbl1185.68699DOI10.3166/jancl.14.367-386
58. Vorob’ev, N. N., 10.1137/1107014, Theor. Prob. Appl. 7 (1962), 147–163. (1962) DOI10.1137/1107014
59. Vorob’ev, N. N., 10.1137/1108047, Theor. Prob. Appl. 8 (1963), 420–429. (1963) MR0169295DOI10.1137/1108047
60. Yannakakis, M., 10.1137/0602010, SIAM J. Algebraic Discrete Mathematics 2 (1981), 77–79. (1981) Zbl0496.68033MR0604513DOI10.1137/0602010
61. Yannakakis, M., Algorithms for acyclic database schemes, In: Proc. VII Internat. Conf. Very Large Data Bases 1981, pp. 82–94. (1981)

top

NotesEmbed?

top

You must be logged in to post comments.

To embed these notes on your page include the following JavaScript code on your page where you want the notes to appear.

Only the controls for the widget will be shown in your chosen language. Notes will be shown in their authored language.

Tells the widget how many notes to show per page. You can cycle through additional notes using the next and previous controls.

Note: Best practice suggests putting the JavaScript code just before the closing </body> tag.