Article

MSC: 52B05, 60C05, 62E17
Full entry | PDF   (1.9 MB)
Keywords:
mixture model; non-negative tensor rank; perfect code; marginal polytope
Summary:
We study the problem of finding the smallest $m$ such that every element of an exponential family can be written as a mixture of $m$ elements of another exponential family. We propose an approach based on coverings and packings of the face lattice of the corresponding convex support polytopes and results from coding theory. We show that $m=q^{N-1}$ is the smallest number for which any distribution of $N$ $q$-ary variables can be written as mixture of $m$ independent $q$-ary variables. Furthermore, we show that any distribution of $N$ binary variables is a mixture of $m = 2^{N-(k+1)}(1+ 1/(2^k-1))$ elements of the $k$-interaction exponential family.
References:
[1] Amari, S.: Information geometry on hierarchical decomposition of stochastic interactions. IEEE Trans. Inform. Theory 47 (1999), 1701-1711.
[2] Amari, S., Nagaoka, H.: Methods of information geometry, Vol. 191. Oxford University Press, 2000. Translations of mathematical monographs. MR 1800071
[3] Ay, N., Knauf, A.: Maximizing multi-information. Kybernetika 42 (2006), 517-538. MR 2283503 | Zbl 1249.82011
[4] Ay, N., Montúfar, G. F., Rauh, J.: Selection criteria for neuromanifolds of stochastic dynamics. In: Advances in Cognitive Neurodynamics (III). Springer, 2011.
[5] Bishop, C. M.: Pattern Recognition and Machine Learning (Information Science and Statistics). Springer-Verlag, New York 2006. MR 2247587
[6] Bocci, C., Chiantini, L.: On the identifiability of binary segre products. J. Algebraic Geom. 5 (2011).
[7] Brown, L.: Fundamentals of Statistical Exponential Families: With Applications in Statistical Decision Theory. Institute of Mathematical Statistics, Hayworth 1986. MR 0882001 | Zbl 0685.62002
[8] Catalisano, M. V., Geramita, A. V., Gimigliano, A.: Secant varieties of $\P^1\times\dots\times\P^1$ ($n$-times) are not defective for $n\geq5$. J. Algebraic Geom. 20 (2011), 295-327. DOI 10.1090/S1056-3911-10-00537-0 | MR 2762993
[9] Diaconis, P.: Finite forms of de Finetti's theorem on exchangeability. Synthese 36 (1977), 271-281. DOI 10.1007/BF00486116 | MR 0517222 | Zbl 0397.60005
[10] Efron, B.: The geometry of exponential families. Ann. Statist. 6 (1978), 2, 362-376. DOI 10.1214/aos/1176344130 | MR 0471152 | Zbl 0436.62027
[11] Cohen, S. L. G., Honkala, I., Lobstein, A.: Covering Codes. Elsevier, 1997. Zbl 0874.94001
[12] Gale, D.: Neighborly and cyclic polytopes. In: Convexity: Proc. Seventh Symposium in Pure Mathematics of the American Mathematical Society 1961, pp. 225-233. MR 0152944 | Zbl 0137.41801
[13] Gawrilow, E., Joswig, M.: Polymake: a framework for analyzing convex polytopes. In: Polytopes - Combinatorics and Computation (G. Kalai and G. M. Ziegler, eds.), Birkhäuser 2000, pp. 43-74. MR 1785292 | Zbl 0960.68182
[14] Geiger, D., Meek, C., Sturmfels, B.: On the toric algebra of graphical models. Ann. Statist. 34 (2006), 1463-1492. DOI 10.1214/009053606000000263 | MR 2278364 | Zbl 1104.60007
[15] Gilbert, E.: A comparison of signalling alphabets. Bell System Techn. J. 31 (1052), 504-522.
[16] Gilula, Z.: Singular value decomposition of probability matrices: Probabilistic aspects of latent dichotomous variables. Biometrika 66 (1979), 2, 339-344. DOI 10.1093/biomet/66.2.339 | MR 0548203 | Zbl 0411.62003
[17] Grünbaum, B.: Convex Polytopes. Second edition. Springer-Verlag, New York 2003. MR 1976856
[18] Henk, M., Richter-Gebert, J., Ziegler, G. M.: Basic Properties of Convex Polytopes. CRC Press, Boca Raton 1997. MR 1730169 | Zbl 0911.52007
[19] Hoşten, S., Sullivant, S.: Gröbner bases and polyhedral geometry of reducible and cyclic models. J. Combin. Theory Ser. A 100 (2002), 2, 277-301. DOI 10.1006/jcta.2002.3301 | MR 1940337 | Zbl 1044.62065
[20] Kahle, T.: Neighborliness of marginal polytopes. Contrib. Algebra Geometry 51 (2010), 45-56. MR 2650476 | Zbl 1238.60012
[21] Kahle, T., Ay, N.: Support sets of distributions with given interaction structure. In: Proc. WUPES'06, 2006.
[22] Kahle, T., Wenzel, W., Ay, N.: Hierarchical models, marginal polytopes, and linear codes. Kybernetika 45 (2009), 189-208. MR 2518148 | Zbl 1167.94340
[23] Kalai, G.: Some aspects of the combinatorial theory of convex polytopes. 1993. MR 1322063 | Zbl 0804.52006
[24] Kingman, J. F. C.: Uses of exchangeability. Ann. Probab. 6 (1978), 2, 183-197. DOI 10.1214/aop/1176995566 | MR 0494344 | Zbl 0374.60064
[25] Lindsay, B. G.: Mixture models: theory, geometry, and applications. NSF-CBMS Regional Conference Series in Probability and Statistics. Institute of Mathematical Statistics, 1995. Zbl 1163.62326
[26] McLachlan, G., Peel, D.: Finite Mixture Models. Wiley Series in Probability and Statistics: Applied Probability and Statistics. Wiley, 2000. MR 1789474 | Zbl 0963.62061
[27] Montúfar, G. F., Ay, N.: Refinements of universal approximation results for deep belief networks and restricted Boltzmann machines. Neural Comput. 23 (2011), 5, 1306-1319. DOI 10.1162/NECO_a_00113 | MR 2814846
[28] Montúfar, G. F., Rauh, J., Ay, N.: Expressive power and approximation errors of restricted Boltzmann machines. In: Advances in Neural Information Processing Systems 24 (J. Shawe-Taylor, R. Zemel, P. Bartlett, F. Pereira, and K. Weinberger, eds.), MIT Press, 2011, pp. 415-423.
[29] Rauh, J.: Finding the Maximizers of the Information Divergence from an Exponential Family. Ph. D. Thesis, Universität Leipzig, 2011. MR 2817016
[30] Rauh, J., Kahle, T., Ay, N.: Support sets of exponential families and oriented matroids. Internat. J. Approximate Reasoning 52 (2011), 5, 613-626. DOI 10.1016/j.ijar.2011.01.013 | MR 2787021
[31] Settimi, R., Smith, J. Q.: On the geometry of Bayesian graphical models with hidden variables. In: Proc. Fourteenth conference on Uncertainty in artificial intelligence, UAI'98, Morgan Kaufmann Publishers 1998, pp. 472-479.
[32] Shemer., I.: Neighborly polytopes. Israel J. Math. 43 (1982), 291-311. DOI 10.1007/BF02761235 | MR 0693351 | Zbl 1223.52005
[33] Titterington, D., Smith, A. F. M., Makov, U. E.: Statistical Analysis of Finite Mixture Distributions. John Wiley and Sons, 1985. MR 0838090 | Zbl 0646.62013
[34] Varshamov, R.: Estimate of the number of signals in error correcting codes. Dokl. Akad. Nauk SSSR 117 (1957), 739-741.

Partner of