Article

Full entry | PDF   (0.6 MB)
Keywords:
total correlation; dual total correlation; transportation inequalities; mixtures of products
Summary:
Total correlation (TC') and dual total correlation (DTC') are two classical ways to quantify the correlation among an $n$-tuple of random variables. They both reduce to mutual information when $n=2$. The first part of this paper sets up the theory of TC and DTC for general random variables, not necessarily finite-valued. This generality has not been exposed in the literature before. The second part considers the structural implications when a joint distribution $\mu$ has small TC or DTC. If $\mathrm{TC}(\mu) = o(n)$, then $\mu$ is close to a product measure according to a suitable transportation metric: this follows directly from Marton's classical transportation-entropy inequality. If $\mathrm{DTC}(\mu) = o(n)$, then the structural consequence is more complicated: $\mu$ is a mixture of a controlled number of terms, most of them close to product measures in the transportation metric. This is the main new result of the paper.
References:
[1] Abdallah, S. A., Plumbley, M. D.: Predictive Information, Multi-Information and Binding Information. Technical Report C4DM-TR-10-10, Queen Mary University of London, 2010.
[2] Ahlswede, R.: An elementary proof of the strong converse theorem for the multiple-access channel. J. Combin. Inform. System Sci. 7 (1982), 3, 216-230. MR 0724363
[3] Ahlswede, R.: The rate-distortion region for multiple descriptions without excess rate. IEEE Trans. Inform. Theory 31 (1985), 6, 721-726. DOI 10.1109/tit.1985.1057102 | MR 0823593
[4] Austin, T.: The structure of low-complexity Gibbs measures on product spaces. Ann. Probab. 47 (2019), 6, 4002-4023. DOI 10.1214/19-aop1352 | MR 4038047
[5] Austin, T.: Measure concentration and the weak Pinsker property. Publ. Math. Inst. Hautes Etudes Sci. 128 (2018), 1-119. DOI 10.1007/s10240-018-0098-3 | MR 3905465
[6] Ay, N., Olbrich, E., Bertschinger, N., Jost, J.: A unifying framework for complexity measures of finite systems. Working Paper 06-08-028, Santa Fe Institute, 2006.
[7] Balister, P., Bollobás, B.: Projections, entropy and sumsets. Combinatorica 32 (2012), 2, 125-141. DOI 10.1007/s00493-012-2453-1 | MR 2927635
[8] Chatterjee, S., Dembo, A.: Nonlinear large deviations. Adv. Math. 299 (2016), 396-450. DOI 10.1016/j.aim.2016.05.017 | MR 3519474
[9] Chung, F. R. K., Graham, R. L., Frankl, P., Shearer, J. B.: Some intersection theorems for ordered sets and graphs. J. Combin. Theory Ser. A 43 (1986), 1, 23-37. DOI 10.1016/0097-3165(86)90019-1 | MR 0859293
[10] Coja-Oghlan, A., Krzakala, F., Perkins, W., Zdeborová, L.: Information-theoretic thresholds from the cavity method. Adv. Math. 333 (2018), 694-795. DOI 10.1016/j.aim.2018.05.029 | MR 3818090
[11] Coja-Oghlan, A., Perkins, W.: Bethe states of random factor graphs. Comm. Math. Phys. 366 (2019), 1, 173-201. DOI 10.1007/s00220-019-03387-7 | MR 3919446
[12] Cover, T. M., Thomas, J. A.: Elements of Information Theory. Second edition. Wiley-Interscience, John Wiley and Sons, Hoboken, NJ 2006. MR 2239987
[13] Crooks, G.: On Measures of Entropy and Information. Technical note.
[14] Csiszár, I.: Sanov property, generalized $I$-projection and a conditional limit theorem. Ann. Probab. 12 (1984), 3, 768-793. DOI 10.1214/aop/1176993227 | MR 0744233
[15] Csiszár, I., Narayan, P.: Secrecy capacities for multiple terminals. IEEE Trans. Inform. Theory 50 (2004), 12, 3047-3061. DOI 10.1109/tit.2004.838380 | MR 2103483
[16] Dembo, A., Zeitouni, O.: Large deviations techniques and applications. Second edition. Springer-Verlag, Stochastic Modelling and Applied Probability 38, Berlin 2010. DOI 10.1007/978-1-4612-5320-4 | MR 2571413
[17] Dobrušin, R. L.: A general formulation of Shannon's fundamental theorem in the theory of information. Dokl. Akad. Nauk SSSR 126 (1959), 474-477. MR 0107573
[18] Dougherty, R., Freiling, C., Zeger, K.: Networks, matroids, and non-Shannon information inequalities. IEEE Trans. Inform. Theory 53 (2007), 6, 1949-1969. DOI 10.1109/tit.2007.896862 | MR 2321860
[19] Dudley, R. M.: Real Analysis and Probability. Cambridge University Press, Cambridge Studies in Advanced Mathematics 74, Cambridge 2002. DOI 10.1017/cbo9780511755347 | MR 1932358 | Zbl 1023.60001
[20] Dueck, G.: The strong converse of the coding theorem for the multiple-access channel. J. Combin. Inform. System Sci. 6 (1981), 3, 187-196. MR 0652388
[21] Eldan, R.: Gaussian-width gradient complexity, reverse log-sobolev inequalities and nonlinear large deviations. Geometr. Funct. Anal. 28 (2018), 6, 1548-1596. DOI 10.1007/s00039-018-0461-z | MR 3881829
[22] Eldan, R., Gross, R.: Exponential random graphs behave like mixtures of stochastic block models. Preprint, available online at DOI  | MR 3861824
[23] Eldan, R., Gross, R.: Decomposition of mean-field Gibbs distributions into product measures. Electron. J. Probab. 23 (2018), 35, 24. DOI 10.1214/18-EJP159 | MR 3798245
[24] Ellis, D., Friedgut, E., Kindler, G., Yehudayoff, A.: Geometric stability via information theory. Discrete Anal. 10 (2016), 28 pp. DOI 10.19086/da.784 | MR 3555193
[25] Fritz, T., Chaves, R.: Entropic inequalities and marginal problems. IEEE Trans. Inform. Theory 59 (2013), 2, 803-817. DOI 10.1109/tit.2012.2222863 | MR 3015697
[26] Fujishige, S.: Polymatroidal dependence structure of a set of random variables. Inform. Control 39 (1978), 1, 55-72. DOI 10.1016/s0019-9958(78)91063-x | MR 0514262
[27] Gelfand, I., Kolmogorov, A., Yaglom, I.: On the general definition of the quantity of information. Doklady Akad. Nauk SSSR 111 (1956), 4, 745-748. MR 0084440
[28] Han, T. S.: Linear dependence structure of the entropy space. Inform. Control 29 (1975), 4, 337-368. DOI 10.1016/s0019-9958(75)80004-0 | MR 0453264
[29] Han, T. S.: Nonnegative entropy measures of multivariate symmetric correlations. Inform. Control 36 (1978), 2, 133-156. DOI 10.1016/s0019-9958(78)90275-9 | MR 0464499
[30] Kolmogorov, A. N.: On the shannon theory of information transmission in the case of continuous signals. IEEE Trans. Inform. Theory IT-2 (1956), 102-108. DOI 10.1016/s0019-9958(78)90275-9 | MR 0097987
[31] Ledoux, M.: On {T}alagrand's deviation inequalities for product measures. ESAIM Probab. Statist. 1 (1995/97), 63-87. DOI 10.1051/ps:1997103 | MR 1399224
[32] Madiman, M., Tetali, P.: Information inequalities for joint distributions, with interpretations and applications. IEEE Trans. Inform. Theory 56 (2010), 6, 2699-2713. DOI 10.1109/tit.2010.2046253 | MR 2683430
[33] Makarychev, K., Makarychev, Y., Romashchenko, A., Vereshchagin, N.: A new class of non-Shannon-type inequalities for entropies. Commun. Inf. Syst. 2 (2002), 2, 147-165. DOI 10.4310/cis.2002.v2.n2.a3 | MR 1958013
[34] Marton, K.: A simple proof of the blowing-up lemma. IEEE Trans. Inform. Theory 32 (1986), 3, 445-446. DOI 10.1109/tit.1986.1057176 | MR 0838213
[35] Marton, K.: Bounding {$\overline d$}-distance by informational divergence: a method to prove measure concentration. Ann. Probab. 24 (1996), 2, 857-866. DOI 10.1214/aop/1039639365 | MR 1404531
[36] Matúš, F.: Two constructions on limits of entropy functions. IEEE Trans. Inform. Theory 53 (2007), 1, 320-330. DOI 10.1109/tit.2006.887090 | MR 2292891
[37] McDiarmid, C.: On the method of bounded differences. In: Surveys in Combinatorics, Norwich 1989, London Math. Soc. Lecture Note Ser. 141, Cambridge Univ. Press, Cambridge 1989, pp. 148-188. DOI 10.1017/cbo9781107359949.008 | MR 1036755
[38] McGill, W. J.: Multivariate information transmission. Trans. I.R.E. PGIT-4 (1954), 93-111. DOI 10.1109/tit.1954.1057469 | MR 0088155
[39] Pearl, J., Paz, A.: Graphoids: a graph-based logic for reasoning about relevance relations. In: Advances in Artificial Intelligence - II (B. Du Boulay, D. Hogg, and L. Steels, eds.), North Holland, Amsterdam 1987, pp. 357-363.
[40] Perez, A.: Information theory with an abstract alphabet. Generalized forms of McMillan's limit theorem for the case of discrete and continuous times. Theory Probab. Appl. 4 (1959), 99-102. DOI 10.1137/1104007 | MR 0122613
[41] Perez, A.: $\epsilon$-admissible simplifications of the dependence structure of a set of random variables. Kybernetika 13 (1977), 6, 439-449. MR 0472224
[42] Pinsker, M. S.: Information and information Stability of Random Variables and Processes. Holden-Day, Inc., San Francisco 1964. MR 0213190
[43] Radhakrishnan, J.: Entropy and counting. In: Computational Mathematics, Modelling and Applications (IIT Kharagpur, Golden Jubilee Volume) (J. Mishra, ed.), Narosa Publishers, 2001, pp. 146-168.
[44] Schneidman, E., Still, S., Berry, M. J., Bialek, W.: Network information and connected correlations. Phys. Rev. Lett. 91 (2003), 238701. DOI 10.1103/physrevlett.91.238701
[45] Studený, M., Vejnarová, J.: The multiinformation function as a tool for measuring stochastic dependence. In: Proc. NATO Advanced Study Institute on Learning in Graphical Models, Kluwer Academic Publishers, Norwell 1998, pp. 261-297. DOI 10.1007/978-94-011-5014-9_10
[46] Timme, N., Alford, W., Flecker, B., Beggs, J. M.: Synergy, redundancy, and multivariate information measures: An experimentalist's perspective. J. Comput. Neurosci. 36 (2014), 2, 119-140. DOI 10.1007/s10827-013-0458-4 | MR 3176934
[47] Watanabe, S.: Information theoretical analysis of multivariate correlation. IBM J. Res. Develop. 4 (1960), 66-82. DOI 10.1147/rd.41.0066 | MR 0109755
[48] Yan, J.: Nonlinear large deviations: beyond the hypercube. Preprint, available online at DOI  | MR 4108123
[49] Zhang, Z., Yeung, R. W.: On characterization of entropy function via information inequalities. IEEE Trans. Inform. Theory 44 (1998), 4, 1440-1452. DOI 10.1109/18.681320 | MR 1665794

Partner of