Title:
|
Factorized mutual information maximization (English) |
Author:
|
Merkh, Thomas |
Author:
|
Montúfar, Guido |
Language:
|
English |
Journal:
|
Kybernetika |
ISSN:
|
0023-5954 (print) |
ISSN:
|
1805-949X (online) |
Volume:
|
56 |
Issue:
|
5 |
Year:
|
2020 |
Pages:
|
948-978 |
Summary lang:
|
English |
Category:
|
math |
Summary:
|
We investigate the sets of joint probability distributions that maximize the average multi-information over a collection of margins. These functionals serve as proxies for maximizing the multi-information of a set of variables or the mutual information of two subsets of variables, at a lower computation and estimation complexity. We describe the maximizers and their relations to the maximizers of the multi-information and the mutual information. (English) |
Keyword:
|
multi-information |
Keyword:
|
mutual information |
Keyword:
|
divergence maximization |
Keyword:
|
marginal specification problem |
Keyword:
|
transportation polytope |
MSC:
|
62B10 |
MSC:
|
94A17 |
idMR:
|
MR4187782 |
DOI:
|
10.14736/kyb-2020-5-0948 |
Date available:
|
2020-12-16T16:04:23Z |
Last updated:
|
2021-02-23 |
Stable URL:
|
http://hdl.handle.net/10338.dmlcz/148493 |
Reference:
|
[1] Alemi, A., Fischer, I., Dillon, J., Murphy, K.: Deep variational information bottleneck..In: ICLR, 2017. |
Reference:
|
[2] Ay, N.: An information-geometric approach to a theory of pragmatic structuring..Ann. Probab. 30 (2002), 1, 416-436. Zbl 1010.62007, MR 1894113, 10.1214/aop/1020107773 |
Reference:
|
[3] Ay, N.: Locality of global stochastic interaction in directed acyclic networks..Neural Comput. 14 (2002), 12, 2959-2980. Zbl 1079.68582, 10.1162/089976602760805368 |
Reference:
|
[4] Ay, N., Bertschinger, N., Der, R., Güttler, F., Olbrich, E.: Predictive information and explorative behavior of autonomous robots..Europ. Phys. J. B 63 (2008), 3, 329-339. MR 2421556, 10.1140/epjb/e2008-00175-0 |
Reference:
|
[5] Ay, N., Knauf, A.: Maximizing multi-information..Kybernetika 42 (2006), 5, 517-538. Zbl 1249.82011, MR 2283503 |
Reference:
|
[6] Baldassarre, G., Mirolli, M.: Intrinsically motivated learning systems: an overview..In: Intrinsically motivated learning in natural and artificial systems, Springer 2013, pp. 1-14. 10.1007/978-3-642-32375-1_1 |
Reference:
|
[7] Baudot, P., Tapia, M., Bennequin, D., Goaillard, J.-M.: Topological information data analysis..Entropy 21 (2019), 9, 869. MR 4016406, 10.3390/e21090869 |
Reference:
|
[8] Bekkerman, R., Sahami, M., Learned-Miller, E.: Combinatorial Markov random fields..In: European Conference on Machine Learning, Springer 2006, pp. 30-41. MR 2336649, 10.1007/11871842_8 |
Reference:
|
[9] Belghazi, M. I., Baratin, A., Rajeshwar, S., Ozair, S., Bengio, Y., Courville, A., Hjelm, D.: Mutual information neural estimation..In: Proc. 35th International Conference on Machine Learning (J. Dy and A. Krause, eds.), Vol. 80 of Proceedings of Machine Learning Research, pp. 531-540, Stockholm 2018. PMLR. |
Reference:
|
[10] Bertschinger, N., Rauh, J., Olbrich, E., Jost, J., Ay, N.: Quantifying unique information..Entropy 16 (2014), 4, 2161-2183. MR 3195286, 10.3390/e16042161 |
Reference:
|
[11] Bialek, W., Nemenman, I., Tishby, N.: Predictability, complexity, and learning..Neural Comput. 13 (2001), 11, 2409-2463. 10.1162/089976601753195969 |
Reference:
|
[12] Burda, Y., Edwards, H., Pathak, D., Storkey, A., Darrell, T., Efros, A. A.: Large-scale study of curiosity-driven learning..In: ICLR, 2019. |
Reference:
|
[13] Buzzi, J., Zambotti, L.: Approximate maximizers of intricacy functionals..Probab. Theory Related Fields 153 (2012), 3-4, 421-440. MR 2948682, 10.1007/s00440-011-0350-y |
Reference:
|
[14] Chentanez, N., Barto, A. G., Singh, S. P.: Intrinsically motivated reinforcement learning..In: Adv. Neural Inform. Process. Systems 2005, pp. 1281-1288. 10.21236/ada440280 |
Reference:
|
[15] Crutchfield, J. P., Feldman, D. P.: Synchronizing to the environment: Information-theoretic constraints on agent learning..Adv. Complex Systems 4 (2001), 02n03, 251-264. MR 1873760, 10.1142/s021952590100019x |
Reference:
|
[16] Loera, J. de: Transportation polytopes.. |
Reference:
|
[17] Friedman, N., Mosenzon, O., Slonim, N., Tishby, N.: Multivariate information bottleneck..In: Proc. Seventeenth conference on Uncertainty in Artificial Intelligence, Morgan Kaufmann Publishers Inc., 2001, pp. 152-161. |
Reference:
|
[18] Gabrié, M., Manoel, A., Luneau, C., Barbier, J., Macris, N., Krzakala, F., Zdeborová, L.: Entropy and mutual information in models of deep neural networks..In: Advances in Neural Information Processing Systems 31 (S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, eds.), Curran Associates, Inc. 2018, pp. 1821-1831. MR 3841726 |
Reference:
|
[19] Gao, S., Steeg, G. Ver, Galstyan, A.: Efficient estimation of mutual information for strongly dependent variables..In: Artificial Intelligence and Statistics 2015, pp. 277-286. |
Reference:
|
[20] Hjelm, R. D., Fedorov, A., Lavoie-Marchildon, S., Grewal, K., Bachman, P., Trischler, A., Bengio, Y.: Learning deep representations by mutual information estimation and maximization..In: International Conference on Learning Representations, 2019. |
Reference:
|
[21] Hosten, S., Sullivant, S.: Gröbner bases and polyhedral geometry of reducible and cyclic models..J. Comb. Theory Ser. A 100 (2002), 2, 277-301. MR 1940337, 10.1006/jcta.2002.3301 |
Reference:
|
[22] Jakulin, A., Bratko, I.: Quantifying and visualizing attribute interactions: An approach based on entropy..2003. |
Reference:
|
[23] Klyubin, A. S., Polani, D., Nehaniv, C. L.: Empowerment: A universal agent-centric measure of control..In: 2005 IEEE Congress on Evolutionary Computation, Vol. 1, IEEE 2005, pp. 128-135. |
Reference:
|
[24] Kraskov, A., Stögbauer, H., Grassberger, P.: Estimating mutual information..Phys. Rev. E 69 (2004), 6, 066138. MR 2096503, 10.1103/physreve.69.066138 |
Reference:
|
[25] Matúš, F.: Maximization of information divergences from binary i.i.d. sequences..In: Proc. IPMU 2004 2 (2004), pp. 1303-1306. |
Reference:
|
[26] Matúš, F.: Divergence from factorizable distributions and matroid representations by partitions..IEEE Trans. Inform. Theory 55 (2009), 12, 5375-5381. MR 2597169, 10.1109/tit.2009.2032806 |
Reference:
|
[27] Matúš, F., Ay, N.: On maximization of the information divergence from an exponential family..In: Proc. 6th Workshop on Uncertainty Processing: Oeconomica 2003, Hejnice 2003, pp. 199-204. |
Reference:
|
[28] Matúš, F., Rauh, J.: Maximization of the information divergence from an exponential family and criticality..In: 2011 IEEE International Symposium on Information Theory Proceedings 2011, pp. 903-907. MR 2817016, 10.1109/isit.2011.6034269 |
Reference:
|
[29] McGill, W.: Multivariate information transmission..Trans. IRE Profess. Group Inform. Theory 4 (1954), 4, 93-111. MR 0088155, 10.1109/tit.1954.1057469 |
Reference:
|
[30] Mohamed, S., Rezende, D. J.: Variational information maximisation for intrinsically motivated reinforcement learning..In: Advances in Neural Information Processing Systems 2015, pp. 2125-2133. |
Reference:
|
[31] Montúfar, G.: Universal approximation depth and errors of narrow belief networks with discrete units..Neural Comput. 26 (2014), 7, 1386-1407. MR 3222078, 10.1162/neco_a_00601 |
Reference:
|
[32] Montúfar, G., Ghazi-Zahedi, K., Ay, N.: A theory of cheap control in embodied systems..PLOS Comput. Biology 11 (2015), 9, 1-22. 10.1371/journal.pcbi.1004427 |
Reference:
|
[33] Montúfar, G., Ghazi-Zahedi, K., Ay, N.: Information theoretically aided reinforcement learning for embodied agents..arXiv preprint arXiv:1605.09735, 2016. |
Reference:
|
[34] Montúfar, G., Rauh, J., Ay, N.: Expressive power and approximation errors of restricted Boltzmann machines..In: Advances in Neural Information Processing Systems 2011, pp. 415-423. |
Reference:
|
[35] Montúfar, G., Rauh, J., Ay, N.: Maximal information divergence from statistical models defined by neural networks..In: Geometric Science of Information GSI 2013 (F. Nielsen and F. Barbaresco, eds.), Lecture Notes in Computer Science 8085, Springer 2013, pp. 759-766. MR 3126126, 10.1007/978-3-642-40020-9_85 |
Reference:
|
[36] Rauh, J.: Finding the maximizers of the information divergence from an exponential family..IEEE Trans. Inform. Theory 57 (2011), 6, 3236-3247. MR 2817016, 10.1109/tit.2011.2136230 |
Reference:
|
[37] Rauh, J.: Finding the Maximizers of the Information Divergence from an Exponential Family..PhD. Thesis, Universität Leipzig 2011. MR 2817016 |
Reference:
|
[38] Ince, R. A. A., Panzeri, S., Schultz, S. R.: Summary of Information Theoretic Quantities..Springer, New York 2013, pp. 1-6. |
Reference:
|
[39] Roulston, M. S.: Estimating the errors on measured entropy and mutual information..Physica D: Nonlinear Phenomena 125 (1999), 3-4, 285-294. 10.1016/s0167-2789(98)00269-3 |
Reference:
|
[40] Schossau, J., Adami, C., Hintze, A.: Information-theoretic neuro-correlates boost evolution of cognitive systems..Entropy 18 (2015), 1, 6. 10.3390/e18010006 |
Reference:
|
[41] Slonim, N., Atwal, G. S., Tkacik, G., Bialek, W.: Estimating mutual information and multi-information in large networks..arXiv preprint cs/0502017, 2005. |
Reference:
|
[42] Slonim, N., Friedman, N., Tishby, N.: Multivariate information bottleneck..Neural Comput. 18 (2006), 8, 1739-1789. MR 2230853, 10.1162/neco.2006.18.8.1739 |
Reference:
|
[43] Still, S., Precup, D.: An information-theoretic approach to curiosity-driven reinforcement learning..Theory Biosci. 131 (2012), 3, 139-148. 10.1007/s12064-011-0142-z |
Reference:
|
[44] The Sage Developers: SageMath, the Sage Mathematics Software System (Version 8.7), 2019..https://www.sagemath.org. |
Reference:
|
[45] Tishby, N., Pereira, F. C., Bialek, W.: The information bottleneck method..In: Proc. 37th Annual Allerton Conference on Communication, Control and Computing 1999, pp. 368-377. |
Reference:
|
[46] Vergara, J. R., Estévez, P. A.: A review of feature selection methods based on mutual information..Neural Comput. Appl. 24 (2014), 1, 175-186. 10.1007/s00521-013-1368-0 |
Reference:
|
[47] Watanabe, S.: Information theoretical analysis of multivariate correlation..IBM J. Res. Develop. 4 (1960), 1, 66-82. MR 0109755, 10.1147/rd.41.0066 |
Reference:
|
[48] Witsenhausen, H. S., Wyner, A. D.: A conditional entropy bound for a pair of discrete random variables..IEEE Trans. Inform. Theory 21 (1975), 5, 493-501. MR 0381861, 10.1109/tit.1975.1055437 |
Reference:
|
[49] Yemelichev, V., Kovalev, M., Kravtsov, M.: Polytopes, Graphs and Optimisation..Cambridge University Press, 1984. MR 0744197 |
Reference:
|
[50] Zahedi, K., Ay, N., Der, R.: Higher coordination with less control: A result of information maximization in the sensorimotor loop..Adaptive Behavior 18 (2010), 3-4, 338-355. 10.1177/1059712310375314 |
Reference:
|
[51] Zahedi, K., Martius, G., Ay, N.: Linear combination of one-step predictive information with an external reward in an episodic policy gradient setting: a critical analysis..Front. Psychol. 4 (2013), 801. 10.3389/fpsyg.2013.00801 |