Previous |  Up |  Next

Article

Title: Generalized minimizers of convex integral functionals, Bregman distance, Pythagorean identities (English)
Author: Csiszár, Imre
Author: Matúš, František
Language: English
Journal: Kybernetika
ISSN: 0023-5954
Volume: 48
Issue: 4
Year: 2012
Pages: 637-689
Summary lang: English
.
Category: math
.
Summary: Integral functionals based on convex normal integrands are minimized subject to finitely many moment constraints. The integrands are finite on the positive and infinite on the negative numbers, strictly convex but not necessarily differentiable. The minimization is viewed as a primal problem and studied together with a dual one in the framework of convex duality. The effective domain of the value function is described by a conic core, a modification of the earlier concept of convex core. Minimizers and generalized minimizers are explicitly constructed from solutions of modified dual problems, not assuming the primal constraint qualification. A generalized Pythagorean identity is presented using Bregman distance and a correction term for lack of essential smoothness in integrands. Results are applied to minimization of Bregman distances. Existence of a generalized dual solution is established whenever the dual value is finite, assuming the dual constraint qualification. Examples of ‘irregular’ situations are included, pointing to the limitations of generality of certain key results. (English)
Keyword: maximum entropy
Keyword: moment constraint
Keyword: generalized primal/dual solutions
Keyword: normal integrand
Keyword: minimizing sequence
Keyword: convex duality
Keyword: Bregman projection
Keyword: conic core
Keyword: generalized exponential family
Keyword: inference principles
MSC: 49J53
MSC: 49K30
MSC: 62B10
MSC: 65K10
MSC: 90C46
MSC: 94A17
idMR: MR3013394
.
Date available: 2012-11-10T22:00:42Z
Last updated: 2013-09-24
Stable URL: http://hdl.handle.net/10338.dmlcz/143055
.
Reference: [1] S. M. Ali, S. D. Silvey: A general class of coefficients of divergence of one distribution from another..J. Roy. Statist. Soc. Ser. B 28 (1966) 131-142. Zbl 0203.19902, MR 0196777
Reference: [2] S. Amari, H. Nagaoka: Methods of Information Geometry..Transl. Math. Monographs 191, Oxford Univ. Press, 2000. Zbl 1146.62001, MR 1800071
Reference: [3] S. Amari, A. Cichocki: Information geometry of divergence functions..Bull. Polish Acad. Sci. 58 (2010) 183-194.
Reference: [4] O. Barndorff-Nielsen: Information and Exponential Families in Statistical Theory..Wiley, 1978. Zbl 0387.62011, MR 0489333
Reference: [5] H. H. Bauschke, J. M. Borwein: Legendre functions and the method of random Bregman projections..J. Convex Anal. 4 (1997), 27-67. Zbl 0894.49019, MR 1459881
Reference: [6] H. H. Bauschke, J. M. Borwein, P. L. Combettes: Essential smoothness, essential strict convexity, and Legendre functions in Banach spaces..Comm. Contemp. Math. 3 (2001), 615-647. Zbl 1032.49025, MR 1869107, 10.1142/S0219199701000524
Reference: [7] A. Ben-Tal, A. Charnes: A dual optimization framework for some problems of information theory and statistics..Problems Control Inform. Theory 8 (1979), 387-401. Zbl 0437.90078, MR 0553884
Reference: [8] J. M. Borwein, A. S. Lewis: Duality relationships for entropy-like minimization problems..SIAM J. Control Optim. 29 (1991), 325-338. Zbl 0797.49030, MR 1092730, 10.1137/0329017
Reference: [9] J. M. Borwein, A. S. Lewis: Convergence of best entropy estimates..SIAM J. Optim. 1 (1991), 191-205. Zbl 0756.41037, MR 1098426, 10.1137/0801014
Reference: [10] J. M. Borwein, A. S. Lewis: Partially-finite programming in $L_1$ and the existence of maximum entropy estimates..SIAM J. Optim. 3 (1993), 248-267. MR 1215444, 10.1137/0803012
Reference: [11] J. M. Borwein, A. S. Lewis, D. Noll: Maximum entropy spectral analysis using derivative information. Part I: Fisher information and convex duality..Math. Oper. Res. 21 (1996), 442-468. MR 1397223, 10.1287/moor.21.2.442
Reference: [12] L. M. Bregman: The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming..USSR Comput. Math. and Math. Phys. 7 (1967), 200-217. Zbl 0186.23807, MR 0215617, 10.1016/0041-5553(67)90040-7
Reference: [13] M. Broniatowski, A. Keziou: Minimization of $\phi$-divergences on sets of signed measures..Studia Sci. Math. Hungar. 43 (2006), 403-442. Zbl 1121.28004, MR 2273419
Reference: [14] J. P. Burg: Maximum entropy spectral analysis..Paper presented at 37th Meeting of Soc. Explor. Geophysicists, Oklahoma City 1967.
Reference: [15] J. P. Burg: Maximum entropy spectral analysis..Ph.D. Thesis, Dept. Geophysics, Stanford Univ., Stanford 1975.
Reference: [16] Y. Censor, S. A. Zenios: Parallel Optimization..Oxford University Press, New York 1997. Zbl 0945.90064, MR 1486040
Reference: [17] N. N. Chentsov: Statistical Decision Rules and Optimal Inference..Transl. Math. Monographs 53, American Math. Soc., Providence 1982. Russian original: Nauka, Moscow 1972. Zbl 0484.62008, MR 0645898
Reference: [18] I. Csiszár: Eine informationstheoretische Ungleichung und ihre Anwendung auf den Beweis der Ergodizität von Markoffschen Ketten..Publ. Math. Inst. Hungar. Acad. Sci. 8 (1963), 85-108. Zbl 0124.08703, MR 0164374
Reference: [19] I. Csiszár: Information-type measures of difference of probability distributions and indirect observations..Studia Sci. Math. Hungar. 2 (1967), 299-318. Zbl 0157.25802, MR 0219345
Reference: [20] I. Csiszár: $I$-divergence geometry of probability distributions and minimization problems..Ann. Probab. 3 (1975), 146-158. MR 0365798, 10.1214/aop/1176996454
Reference: [21] I. Csiszár: Sanov property, generalized $I$-projection and a conditional limit theorem..Ann. Probab. 12 (1984), 768-793. Zbl 0544.60011, MR 0744233, 10.1214/aop/1176993227
Reference: [22] I. Csiszár: Why least squares and maximum entropy? An axiomatic approach to inference for linear inverse problems..Ann. Statist. 19 (1991), 2031-2066. Zbl 0753.62003, MR 1135163, 10.1214/aos/1176348385
Reference: [23] I. Csiszár: Generalized projections for non-negative functions..Acta Math. Hungar. 68 (1995), 1-2, 161-185. Zbl 0837.62006, MR 1320794, 10.1007/BF01874442
Reference: [24] I. Csiszár, F. Gamboa, E. Gassiat: MEM pixel correlated solutions for generalized moment and interpolation problems..IEEE Trans. Inform. Theory 45 (1999), 2253-2270. Zbl 0958.94002, MR 1725114, 10.1109/18.796367
Reference: [25] I. Csiszár, F. Matúš: Convex cores of measures on $\mathcal{R}^d$..Studia Sci. Math. Hungar. 38 (2001), 177-190. MR 1877777
Reference: [26] I. Csiszár, F. Matúš: Information projections revisited..IEEE Trans. Inform. Theory 49 (2003), 1474-1490. Zbl 1063.94016, MR 1984936, 10.1109/TIT.2003.810633
Reference: [27] I. Csiszár, F. Matúš: Generalized maximum likelihood estimates for infinite dimensional exponential families..In: Proc. Prague Stochastics'06, Prague 2006, pp. 288-297.
Reference: [28] I. Csiszár, F. Matúš: Generalized maximum likelihood estimates for exponential families..Probab. Theory Related Fields 141 (2008), 213-246. Zbl 1133.62039, MR 2372970, 10.1007/s00440-007-0084-z
Reference: [29] I. Csiszár, F. Matúš: On minimization of entropy functionals under moment constraints..In: Proc. ISIT 2008, Toronto, pp. 2101-2105.
Reference: [30] I. Csiszár, F. Matúš: On minimization of multivariate entropy functionals..In: Proc. ITW 2009, Volos, Greece, pp. 96-100.
Reference: [31] I. Csiszár, F. Matúš: Minimization of entropy functionals revisited..In: Proc. ISIT 2012, Cambridge, MA, pp. 150-154.
Reference: [32] D. Dacunha-Castelle, F. Gamboa: Maximum d'entropie et problème des moments..Ann. Inst. H. Poincaré Probab. Statist. 26 (1990), 567-596. Zbl 0788.62007, MR 1080586
Reference: [33] A. P. Dawid, P. D. Grünwald: Game theory, maximum entropy, minimum discrepancy, and robust Bayesian decision theory..Ann. Statist. 32 (2004), 1367-1433. Zbl 1048.62008, MR 2089128, 10.1214/009053604000000553
Reference: [34] S. Eguchi: Information geometry and statistical pattern recognition..Sugaku Expositions, Amer. Math. Soc. 19 (2006), 197-216. MR 2279777
Reference: [35] B. A. Frigyik, S. Srivastava, M. R. Gupta: Functional Bregman divergence and Bayesian estimation of distributions..IEEE Trans. Inform. Theory 54 (2008), 5130-5139. MR 2589887, 10.1109/TIT.2008.929943
Reference: [36] F. Gamboa, E. Gassiat: Bayesian methods and maximum entropy for ill-posed inverse problems..Ann. Statist. 25 (1997), 1, 328-350. Zbl 0871.62010, MR 1429928, 10.1214/aos/1034276632
Reference: [37] E. T. Jaynes: Information theory and statistical mechanics..Physical Review Ser. II 106 (1957), 620-630. Zbl 0084.43701, MR 0087305
Reference: [38] L. Jones, C. Byrne: General entropy criteria for inverse problems with application to data compression, pattern classification and cluster analysis..IEEE Trans. Inform. Theory 36 (1990), 23-30. MR 1043277, 10.1109/18.50370
Reference: [39] S. Kullback: Information Theory and Statistics..John Wiley and Sons, New York 1959. Zbl 0897.62003, MR 0103557
Reference: [40] S. Kullback, R. A. Leibler: On information and sufficiency..Ann. Math. Statist. 22 (1951), 79-86. Zbl 0042.38403, MR 0039968, 10.1214/aoms/1177729694
Reference: [41] C. Léonard: Minimizers of energy functionals..Acta Math. Hungar. 93 (2001), 281-325. Zbl 1052.49017, MR 1925356, 10.1023/A:1017919422086
Reference: [42] C. Léonard: Minimizers of energy functionals under not very integrable constraints..J. Convex Anal. 10 (2003), 63-68. MR 1999902
Reference: [43] C. Léonard: Minimization of entropy functionals..J. Math. Anal. Appl. 346 (2008), 183-204. Zbl 1152.49039, MR 2428283, 10.1016/j.jmaa.2008.04.048
Reference: [44] C. Léonard: Entropic projections and dominating points..ESAIM: Probability and Statistics 14 (2010), 343-381. Zbl 1220.60018, MR 2795471, 10.1051/ps/2009003
Reference: [45] F. Liese, I. Vajda: Convex Statistical Distances..Teubner Texte zur Mathematik 95, Teubner Verlag, Leipzig 1986. Zbl 0656.62004, MR 0926905
Reference: [46] N. Murata, T. Takenouchi, T. Kanamori, S. Eguchi: Information geometry of U-Boost and Bregman divergence..Neural Computation 16 (2004), 1437-1481. Zbl 1102.68489, 10.1162/089976604323057452
Reference: [47] R. T. Rockafellar: Integrals which are convex functionals..Pacific J. Math. 24 (1968), 525-539. Zbl 0324.90061, MR 0236689, 10.2140/pjm.1968.24.525
Reference: [48] R. T. Rockafellar: Convex integral functionals and duality..In: Contributions to Nonlinear Functional Analysis (E. H. Zarantonello, ed.), Academic Press, New York 1971, pp. 215-236. Zbl 0326.49008, MR 0390870
Reference: [49] R. T. Rockafellar: Convex Analysis..Princeton University Press, Princeton 1970. Zbl 1011.49013, MR 0274683
Reference: [50] R. T. Rockafellar, R. J.-B. Wets: Variational Analysis..Springer Verlag, Berlin - Heidel\-berg - New York 2004. Zbl 0888.49001, MR 1491362
Reference: [51] M. Teboulle, I. Vajda: Convergence of best $\phi$-entropy estimates..IEEE Trans. Inform. Theory 39 (1993), 297-301. Zbl 0765.94001, MR 1211512, 10.1109/18.179378
Reference: [52] F. Topsoe: Information-theoretical optimization techniques..Kybernetika 15 (1979), 8-27. MR 0529888
Reference: [53] I. Vajda: Theory of Statistical Inference and Information..Kluwer Academic Puplishers, Dordrecht 1989. Zbl 0711.62002
.

Files

Files Size Format View
Kybernetika_48-2012-4_4.pdf 762.7Kb application/pdf View/Open
Back to standard record
Partner of
EuDML logo