Title:
|
Generalized minimizers of convex integral functionals, Bregman distance, Pythagorean identities (English) |
Author:
|
Csiszár, Imre |
Author:
|
Matúš, František |
Language:
|
English |
Journal:
|
Kybernetika |
ISSN:
|
0023-5954 |
Volume:
|
48 |
Issue:
|
4 |
Year:
|
2012 |
Pages:
|
637-689 |
Summary lang:
|
English |
. |
Category:
|
math |
. |
Summary:
|
Integral functionals based on convex normal integrands are minimized subject to finitely many moment constraints. The integrands are finite on the positive and infinite on the negative numbers, strictly convex but not necessarily differentiable. The minimization is viewed as a primal problem and studied together with a dual one in the framework of convex duality. The effective domain of the value function is described by a conic core, a modification of the earlier concept of convex core. Minimizers and generalized minimizers are explicitly constructed from solutions of modified dual problems, not assuming the primal constraint qualification. A generalized Pythagorean identity is presented using Bregman distance and a correction term for lack of essential smoothness in integrands. Results are applied to minimization of Bregman distances. Existence of a generalized dual solution is established whenever the dual value is finite, assuming the dual constraint qualification. Examples of ‘irregular’ situations are included, pointing to the limitations of generality of certain key results. (English) |
Keyword:
|
maximum entropy |
Keyword:
|
moment constraint |
Keyword:
|
generalized primal/dual solutions |
Keyword:
|
normal integrand |
Keyword:
|
minimizing sequence |
Keyword:
|
convex duality |
Keyword:
|
Bregman projection |
Keyword:
|
conic core |
Keyword:
|
generalized exponential family |
Keyword:
|
inference principles |
MSC:
|
49J53 |
MSC:
|
49K30 |
MSC:
|
62B10 |
MSC:
|
65K10 |
MSC:
|
90C46 |
MSC:
|
94A17 |
idMR:
|
MR3013394 |
. |
Date available:
|
2012-11-10T22:00:42Z |
Last updated:
|
2013-09-24 |
Stable URL:
|
http://hdl.handle.net/10338.dmlcz/143055 |
. |
Reference:
|
[1] S. M. Ali, S. D. Silvey: A general class of coefficients of divergence of one distribution from another..J. Roy. Statist. Soc. Ser. B 28 (1966) 131-142. Zbl 0203.19902, MR 0196777 |
Reference:
|
[2] S. Amari, H. Nagaoka: Methods of Information Geometry..Transl. Math. Monographs 191, Oxford Univ. Press, 2000. Zbl 1146.62001, MR 1800071 |
Reference:
|
[3] S. Amari, A. Cichocki: Information geometry of divergence functions..Bull. Polish Acad. Sci. 58 (2010) 183-194. |
Reference:
|
[4] O. Barndorff-Nielsen: Information and Exponential Families in Statistical Theory..Wiley, 1978. Zbl 0387.62011, MR 0489333 |
Reference:
|
[5] H. H. Bauschke, J. M. Borwein: Legendre functions and the method of random Bregman projections..J. Convex Anal. 4 (1997), 27-67. Zbl 0894.49019, MR 1459881 |
Reference:
|
[6] H. H. Bauschke, J. M. Borwein, P. L. Combettes: Essential smoothness, essential strict convexity, and Legendre functions in Banach spaces..Comm. Contemp. Math. 3 (2001), 615-647. Zbl 1032.49025, MR 1869107, 10.1142/S0219199701000524 |
Reference:
|
[7] A. Ben-Tal, A. Charnes: A dual optimization framework for some problems of information theory and statistics..Problems Control Inform. Theory 8 (1979), 387-401. Zbl 0437.90078, MR 0553884 |
Reference:
|
[8] J. M. Borwein, A. S. Lewis: Duality relationships for entropy-like minimization problems..SIAM J. Control Optim. 29 (1991), 325-338. Zbl 0797.49030, MR 1092730, 10.1137/0329017 |
Reference:
|
[9] J. M. Borwein, A. S. Lewis: Convergence of best entropy estimates..SIAM J. Optim. 1 (1991), 191-205. Zbl 0756.41037, MR 1098426, 10.1137/0801014 |
Reference:
|
[10] J. M. Borwein, A. S. Lewis: Partially-finite programming in $L_1$ and the existence of maximum entropy estimates..SIAM J. Optim. 3 (1993), 248-267. MR 1215444, 10.1137/0803012 |
Reference:
|
[11] J. M. Borwein, A. S. Lewis, D. Noll: Maximum entropy spectral analysis using derivative information. Part I: Fisher information and convex duality..Math. Oper. Res. 21 (1996), 442-468. MR 1397223, 10.1287/moor.21.2.442 |
Reference:
|
[12] L. M. Bregman: The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming..USSR Comput. Math. and Math. Phys. 7 (1967), 200-217. Zbl 0186.23807, MR 0215617, 10.1016/0041-5553(67)90040-7 |
Reference:
|
[13] M. Broniatowski, A. Keziou: Minimization of $\phi$-divergences on sets of signed measures..Studia Sci. Math. Hungar. 43 (2006), 403-442. Zbl 1121.28004, MR 2273419 |
Reference:
|
[14] J. P. Burg: Maximum entropy spectral analysis..Paper presented at 37th Meeting of Soc. Explor. Geophysicists, Oklahoma City 1967. |
Reference:
|
[15] J. P. Burg: Maximum entropy spectral analysis..Ph.D. Thesis, Dept. Geophysics, Stanford Univ., Stanford 1975. |
Reference:
|
[16] Y. Censor, S. A. Zenios: Parallel Optimization..Oxford University Press, New York 1997. Zbl 0945.90064, MR 1486040 |
Reference:
|
[17] N. N. Chentsov: Statistical Decision Rules and Optimal Inference..Transl. Math. Monographs 53, American Math. Soc., Providence 1982. Russian original: Nauka, Moscow 1972. Zbl 0484.62008, MR 0645898 |
Reference:
|
[18] I. Csiszár: Eine informationstheoretische Ungleichung und ihre Anwendung auf den Beweis der Ergodizität von Markoffschen Ketten..Publ. Math. Inst. Hungar. Acad. Sci. 8 (1963), 85-108. Zbl 0124.08703, MR 0164374 |
Reference:
|
[19] I. Csiszár: Information-type measures of difference of probability distributions and indirect observations..Studia Sci. Math. Hungar. 2 (1967), 299-318. Zbl 0157.25802, MR 0219345 |
Reference:
|
[20] I. Csiszár: $I$-divergence geometry of probability distributions and minimization problems..Ann. Probab. 3 (1975), 146-158. MR 0365798, 10.1214/aop/1176996454 |
Reference:
|
[21] I. Csiszár: Sanov property, generalized $I$-projection and a conditional limit theorem..Ann. Probab. 12 (1984), 768-793. Zbl 0544.60011, MR 0744233, 10.1214/aop/1176993227 |
Reference:
|
[22] I. Csiszár: Why least squares and maximum entropy? An axiomatic approach to inference for linear inverse problems..Ann. Statist. 19 (1991), 2031-2066. Zbl 0753.62003, MR 1135163, 10.1214/aos/1176348385 |
Reference:
|
[23] I. Csiszár: Generalized projections for non-negative functions..Acta Math. Hungar. 68 (1995), 1-2, 161-185. Zbl 0837.62006, MR 1320794, 10.1007/BF01874442 |
Reference:
|
[24] I. Csiszár, F. Gamboa, E. Gassiat: MEM pixel correlated solutions for generalized moment and interpolation problems..IEEE Trans. Inform. Theory 45 (1999), 2253-2270. Zbl 0958.94002, MR 1725114, 10.1109/18.796367 |
Reference:
|
[25] I. Csiszár, F. Matúš: Convex cores of measures on $\mathcal{R}^d$..Studia Sci. Math. Hungar. 38 (2001), 177-190. MR 1877777 |
Reference:
|
[26] I. Csiszár, F. Matúš: Information projections revisited..IEEE Trans. Inform. Theory 49 (2003), 1474-1490. Zbl 1063.94016, MR 1984936, 10.1109/TIT.2003.810633 |
Reference:
|
[27] I. Csiszár, F. Matúš: Generalized maximum likelihood estimates for infinite dimensional exponential families..In: Proc. Prague Stochastics'06, Prague 2006, pp. 288-297. |
Reference:
|
[28] I. Csiszár, F. Matúš: Generalized maximum likelihood estimates for exponential families..Probab. Theory Related Fields 141 (2008), 213-246. Zbl 1133.62039, MR 2372970, 10.1007/s00440-007-0084-z |
Reference:
|
[29] I. Csiszár, F. Matúš: On minimization of entropy functionals under moment constraints..In: Proc. ISIT 2008, Toronto, pp. 2101-2105. |
Reference:
|
[30] I. Csiszár, F. Matúš: On minimization of multivariate entropy functionals..In: Proc. ITW 2009, Volos, Greece, pp. 96-100. |
Reference:
|
[31] I. Csiszár, F. Matúš: Minimization of entropy functionals revisited..In: Proc. ISIT 2012, Cambridge, MA, pp. 150-154. |
Reference:
|
[32] D. Dacunha-Castelle, F. Gamboa: Maximum d'entropie et problème des moments..Ann. Inst. H. Poincaré Probab. Statist. 26 (1990), 567-596. Zbl 0788.62007, MR 1080586 |
Reference:
|
[33] A. P. Dawid, P. D. Grünwald: Game theory, maximum entropy, minimum discrepancy, and robust Bayesian decision theory..Ann. Statist. 32 (2004), 1367-1433. Zbl 1048.62008, MR 2089128, 10.1214/009053604000000553 |
Reference:
|
[34] S. Eguchi: Information geometry and statistical pattern recognition..Sugaku Expositions, Amer. Math. Soc. 19 (2006), 197-216. MR 2279777 |
Reference:
|
[35] B. A. Frigyik, S. Srivastava, M. R. Gupta: Functional Bregman divergence and Bayesian estimation of distributions..IEEE Trans. Inform. Theory 54 (2008), 5130-5139. MR 2589887, 10.1109/TIT.2008.929943 |
Reference:
|
[36] F. Gamboa, E. Gassiat: Bayesian methods and maximum entropy for ill-posed inverse problems..Ann. Statist. 25 (1997), 1, 328-350. Zbl 0871.62010, MR 1429928, 10.1214/aos/1034276632 |
Reference:
|
[37] E. T. Jaynes: Information theory and statistical mechanics..Physical Review Ser. II 106 (1957), 620-630. Zbl 0084.43701, MR 0087305 |
Reference:
|
[38] L. Jones, C. Byrne: General entropy criteria for inverse problems with application to data compression, pattern classification and cluster analysis..IEEE Trans. Inform. Theory 36 (1990), 23-30. MR 1043277, 10.1109/18.50370 |
Reference:
|
[39] S. Kullback: Information Theory and Statistics..John Wiley and Sons, New York 1959. Zbl 0897.62003, MR 0103557 |
Reference:
|
[40] S. Kullback, R. A. Leibler: On information and sufficiency..Ann. Math. Statist. 22 (1951), 79-86. Zbl 0042.38403, MR 0039968, 10.1214/aoms/1177729694 |
Reference:
|
[41] C. Léonard: Minimizers of energy functionals..Acta Math. Hungar. 93 (2001), 281-325. Zbl 1052.49017, MR 1925356, 10.1023/A:1017919422086 |
Reference:
|
[42] C. Léonard: Minimizers of energy functionals under not very integrable constraints..J. Convex Anal. 10 (2003), 63-68. MR 1999902 |
Reference:
|
[43] C. Léonard: Minimization of entropy functionals..J. Math. Anal. Appl. 346 (2008), 183-204. Zbl 1152.49039, MR 2428283, 10.1016/j.jmaa.2008.04.048 |
Reference:
|
[44] C. Léonard: Entropic projections and dominating points..ESAIM: Probability and Statistics 14 (2010), 343-381. Zbl 1220.60018, MR 2795471, 10.1051/ps/2009003 |
Reference:
|
[45] F. Liese, I. Vajda: Convex Statistical Distances..Teubner Texte zur Mathematik 95, Teubner Verlag, Leipzig 1986. Zbl 0656.62004, MR 0926905 |
Reference:
|
[46] N. Murata, T. Takenouchi, T. Kanamori, S. Eguchi: Information geometry of U-Boost and Bregman divergence..Neural Computation 16 (2004), 1437-1481. Zbl 1102.68489, 10.1162/089976604323057452 |
Reference:
|
[47] R. T. Rockafellar: Integrals which are convex functionals..Pacific J. Math. 24 (1968), 525-539. Zbl 0324.90061, MR 0236689, 10.2140/pjm.1968.24.525 |
Reference:
|
[48] R. T. Rockafellar: Convex integral functionals and duality..In: Contributions to Nonlinear Functional Analysis (E. H. Zarantonello, ed.), Academic Press, New York 1971, pp. 215-236. Zbl 0326.49008, MR 0390870 |
Reference:
|
[49] R. T. Rockafellar: Convex Analysis..Princeton University Press, Princeton 1970. Zbl 1011.49013, MR 0274683 |
Reference:
|
[50] R. T. Rockafellar, R. J.-B. Wets: Variational Analysis..Springer Verlag, Berlin - Heidel\-berg - New York 2004. Zbl 0888.49001, MR 1491362 |
Reference:
|
[51] M. Teboulle, I. Vajda: Convergence of best $\phi$-entropy estimates..IEEE Trans. Inform. Theory 39 (1993), 297-301. Zbl 0765.94001, MR 1211512, 10.1109/18.179378 |
Reference:
|
[52] F. Topsoe: Information-theoretical optimization techniques..Kybernetika 15 (1979), 8-27. MR 0529888 |
Reference:
|
[53] I. Vajda: Theory of Statistical Inference and Information..Kluwer Academic Puplishers, Dordrecht 1989. Zbl 0711.62002 |
. |