[1] S. M. Ali, S. D. Silvey: 
A general class of coefficients of divergence of one distribution from another. J. Roy. Statist. Soc. Ser. B 28 (1966), 131-142. 
MR 0196777 | 
Zbl 0203.19902 
[2] S. Amari, H. Nagaoka: 
Methods of Information Geometry. Transl. Math. Monographs 191, Oxford Univ. Press, 2000. 
MR 1800071 | 
Zbl 1146.62001 
[3] S. Amari, A. Cichocki: Information geometry of divergence functions. Bull. Polish Acad. Sci. 58 (2010), 183-194.
[4] O. Barndorff-Nielsen: 
Information and Exponential Families in Statistical Theory. Wiley, 1978. 
MR 0489333 | 
Zbl 0387.62011 
[5] H. H. Bauschke, J. M. Borwein: 
Legendre functions and the method of random Bregman projections. J. Convex Anal. 4 (1997), 27-67. 
MR 1459881 | 
Zbl 0894.49019 
[7] A. Ben-Tal, A. Charnes: 
A dual optimization framework for some problems of information theory and statistics. Problems Control Inform. Theory 8 (1979), 387-401. 
MR 0553884 | 
Zbl 0437.90078 
[10] J. M. Borwein, A. S. Lewis: 
Partially-finite programming in $L_1$ and the existence of maximum entropy estimates. SIAM J. Optim. 3 (1993), 248-267. 
DOI 10.1137/0803012 | 
MR 1215444 
[11] J. M. Borwein, A. S. Lewis, D. Noll: 
Maximum entropy spectral analysis using derivative information. Part I: Fisher information and convex duality. Math. Oper. Res. 21 (1996), 442-468. 
DOI 10.1287/moor.21.2.442 | 
MR 1397223 
[13] M. Broniatowski, A. Keziou: 
Minimization of $\phi$-divergences on sets of signed measures. Studia Sci. Math. Hungar. 43 (2006), 403-442. 
MR 2273419 | 
Zbl 1121.28004 
[14] J. P. Burg: Maximum entropy spectral analysis. Paper presented at 37th Meeting of Soc. Explor. Geophysicists, Oklahoma City 1967.
[15] J. P. Burg: Maximum entropy spectral analysis. Ph.D. Thesis, Dept. Geophysics, Stanford Univ., Stanford 1975.
[16] Y. Censor, S. A. Zenios: 
Parallel Optimization. Oxford University Press, New York 1997. 
MR 1486040 | 
Zbl 0945.90064 
[17] N. N. Chentsov: 
Statistical Decision Rules and Optimal Inference. Transl. Math. Monographs 53, American Math. Soc., Providence 1982. Russian original: Nauka, Moscow 1972. 
MR 0645898 | 
Zbl 0484.62008 
[18] I. Csiszár: 
Eine informationstheoretische Ungleichung und ihre Anwendung auf den Beweis der Ergodizität von Markoffschen Ketten. Publ. Math. Inst. Hungar. Acad. Sci. 8 (1963), 85-108. 
MR 0164374 | 
Zbl 0124.08703 
[19] I. Csiszár: 
Information-type measures of difference of probability distributions and indirect observations. Studia Sci. Math. Hungar. 2 (1967), 299-318. 
MR 0219345 | 
Zbl 0157.25802 
[25] I. Csiszár, F. Matúš: 
Convex cores of measures on $\mathcal{R}^d$. Studia Sci. Math. Hungar. 38 (2001), 177-190. 
MR 1877777 
[27] I. Csiszár, F. Matúš: Generalized maximum likelihood estimates for infinite dimensional exponential families. In: Proc. Prague Stochastics'06, Prague 2006, pp. 288-297.
[29] I. Csiszár, F. Matúš: On minimization of entropy functionals under moment constraints. In: Proc. ISIT 2008, Toronto, pp. 2101-2105.
[30] I. Csiszár, F. Matúš: On minimization of multivariate entropy functionals. In: Proc. ITW 2009, Volos, Greece, pp. 96-100.
[31] I. Csiszár, F. Matúš: Minimization of entropy functionals revisited. In: Proc. ISIT 2012, Cambridge, MA, pp. 150-154.
[32] D. Dacunha-Castelle, F. Gamboa: 
Maximum d'entropie et problème des moments. Ann. Inst. H. Poincaré Probab. Statist. 26 (1990), 567-596. 
MR 1080586 | 
Zbl 0788.62007 
[34] S. Eguchi: 
Information geometry and statistical pattern recognition. Sugaku Expositions, Amer. Math. Soc. 19 (2006), 197-216. 
MR 2279777 
[35] B. A. Frigyik, S. Srivastava, M. R. Gupta: 
Functional Bregman divergence and Bayesian estimation of distributions. IEEE Trans. Inform. Theory 54 (2008), 5130-5139. 
DOI 10.1109/TIT.2008.929943 | 
MR 2589887 
[37] E. T. Jaynes: 
Information theory and statistical mechanics. Physical Review Ser. II 106 (1957), 620-630. 
MR 0087305 | 
Zbl 0084.43701 
[38] L. Jones, C. Byrne: 
General entropy criteria for inverse problems with application to data compression, pattern classification and cluster analysis. IEEE Trans. Inform. Theory 36 (1990), 23-30. 
DOI 10.1109/18.50370 | 
MR 1043277 
[42] C. Léonard: 
Minimizers of energy functionals under not very integrable constraints. J. Convex Anal. 10 (2003), 63-88. 
MR 1999902 
[45] F. Liese, I. Vajda: 
Convex Statistical Distances. Teubner Texte zur Mathematik 95, Teubner Verlag, Leipzig 1986. 
MR 0926905 | 
Zbl 0656.62004 
[48] R. T. Rockafellar: 
Convex integral functionals and duality. In: Contributions to Nonlinear Functional Analysis (E. H. Zarantonello, ed.), Academic Press, New York 1971, pp. 215-236. 
MR 0390870 | 
Zbl 0326.49008 
[50] R. T. Rockafellar, R. J.-B. Wets: 
Variational Analysis. Springer Verlag, Berlin - Heidelberg - New York 2004. 
MR 1491362 | 
Zbl 0888.49001 
[52] F. Topsøe: 
Information-theoretical optimization techniques. Kybernetika 15 (1979), 8-27. 
MR 0529888 
[53] I. Vajda: 
Theory of Statistical Inference and Information. Kluwer Academic Publishers, Dordrecht 1989. 
Zbl 0711.62002