[1] Agresti A.: 
Categorical Data Analysis. Wiley, New York 2002 
MR 1914507
[2] Andersen E. B.:
The Statistical Analysis of Categorical Data. Springer, New York 1990 
Zbl 0871.62050
[3] Ali S. M., Silvey S. D.:
A general class of coefficients of divergence of one distribution from another. J. Roy. Statist. Soc. Ser. B 28 (1966), 131–142
MR 0196777
[4] Csiszár I.:
Eine informationstheoretische Ungleichung und ihre Anwendung auf den Beweis der Ergodizität von Markoffschen Ketten. Publ. Math. Inst. Hungar. Acad. Sci. 8 (1963), 84–108
[5] Dale J. R.: 
Asymptotic normality of goodness-of-fit statistics for sparse product multinomials. J. Roy. Statist. Soc. Ser. B 48 (1986), 48–59
MR 0848050 | Zbl 0611.62017
[6] Haber M., Brown M. B.:
Maximum likelihood methods for log-linear models when expected frequencies are subject to linear constraints. J. Amer. Statist. Assoc. 81 (1986), 477–482 
MR 0845886 | Zbl 0604.62058
[7] Kullback S.:
Kullback information. In: Encyclopedia of Statistical Sciences (S. Kotz and N. L. Johnson, eds.), Wiley, New York 1985, Volume 4, pp. 421–425 
MR 1044999
[10] Powers D. A., Xie Y.:
Statistical Methods for Categorical Data Analysis. Academic Press, San Diego 2000 
MR 1735454 | Zbl 0967.62101
[11] Rényi A.:
On measures of entropy and information. Proc. Fourth Berkeley Symposium on Mathematical Statistics and Probability 1 (1961), pp. 547–561
[12] Vajda I.: 
Theory of Statistical Inference and Information. Kluwer Academic Publishers, Dordrecht 1989 
Zbl 0711.62002