Title:
|
About the maximum information and maximum likelihood principles (English) |
Author:
|
Vajda, Igor |
Author:
|
Grim, Jiří |
Language:
|
English |
Journal:
|
Kybernetika |
ISSN:
|
0023-5954 |
Volume:
|
34 |
Issue:
|
4 |
Year:
|
1998 |
Pages:
|
[485]-494 |
Summary lang:
|
English |
Category:
|
math |
Summary:
|
Neural networks with radial basis functions are considered, together with the Shannon information that their output carries about the input. The role of information-preserving input transformations is discussed when the network is specified by the maximum information principle and by the maximum likelihood principle. A transformation is found which simplifies the input structure in the sense that it minimizes the entropy in the class of all information-preserving transformations. Such a transformation need not be unique; under some assumptions it may be any minimal sufficient statistic. (English) |
Keyword:
|
neural networks |
Keyword:
|
radial basis functions |
Keyword:
|
entropy minimization |
MSC:
|
62B10 |
MSC:
|
62M45 |
MSC:
|
68T05 |
MSC:
|
92B20 |
idZBL:
|
Zbl 1274.62644 |
Date available:
|
2009-09-24T19:19:42Z |
Last updated:
|
2015-03-28 |
Stable URL:
|
http://hdl.handle.net/10338.dmlcz/135236 |
Reference:
|
[1] Atick J. J., Redlich A. N.: Towards a theory of early visual processing. Neural Computation 2 (1990), 308–320 10.1162/neco.1990.2.3.308 |
Reference:
|
[2] Attneave F.: Some informational aspects of visual perception. Psychological Review 61 (1954), 183–193 10.1037/h0054663 |
Reference:
|
[3] Becker S., Hinton G. E.: A self-organizing neural network that discovers surfaces in random-dot stereograms. Nature (London) 355 (1992), 161–163 10.1038/355161a0 |
Reference:
|
[4] Broomhead D. S., Lowe D.: Multivariate functional interpolation and adaptive networks. Complex Systems 2 (1988), 321–355 MR 0955557 |
Reference:
|
[5] Casdagli M.: Nonlinear prediction of chaotic time-series. Physica 35D (1989), 335–356 Zbl 0671.62099, MR 1004201 |
Reference:
|
[6] Cover T. M., Thomas J. A.: Elements of Information Theory. Wiley, New York 1991 Zbl 1140.94001, MR 1122806 |
Reference:
|
[7] Dempster A. P., Laird N. M., Rubin D. B.: Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Statist. Soc. Ser. B 39 (1977), 1–38 Zbl 0364.62022, MR 0501537 |
Reference:
|
[8] Devroye L., Győrfi L.: Nonparametric Density Estimation: The $L_1$ View. John Wiley, New York 1985 MR 0780746 |
Reference:
|
[9] Devroye L., Győrfi L., Lugosi G.: A Probabilistic Theory of Pattern Recognition. Springer, New York 1996 MR 1383093 |
Reference:
|
[11] Haykin S.: Neural Networks: A Comprehensive Foundation. Macmillan, New York 1994 Zbl 0934.68076 |
Reference:
|
[12] Hertz J., Krogh A., Palmer R. G.: Introduction to the Theory of Neural Computation. Addison-Wesley, New York, Menlo Park CA, Amsterdam 1991 MR 1096298 |
Reference:
|
[13] Jacobs R. A., Jordan M. I.: A competitive modular connectionist architecture. In: Advances in Neural Information Processing Systems (R. P. Lippmann, J. E. Moody and D. J. Touretzky, eds.), Morgan Kaufmann, San Mateo CA 1991, Vol. 3, pp. 767–773 |
Reference:
|
[14] Kay J.: Feature discovery under contextual supervision using mutual information. In: International Joint Conference on Neural Networks, Baltimore MD 1992, Vol. 4, pp. 79–84 |
Reference:
|
[15] Liese F., Vajda I.: Convex Statistical Distances. Teubner Verlag, Leipzig 1987 Zbl 0656.62004, MR 0926905 |
Reference:
|
[16] Linsker R.: Self-organization in a perceptual network. Computer 21 (1988), 105–117 10.1109/2.36 |
Reference:
|
[17] Linsker R.: Perceptual neural organization: Some approaches based on network models and information theory. Annual Review of Neuroscience 13 (1990), 257–281 10.1146/annurev.ne.13.030190.001353 |
Reference:
|
[18] Lowe D.: Adaptive radial basis function nonlinearities, and the problem of generalization. In: First IEE International Conference on Artificial Neural Networks, 1989, pp. 95–99 |
Reference:
|
[19] Moody J., Darken C.: Fast learning in networks of locally-tuned processing units. Neural Computation 1 (1989), 281–294 10.1162/neco.1989.1.2.281 |
Reference:
|
[20] Palm H. Ch.: A new method for generating statistical classifiers assuming linear mixtures of Gaussian densities. In: Proceedings of the 12th IAPR Int. Conference on Pattern Recognition, IEEE Computer Society Press, Jerusalem 1994, Vol. II, pp. 483–486 |
Reference:
|
[21] Plumbley M. D.: A Hebbian/anti-Hebbian network which optimizes information capacity by orthonormalizing the principal subspace. In: IEE Artificial Neural Networks Conference, ANN-93, Brighton 1992, pp. 86–90 |
Reference:
|
[22] Plumbley M. D., Fallside F.: An information-theoretic approach to unsupervised connectionist models. In: Proceedings of the 1988 Connectionist Models Summer School (D. Touretzky, G. Hinton and T. Sejnowski, eds.), Morgan Kaufmann, San Mateo 1988, pp. 239–245 |
Reference:
|
[23] Poggio T., Girosi F.: Regularization algorithms for learning that are equivalent to multilayer networks. Science 247 (1990), 978–982 MR 1038271, 10.1126/science.247.4945.978 |
Reference:
|
[24] Rissanen J.: Stochastic Complexity in Statistical Inquiry. World Scientific, New Jersey 1989 Zbl 0800.68508, MR 1082556 |
Reference:
|
[25] Specht D. F.: Probabilistic neural networks for classification, mapping or associative memory. In: Proc. of the IEEE Int. Conference on Neural Networks, 1988, Vol. I, pp. 525–532 |
Reference:
|
[26] Shannon C. E.: A mathematical theory of communication. Bell System Technical Journal 27 (1948), 379–423, 623–656 Zbl 1154.94303, MR 0026286, 10.1002/j.1538-7305.1948.tb01338.x |
Reference:
|
[27] Streit R. L., Luginbuhl T. E.: Maximum likelihood training of probabilistic neural networks. IEEE Trans. Neural Networks 5 (1994), 5, 764–783 10.1109/72.317728 |
Reference:
|
[28] Vajda I., Grim J.: Bayesian optimality of decisions is achievable by RBF neural networks. IEEE Trans. Neural Networks, submitted |
Reference:
|
[29] Ukrainec A., Haykin S.: A modular neural network for enhancement of cross-polar radar targets. Neural Networks 9 (1996), 141–168 10.1016/0893-6080(95)00062-3 |
Reference:
|
[30] Uttley A. M.: The transmission of information and the effect of local feedback in theoretical and neural networks. Brain Research 102 (1966), 23–35 |
Reference:
|
[31] Watanabe S., Fukumizu K.: Probabilistic design of layered neural networks based on their unified framework. IEEE Trans. Neural Networks 6 (1995), 3, 691–702 10.1109/72.377974 |
Reference:
|
[32] Xu L., Jordan M. I.: EM learning on a generalized finite mixture model for combining multiple classifiers. In: World Congress on Neural Networks, 1993, Vol. 4, pp. 227–230 |
Reference:
|
[33] Xu L., Krzyżak A., Oja E.: Rival penalized competitive learning for clustering analysis, RBF net and curve detection. IEEE Trans. Neural Networks 4 (1993), 636–649 10.1109/72.238318 |