

Title: Mixture of experts architectures for neural networks as a special case of conditional expectation formula (English)
Author: Grim, Jiří
Language: English
Journal: Kybernetika
ISSN: 0023-5954
Volume: 34
Issue: 4
Year: 1998
Pages: [417]-422
Summary lang: English
Category: math
Summary: Recently, an interesting new neural network architecture called “mixture of experts” has been proposed as a tool for real multivariate approximation or prediction. We show that the underlying problem is closely related to approximating the joint probability density of the involved variables by a finite mixture. In particular, assuming normal mixtures, we can explicitly write the conditional expectation formula, which can be interpreted as a mixture-of-experts network. In this way the related optimization problem can be reduced to standard estimation of normal mixtures by means of the EM algorithm. The resulting prediction is optimal in the sense of minimum dispersion if the assumed mixture model is true. It is shown that some of the recently published results can be obtained by specifying the normal components of the mixtures in a special form. (English)
Keyword: neural networks
Keyword: mixtures
Keyword: multivariate approximation
Keyword: prediction
MSC: 68T05
MSC: 92B20
idZBL: Zbl 1274.68314
Date available: 2009-09-24T19:18:16Z
Last updated: 2015-03-28
Stable URL:
Reference: [1] Dempster A. P., Laird N. M., Rubin D. B.: Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Statist. Soc. ser. B 39 (1977), 1–38 Zbl 0364.62022, MR 0501537
Reference: [2] Grim J.: On numerical evaluation of maximum–likelihood estimates for finite mixtures of distributions. Kybernetika 18 (1982), 3, 173–190 Zbl 0489.62028, MR 0680154
Reference: [3] Grim J.: Maximum likelihood design of layered neural networks. In: IEEE Proceedings of the 13th International Conference on Pattern Recognition, IEEE Press 1996, pp. 85–89
Reference: [4] Grim J.: Design of multilayer neural networks by information preserving transforms. In: Proc. 3rd Systems Science European Congress (E. Pessa, M. B. Penna and A. Montesanto, eds.), Edizioni Kappa, Roma 1996, pp. 977–982
Reference: [5] Jacobs R. A., Jordan M. I., Nowlan S. J., Hinton G. E.: Adaptive mixtures of local experts. Neural Comp. 3 (1991), 79–87 DOI 10.1162/neco.1991.3.1.79
Reference: [6] Jordan M. I., Jacobs R. A.: Hierarchical mixtures of experts and the EM algorithm. Neural Comp. 6 (1994), 181–214 DOI 10.1162/neco.1994.6.2.181
Reference: [7] Chen, Ke, Xie, Dahong, Chi, Huisheng: A modified HME architecture for text–dependent speaker identification. IEEE Trans. Neural Networks 7 (1996), 1309–1313 DOI 10.1109/72.536325
Reference: [8] Ramamurti V., Ghosh J.: Structural adaptation in mixtures of experts. In: IEEE Proceedings of the 13th International Conference on Pattern Recognition, IEEE Press, 1996, pp. 704–708
Reference: [9] Titterington D. M., Smith A. F. M., Makov U. E.: Statistical Analysis of Finite Mixture Distributions. John Wiley & Sons, Chichester – Singapore – New York 1985 Zbl 0646.62013, MR 0838090
Reference: [10] Vajda I.: Theory of Statistical Inference and Information. Kluwer, Boston 1992 Zbl 0711.62002
Reference: [11] Wu C. F. J.: On the convergence properties of the EM algorithm. Ann. Statist. 11 (1983), 95–103 Zbl 0517.62035, MR 0684867, DOI 10.1214/aos/1176346060
Reference: [12] Xu L., Jordan M. I.: On convergence properties of the EM algorithm for Gaussian mixtures. Neural Comp. 8 (1996), 129–151 DOI 10.1162/neco.1996.8.1.129
Reference: [13] Xu L., Jordan M. I., Hinton G. E.: A modified gating network for the mixtures of experts architecture. In: Proc. WCNN’94, San Diego 1994, Vol. 2, pp. 405–410
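
Note on the Summary: the conditional expectation formula it mentions can be illustrated with a minimal sketch, assuming the joint density of (x, y) is a finite normal mixture whose parameters are already known (for instance from an EM fit). The function names, argument layout, and the use of NumPy are assumptions made for illustration and are not taken from the paper. Each component contributes a linear "expert" (its own regression of y on x) and a "gate" (its normalized posterior weight given x).

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Density of a multivariate normal N(mean, cov) at the point x."""
    d = x.shape[0]
    diff = x - mean
    solved = np.linalg.solve(cov, diff)            # cov^{-1} (x - mean)
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    return float(np.exp(-0.5 * diff @ solved) / norm)

def conditional_expectation(x, weights, means, covs):
    """Return (E[y | x], gates) under a normal mixture on the joint vector (x, y).

    Hypothetical layout: each mean has length dx + dy with the x-part first,
    and each covariance matrix is partitioned in the same order.
    """
    dx = x.shape[0]
    gates, experts = [], []
    for w, mu, cov in zip(weights, means, covs):
        mu_x, mu_y = mu[:dx], mu[dx:]
        cov_xx = cov[:dx, :dx]
        cov_yx = cov[dx:, :dx]
        # Gate: component weight times the marginal density of x.
        gates.append(w * gaussian_pdf(x, mu_x, cov_xx))
        # Expert: conditional mean of y given x within this component.
        experts.append(mu_y + cov_yx @ np.linalg.solve(cov_xx, x - mu_x))
    gates = np.array(gates)
    gates = gates / gates.sum()                    # normalize to posterior weights
    return gates @ np.array(experts), gates

# Usage: two components with one-dimensional x and y (illustrative values only).
weights = [0.5, 0.5]
means = [np.array([-2.0, -1.0]), np.array([2.0, 1.0])]
covs = [np.array([[1.0, 0.5], [0.5, 1.0]]),
        np.array([[1.0, -0.5], [-0.5, 1.0]])]
prediction, gates = conditional_expectation(np.array([0.0]), weights, means, covs)
print(prediction, gates)
```

The sketch does not guard against numerical underflow of the gates for inputs far from every component; a log-domain normalization would be one way to handle that.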


File: Kybernetika_34-1998-4_11.pdf (740.3 KB, application/pdf)