Article

Keywords:
mixture models
Summary:
An iterative fuzzy clustering method is proposed to partition a set of multivariate binary observation vectors located at neighboring geographic sites. The method applies, in a binary setting, a recently proposed algorithm called Neighborhood EM, which seeks a partition that is both well clustered in the feature space and spatially regular [AmbroiseNEM1996]. This approach is derived from the EM algorithm applied to mixture models [Dempster1977], viewed as an alternating optimization method [Hathaway1986]. The criterion optimized by EM is penalized by a spatial smoothing term that favors assigning neighboring sites to the same class. The resulting algorithm has a structure similar to EM, with an unchanged M-step and an iterative E-step. The criterion optimized by Neighborhood EM is closely related to a posterior distribution with a multilevel logistic Markov random field as prior [Besag1986,Geman1984]. The application of this approach to binary data relies on a mixture of multivariate Bernoulli distributions [Govaert1990]. Experiments on simulated spatial binary data yield encouraging results.
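The structure described in the summary (an unchanged M-step and an iterative E-step, applied to a Bernoulli mixture) can be illustrated with a minimal sketch. The summary gives no formulas, so the following are assumptions: the E-step is written as the fixed-point update c_ik ∝ π_k f_k(x_i) exp(β Σ_j v_ij c_jk), which is the form usually given for Neighborhood EM. The function name `nem_bernoulli`, the neighborhood matrix `W`, the smoothing weight `beta`, and the inner iteration count `n_inner` are hypothetical names chosen for illustration. This is not the authors' implementation.

```python
import numpy as np

def nem_bernoulli(X, W, n_classes, beta=1.0, n_iter=50, n_inner=10, seed=0):
    """Sketch of Neighborhood EM for a mixture of multivariate Bernoullis.

    X : (n, d) binary data, one row per geographic site.
    W : (n, n) symmetric neighborhood weights with zero diagonal (assumed).
    Returns fuzzy memberships c (n, K), proportions pi (K,), and
    Bernoulli parameters alpha (K, d) from the last M-step.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    eps = 1e-6  # keeps logarithms and divisions well defined
    # Random fuzzy initial partition: each row sums to one.
    c = rng.dirichlet(np.ones(n_classes), size=n)
    for _ in range(n_iter):
        # M-step, identical to plain EM for a Bernoulli mixture.
        nk = c.sum(axis=0) + eps
        pi = nk / nk.sum()
        alpha = np.clip((c.T @ X) / nk[:, None], eps, 1.0 - eps)
        # log(pi_k) + log f_k(x_i) for every site i and class k.
        log_f = (X @ np.log(alpha).T
                 + (1.0 - X) @ np.log(1.0 - alpha).T
                 + np.log(pi))
        # Iterative E-step: fixed-point iteration on the memberships,
        # with the spatial term beta * sum_j W[i, j] * c[j, k].
        for _ in range(n_inner):
            logits = log_f + beta * (W @ c)
            logits -= logits.max(axis=1, keepdims=True)  # numerical stability
            c = np.exp(logits)
            c /= c.sum(axis=1, keepdims=True)
    return c, pi, alpha
```

With `beta=0` the spatial term vanishes and the inner loop reduces to the ordinary E-step, so the sketch falls back to standard EM for a Bernoulli mixture. A hard partition, if needed, can be read off with `c.argmax(axis=1)`.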
References:
[1] Ambroise C.: Approche probabiliste en classification automatique et contraintes de voisinage. PhD Thesis, Université de Technologie de Compiègne 1996
[2] Ambroise C., Dang M. V., Govaert G.: Clustering of spatial data by the EM algorithm. In: geoENV I – Geostatistics for Environmental Applications (A. Soares, J. Gómez-Hernandez and R. Froidevaux, eds.), Kluwer Academic Publishers 1997, pp. 493–504
[3] Ambroise C., Govaert G.: An iterative algorithm for spatial clustering, submitted.
[4] Berry B. J. L.: Essay on Commodity Flows and the Spatial Structure of the Indian Economy. Research paper 111, Department of Geography, University of Chicago 1966
[5] Besag J. E.: Spatial analysis of dirty pictures. J. Roy. Statist. Soc. 48 (1986), 259–302 MR 0876840
[6] Bezdek J. C., Castelaz P. F.: Prototype classification and feature selection with fuzzy sets. IEEE Trans. Systems Man Cybernet. SMC-7 (1977), 2, 87–92 DOI 10.1109/TSMC.1977.4309659 | Zbl 0359.68120
[7] Celeux G., Govaert G.: Clustering criteria for discrete data and latent class models. J. Classification 8 (1991), 157–176 DOI 10.1007/BF02616237 | Zbl 0775.62150
[8] Chalmond B.: An iterative Gibbsian technique for reconstruction of m-ary images. Pattern Recognition 22 (1989), 6, 747–761 DOI 10.1016/0031-3203(89)90011-3
[9] Dempster A. P., Laird N. M., Rubin D. B.: Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Statist. Soc. 39 (1977), 1–38 MR 0501537 | Zbl 0364.62022
[10] Geman S., Geman D.: Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans. Pattern Analysis Machine Intelligence PAMI-6 (1984), 721–741 DOI 10.1109/TPAMI.1984.4767596 | Zbl 0573.62030
[11] Govaert G.: Classification binaire et modèles. Rev. Statist. Appl. 38 (1990), 1, 67–81
[12] Hathaway R. J.: Another interpretation of the EM algorithm for mixture distributions. Statist. Probab. Lett. 4 (1986), 53–56 DOI 10.1016/0167-7152(86)90016-7 | MR 0829432 | Zbl 0585.62052
[13] Legendre P.: Constrained clustering. Develop. Numerical Ecology. NATO ASI Series G 14 (1987), 289–307 MR 0913543