
Article

Title: Consensus clustering with differential evolution (English)
Author: Sabo, Miroslav
Language: English
Journal: Kybernetika
ISSN: 0023-5954 (print)
ISSN: 1805-949X (online)
Volume: 50
Issue: 5
Year: 2014
Pages: 661-678
Summary lang: English
Category: math
Summary: Consensus clustering algorithms are used to improve the properties of traditional clustering methods, especially their accuracy and robustness. In this article, we introduce an approach that refines the set of initial partitions and uses a differential evolution algorithm to find the most valid solution. The properties of the algorithm are demonstrated on four benchmark datasets. (English)
Keyword: consensus clustering
Keyword: differential evolution
Keyword: ensemble
Keyword: data
MSC: 62H30
MSC: 92G30
idZBL: Zbl 1308.62132
idMR: MR3301853
DOI: 10.14736/kyb-2014-5-0661
Date available: 2015-01-13T09:20:18Z
Last updated: 2016-01-03
Stable URL: http://hdl.handle.net/10338.dmlcz/144099
Reference: [1] Agrawal, R., Gehrke, J., Gunopulos, D., Raghavan, P.: Automatic subspace clustering of high dimensional data for data mining applications. In: Proc. 1998 ACM SIGMOD International Conference on Management of Data 27 (1998), 2, pp. 94-105.
Reference: [2] Bache, K., Lichman, M.: UCI machine learning repository, 2013. URL http://archive.ics.uci.edu/ml.
Reference: [3] Bailey, K. D.: Typologies and Taxonomies: An Introduction to Classification Techniques. Sage Publications Inc., Los Angeles 1994.
Reference: [4] Bezdek, J. C.: Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum Press, New York 1981. Zbl 0503.68069, MR 0631231
Reference: [5] Das, S., Abraham, A., Konar, A.: Automatic clustering using an improved differential evolution algorithm. IEEE Trans. Sys. Man Cyber., Part A: Systems and Humans 38 (2008), 1, 218-237. DOI 10.1109/TSMCA.2007.909595
Reference: [6] Dempster, A. P., Laird, N. M., Rubin, D. B.: Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Stat. Soc. Ser. B 39 (1977), 1, 1-38. Zbl 0364.62022, MR 0501537
Reference: [7] Dimitriadou, E.: cclust: Convex Clustering Methods and Clustering Indexes, 2012. URL http://CRAN.R-project.org/package=cclust.
Reference: [8] Dudoit, S., Fridlyand, J.: Bagging to improve the accuracy of a clustering procedure. Bioinformatics 19 (2003), 9, 1090-1099. DOI 10.1093/bioinformatics/btg038
Reference: [9] Ester, M., Kriegel, H. P., Sander, J., Xu, X.: A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proc. 2nd International Conference on Knowledge Discovery and Data Mining 1996, pp. 226-231.
Reference: [10] Fern, X., Brodley, C.: Solving cluster ensemble problems by bipartite graph partitioning. In: Proc. 21st International Conference on Machine Learning 2004, pp. 36-43.
Reference: [11] Fraley, C., Raftery, A. E.: Model-based clustering, discriminant analysis and density estimation. J. Amer. Statist. Assoc. 97 (2002), 611-631. Zbl 1073.62545, MR 1951635, DOI 10.1198/016214502760047131
Reference: [12] Fraley, C., Raftery, A. E.: MCLUST Version 3 for R: Normal Mixture Modeling and Model-Based Clustering. Techn. Report 504, University of Washington, Department of Statistics, 2006.
Reference: [13] Ghaemi, R., Sulaiman, N., Ibrahim, H., Mustapha, N.: A survey: Clustering ensembles techniques. In: Proc. International Conference on Computer, Electrical, and Systems Science, and Engineering (CESSE) 38 (2009), pp. 644-653.
Reference: [14] Ghosh, J., Acharya, A.: Cluster ensembles. Wiley Interdisc. Rev.: Data Mining and Knowledge Discovery 1 (2011), 4, 305-315.
Reference: [15] Gould, S. J.: Full House: The Spread of Excellence from Plato to Darwin. Harmony, New York 1996.
Reference: [16] Halkidi, M., Batistakis, Y., Vazirgiannis, M.: Cluster validity methods: Part I. SIGMOD Record 31 (2002), 2, 40-45. DOI 10.1145/565117.565124
Reference: [17] Handl, J., Knowles, J.: Multi-objective clustering and cluster validation. In: Multi-Objective Machine Learning (Studies in Computational Intelligence, Vol. 16), Springer, Berlin 2006, pp. 21-47.
Reference: [18] Handl, J., Knowles, J.: An evolutionary approach to multiobjective clustering. IEEE Trans. Evolutionary Comput. 11 (2007), 56-76. DOI 10.1109/TEVC.2006.877146
Reference: [19] Handl, J., Knowles, J., Kell, D.: Computational cluster validation in post-genomic data analysis. Bioinformatics 21 (2005), 15, 3201-3212. DOI 10.1093/bioinformatics/bti517
Reference: [20] Hartigan, J., Wong, M.: A k-means clustering algorithm. Applied Statistics 28 (1979), 100-108. Zbl 0447.62062, DOI 10.2307/2346830
Reference: [21] Hornik, K., Feinerer, I., Kober, M., Buchta, C.: Spherical $k$-means clustering. J. Statist. Software 50 (2012), 10, 1-22. DOI 10.18637/jss.v050.i10
Reference: [22] Hruschka, E., Campello, R., Freitas, A., Carvalho, A. de: A survey of evolutionary algorithms for clustering. IEEE Trans. Sys. Man Cyber. Part C: Applications and Reviews 39 (2009), 2, 133-155. DOI 10.1109/TSMCC.2008.2007252
Reference: [23] Jain, A. K.: Data clustering: 50 years beyond k-means. Pattern Recognition Lett. 31 (2010), 8, 651-666.
Reference: [24] Jain, A. K., Murty, M. N., Flynn, P. J.: Data clustering: A review. ACM Comput. Surveys 31 (1999), 3, 264-323.
Reference: [25] Karatzoglou, A., Smola, A., Hornik, K., Zeileis, A.: kernlab - an S4 package for kernel methods in R. J. Statist. Software 11 (2004), 9, 1-20.
Reference: [26] Karypis, G., Aggarwal, R., Kumar, V., Shekhar, S.: Multilevel hypergraph partitioning: Applications in VLSI domain. In: Proc. Design and Automation Conference, 1997, pp. 526-529.
Reference: [27] Kaufman, L., Rousseeuw, P.: Finding Groups in Data: An Introduction to Cluster Analysis. Wiley, New York 1990. MR 1044997
Reference: [28] Krishna, K., Murty, M. Narasimha: Genetic k-means algorithm. Trans. Sys. Man Cyber. Part B 29 (1999), 3, 433-439. DOI 10.1109/3477.764879
Reference: [29] Kwedlo, W.: A clustering method combining differential evolution with the k-means algorithm. Pattern Recognition Lett. 32 (2011), 12, 1613-1621. DOI 10.1016/j.patrec.2011.05.010
Reference: [30] MacQueen, J.: Some methods for classification and analysis of multivariate observations. In: Proc. Fifth Berkeley Symposium on Mathematical Statistics and Probability 1 (1967), pp. 281-297. Zbl 0214.46201, MR 0214227
Reference: [31] Maechler, M., Rousseeuw, P., Struyf, A., Hubert, M., Hornik, K.: cluster: Cluster Analysis Basics and Extensions, 2013. R package version 1.14.4.
Reference: [32] Monti, S., Tamayo, P., Mesirov, J., Golub, T.: Consensus clustering: A resampling-based method for class discovery and visualization of gene expression microarray data. Mach. Learn. 52 (2003), 1-2, 91-118. Zbl 1039.68103, DOI 10.1023/A:1023949509487
Reference: [33] Mullen, K., Ardia, D., Gil, D., Windover, D., Cline, J.: DEoptim: An R package for global optimization by differential evolution. J. Statist. Software 40 (2011), 6, 1-26. DOI 10.18637/jss.v040.i06
Reference: [34] Murthy, C., Chowdhury, N.: In search of optimal clusters using genetic algorithms. Pattern Recognition Lett. 17 (1996), 8, 825-832.
Reference: [35] Pal, S. K., Majumder, D. D.: Fuzzy sets and decision making approaches in vowel and speaker recognition. IEEE Trans. Sys. Man Cyber. 7 (1977), 625-629. DOI 10.1109/TSMC.1977.4309789
Reference: [36] Paterlini, S., Krink, T.: Differential evolution and particle swarm optimisation in partitional clustering. Comput. Statist. Data Anal. 50 (2006), 5, 1220-1247. MR 2224370, DOI 10.1016/j.csda.2004.12.004
Reference: [37] Price, K. V., Storn, R. M., Lampinen, J. A.: Differential Evolution: A Practical Approach to Global Optimization. Springer-Verlag, Berlin 2006. Zbl 1186.90004, MR 2191377
Reference: [38] Raghavan, V., Birchand, K.: A clustering strategy based on a formalism of the reproductive process in a natural system. In: Proc. Second International Conference on Information Storage and Retrieval, 1979, pp. 10-22.
Reference: [39] R Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna 2012. URL http://www.R-project.org/.
Reference: [40] Shi, J., Malik, J.: Normalized cuts and image segmentation. IEEE Trans. Pattern Analysis and Machine Intelligence 22 (2000), 8, 888-905.
Reference: [41] Simovici, D. A., Djeraba, Ch.: Mathematical Tools for Data Mining: Set Theory, Partial Orders, Combinatorics. Advanced Information and Knowledge Processing. Springer, London 2008. Zbl 1151.68386, MR 2451001
Reference: [42] Simpson, T. I., Armstrong, J. D., Jarman, A. P.: Merged consensus clustering to assess and improve class discovery with microarray data. BMC Bioinform. 11 (2010), 11-590. DOI 10.1186/1471-2105-11-590
Reference: [43] Sneath, P. H.: The application of computers to taxonomy. Journal of General Microbiology 17 (1957), 1, 201-226. DOI 10.1099/00221287-17-1-201
Reference: [44] Storn, R., Price, K.: Differential evolution - a simple and efficient heuristic for global optimization over continuous spaces. J. Global Optim. 11 (1997), 4, 341-359. Zbl 0888.90135, MR 1479553, DOI 10.1023/A:1008202821328
Reference: [45] Strehl, A., Ghosh, J.: Cluster ensembles - a knowledge reuse framework for combining partitionings. In: Proc. 11th National Conference On Artificial Intelligence, NCAI, Edmonton, Alberta 2002, pp. 93-98. MR 1991087
Reference: [46] Topchy, A., Jain, A., Punch, W.: A mixture model of clustering ensembles. In: Proc. SIAM International Conference on Data Mining 2004, pp. 22-24.
Reference: [47] Trotter, W. M.: Combinatorics and Partially Ordered Sets. The Johns Hopkins University Press, Baltimore 1992. Zbl 0764.05001, MR 1169299
Reference: [48] Tvrdík, J., Křivý, I.: Differential evolution with competing strategies applied to partitional clustering. Lecture Notes Comput. Sci. 7269 (2012), 136-144. DOI 10.1007/978-3-642-29353-5_16
Reference: [49] Wang, P., Domeniconi, C., Laskey, K.: Nonparametric Bayesian clustering ensembles. Lecture Notes Comput. Sci. 6323 (2010), 3, 435-450. DOI 10.1007/978-3-642-15939-8_28
Reference: [50] Wang, H., Shan, H., Banerjee, A.: Bayesian cluster ensembles. Stat. Anal. Data Min. 4 (2011), 1, 54-70. MR 2814500, DOI 10.1002/sam.10098
Reference: [51] Wikipedia: Partition of a set. http://en.wikipedia.org/wiki/Partition_of_a_set.
Reference: [52] Xu, R., Wunsch, D.: Survey of clustering algorithms. IEEE Trans. Neural Networks 16 (2005), 3, 645-678. DOI 10.1109/TNN.2005.845141
Reference: [53] Zahn, Ch. T.: Graph-theoretic methods for detecting and describing gestalt clusters. IEEE Trans. Comput. 20 (1971), 31, 68-86.