Exemplar-Based Clustering via Simulated Annealing

Michael J. Brusco; Hans-Friedrich Köhn

doi:10.1007/s11336-009-9115-2

Published online by Cambridge University Press: 01 January 2025

Michael J. Brusco*, Florida State University
Hans-Friedrich Köhn, University of Missouri-Columbia
*: Requests for reprints should be sent to Michael J. Brusco, Department of Marketing, College of Business, Florida State University, Tallahassee, FL 32306-1110, USA. E-mail: mbrusco@cob.fsu.edu

Abstract

Several authors have touted the p-median model as a plausible alternative to within-cluster sums of squares (i.e., K-means) partitioning. Purported advantages of the p-median model include the provision of “exemplars” as cluster centers, robustness with respect to outliers, and the accommodation of a diverse range of similarity data. We developed a new simulated annealing heuristic for the p-median problem and completed a thorough investigation of its computational performance. The salient findings from our experiments are that our new method substantially outperforms a previous implementation of simulated annealing and is competitive with the most effective metaheuristics for the p-median problem.
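As a rough illustration of the two ideas named in the abstract, the sketch below evaluates a p-median objective (the sum, over all objects, of the dissimilarity to the nearest exemplar) and minimizes it with a generic simulated annealing loop. It is not the authors' heuristic, whose details the abstract does not give. The function names, the single-swap move, the geometric cooling schedule, and all parameter defaults are assumptions chosen for illustration only.

```python
import math
import random


def pmedian_cost(dist, exemplars):
    """Sum over all objects of the dissimilarity to the nearest exemplar."""
    return sum(min(dist[i][j] for j in exemplars) for i in range(len(dist)))


def anneal_pmedian(dist, p, temp=1.0, cooling=0.9, sweeps=50,
                   min_temp=1e-4, seed=0):
    """Generic simulated annealing with single-swap moves (illustrative only).

    All defaults are hypothetical; they are not taken from the article.
    """
    rng = random.Random(seed)
    n = len(dist)
    current = rng.sample(range(n), p)          # random initial exemplar set
    current_cost = pmedian_cost(dist, current)
    best, best_cost = list(current), current_cost
    while temp > min_temp:
        for _ in range(sweeps):
            non_exemplars = [i for i in range(n) if i not in current]
            if not non_exemplars:              # p == n: nothing to swap
                break
            # Propose replacing one exemplar with one non-exemplar.
            candidate = list(current)
            candidate[rng.randrange(p)] = rng.choice(non_exemplars)
            cand_cost = pmedian_cost(dist, candidate)
            delta = cand_cost - current_cost
            # Metropolis rule: always accept improvements; accept a worse
            # solution with probability exp(-delta / temp).
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                current, current_cost = candidate, cand_cost
                if current_cost < best_cost:
                    best, best_cost = list(current), current_cost
        temp *= cooling                        # geometric cooling (assumed)
    return sorted(best), best_cost
```

Calling `anneal_pmedian(dist, p)` with a square dissimilarity matrix `dist` (a list of lists) returns the indices of the selected exemplars and the corresponding objective value. Because the objective needs only pairwise dissimilarities, the same sketch applies to non-Euclidean or otherwise arbitrary proximity data, which is one of the advantages the abstract attributes to the p-median model.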

Keywords

cluster analysis; partitioning; heuristics; p-median model; simulated annealing

Information

Type: Theory and Methods
Information: Psychometrika, Volume 74, Issue 3, September 2009, pp. 457–475

DOI: https://doi.org/10.1007/s11336-009-9115-2
Copyright: Copyright © 2009 The Psychometric Society

Footnotes

An erratum to this article can be found at http://dx.doi.org/10.1007/s11336-009-9140-1

Erratum to: Exemplar-Based Clustering via Simulated Annealing

Michael J. Brusco and Hans-Friedrich Köhn

Psychometrika, Volume 74, Issue 4
