Abstract
Clustering data by identifying a subset of representative examples is important for processing sensory signals and detecting patterns in data. Such “exemplars” can be found by randomly choosing an initial subset of data points and then iteratively refining it, but this works well only if that initial choice is close to a good solution. We devised a method called “affinity propagation,” which takes as input measures of similarity between pairs of data points. Real-valued messages are exchanged between data points until a high-quality set of exemplars and corresponding clusters gradually emerges. We used affinity propagation to cluster images of faces, detect genes in microarray data, identify representative sentences in this manuscript, and identify cities that are efficiently accessed by airline travel. Affinity propagation found clusters with much lower error than other methods, and it did so in less than one-hundredth the amount of time.
References
23
Referenced
5,288
- J. MacQueen, in Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, L. Le Cam, J. Neyman, Eds. (Univ. of California Press, Berkeley, CA, 1967), vol. 1, pp. 281–297. / Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability (1967)
- Supporting material is available on Science Online.
- Software implementations of affinity propagation along with the data sets and similarities used to obtain the results described in this manuscript are available at www.psi.toronto.edu/affinitypropagation.
10.1038/ng1630
10.1093/nar/gkg111
- C. D. Manning H. Schutze Foundations of Statistical Natural Language Processing (MIT Press Cambridge MA 1999).
10.1073/pnas.79.8.2554
10.1006/jcss.2002.1882
10.1109/TIT.2005.850085
10.1109/18.910572
-
A. P. Dempster, N. M. Laird, D. B. Rubin, Proc. R. Stat. Soc. B39, 1 (1977).
(
10.1111/j.2517-6161.1977.tb01600.x
) / Proc. R. Stat. Soc. B (1977) - S. Dasgupta, L. J. Schulman, Proc. 16th Conf. UAI (Morgan Kaufman, San Francisco, CA, 2000), pp. 152–159. / Proc. 16th Conf. UAI (2000)
10.1198/1061860043001
- R. R. Sokal, C. D. Michener, Univ. Kans. Sci. Bull.38, 1409 (1958). / Univ. Kans. Sci. Bull. (1958)
10.1109/34.868688
10.1126/science.1086309
- J. Pearl Probabilistic Reasoning in Intelligent Systems (Morgan Kaufman San Mateo CA 1988).
10.1109/18.748992
10.1109/26.539767
10.1126/science.1073287
- B. J. Frey, R. Koetter, N. Petrovic, in Proc. 14th Conf. NIPS (MIT Press, Cambridge, MA, 2002), pp. 737–743. / Proc. 14th Conf. NIPS (2002)
- T. Meltzer, C. Yanover, Y. Weiss, in Proc. 10th Conf. ICCV (IEEE Computer Society Press, Los Alamitos, CA, 2005), pp. 428–435. / Proc. 10th Conf. ICCV (2005)
- We thank B. Freeman G. Hinton R. Koetter Y. LeCun S. Roweis and Y. Weiss for helpful discussions and P. Dayan G. Hinton D. MacKay M. Mezard S. Roweis and C. Tomasi for comments on a previous draft of this manuscript. We acknowledge funding from Natural Sciences and Engineering Research Council of Canada Genome Canada/Ontario Genomics Institute and the Canadian Institutes of Health Research. B.J.F. is a Fellow of the Canadian Institute for Advanced Research.
Dates
Type | When |
---|---|
Created | 18 years, 7 months ago (Jan. 11, 2007, 8:49 p.m.) |
Deposited | 7 months, 2 weeks ago (Jan. 12, 2025, 5:32 p.m.) |
Indexed | 21 minutes ago (Aug. 27, 2025, 8:11 p.m.) |
Issued | 18 years, 6 months ago (Feb. 16, 2007) |
Published | 18 years, 6 months ago (Feb. 16, 2007) |
Published Print | 18 years, 6 months ago (Feb. 16, 2007) |
@article{Frey_2007, title={Clustering by Passing Messages Between Data Points}, volume={315}, ISSN={1095-9203}, url={http://dx.doi.org/10.1126/science.1136800}, DOI={10.1126/science.1136800}, number={5814}, journal={Science}, publisher={American Association for the Advancement of Science (AAAS)}, author={Frey, Brendan J. and Dueck, Delbert}, year={2007}, month=feb, pages={972–976} }