Community structure: A comparative evaluation of community detection methods

Vinh Loc Dao; Cécile Bothorel; Philippe Lenca

doi:10.1017/nws.2019.59

Community structure: A comparative evaluation of community detection methods

Published online by Cambridge University Press: 03 January 2020

and

Vinh Loc Dao: Affiliation:
IMT Atlantique, Lab-STICC CNRS UMR 6285, F-29238, Brest, France, (e-mails: loc.daovinh@gmail.com; philippe.lenca@imt-atlantique.fr)
Cécile Bothorel*: Affiliation:
IMT Atlantique, Lab-STICC CNRS UMR 6285, F-29238, Brest, France, (e-mails: loc.daovinh@gmail.com; philippe.lenca@imt-atlantique.fr)
Philippe Lenca: Affiliation:
IMT Atlantique, Lab-STICC CNRS UMR 6285, F-29238, Brest, France, (e-mails: loc.daovinh@gmail.com; philippe.lenca@imt-atlantique.fr)
*: *Corresponding author. Email: cecile.bothorel@imt-atlantique.fr

Article contents

Abstract
References

Get access

Rights & Permissions

Abstract

Discovering community structure in complex networks is a mature field since a tremendous number of community detection methods have been introduced in the literature. Nevertheless, it is still very challenging for practitioners to determine which method would be suitable to get insights into the structural information of the networks they study. Many recent efforts have been devoted to investigating various quality scores of the community structure, but the problem of distinguishing between different types of communities is still open. In this paper, we propose a comparative, extensive, and empirical study to investigate what types of communities many state-of-the-art and well-known community detection methods are producing. Specifically, we provide comprehensive analyses on computation time, community size distribution, a comparative evaluation of methods according to their optimization schemes as well as a comparison of their partitioning strategy through validation metrics. We process our analyses on a very large corpus of hundreds of networks from five different network categories and propose ways to classify community detection methods, helping a potential user to navigate the complex landscape of community detection.

Keywords

community detection community structure comparative analysis empirical analysis computation time community size structural quality function validation metric decision-making assistance for practitioners

Information

Type: Research Article
Information: Network Science , Volume 8 , Issue 1 , March 2020 , pp. 1 - 41

DOI: https://doi.org/10.1017/nws.2019.59 [Opens in a new window]
Copyright: © Cambridge University Press 2020

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

Article purchase

Temporarily unavailable

References

Agreste, S., Meo, P. D., Fiumara, G., Piccione, G., Piccolo, S., Rosaci, D., Sarne, G. M. L., & Vasilakos, A. V. (2017). An empirical comparison of algorithms to find communities in directed graphs and their application in web data analytics. IEEE Transactions on Big Data, 3(3), 289–306.CrossRef Google Scholar

Aldecoa, R., & Marn, I. (2011). Deciphering network community structure by surprise. PLoS ONE, 6(9), e24195.CrossRef Google Scholar PubMed

Ames, B. P. W. (2013). Guaranteed clustering and biclustering via semidefinite programming. Mathematical Programming, 147(1–2), 429–465.CrossRef Google Scholar

Ana, L. N. F., & Jain, A. K. (2003). Robust data clustering. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Proceedings.CrossRef Google Scholar

Arifin, S., Zulkardi, , Putri, R. I. I., Hartono, Y., & Susanti, E. (2017). Developing Ill-defined problem-solving for the context of South Sumatera. Journal of Physics: Conference Series, 943(dec), 012038.Google Scholar

Blondel, V. D, Guillaume, J.-L., Lambiotte, R., & Lefebvre, E. (2008). Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(10), P10008.CrossRef Google Scholar

Bohlin, L., Edler, D., Lancichinetti, A., & Rosvall, M. (2014). Community detection and visualization of networks with the map equation framework. In Measuring scholarly impact (pp. 3–34). Springer International Publishing.Google Scholar

Chakraborty, T., Dalmia, A., Mukherjee, A., & Ganguly, N. (2017). Metrics for community analysis: A survey. ACM Computing Surveys, 50(4), 1–37.CrossRef Google Scholar

Clauset, A., Newman, M. E. J., & Moore, C. (2004). Finding community structure in very large networks. Physical Review E, 70(6).CrossRef Google Scholar PubMed

Cleveland, W. S. (1979). Robust locally weighted regression and smoothing scatterplots. Journal of the American Statistical Association, 74(368), 829–836.CrossRef Google Scholar

Coscia, M., Giannotti, F., & Pedreschi, D. (2011). A classification for community discovery methods in complex networks. Statistical Analysis and Data Mining, 4(5), 512–546.CrossRef Google Scholar

Danon, L., Daz-Guilera, A., Duch, J., & Arenas, A. (2005). Comparing community structure identification. Journal of Statistical Mechanics: Theory and Experiment, 2005(09), P09008–P09008.CrossRef Google Scholar

Dao, V. L., Bothorel, C., & Lenca, P. (2017). Community structures evaluation in complex networks: A descriptive approach. In The 3rd international winter school and conference on network science, NetSciX, 11–19.CrossRef Google Scholar

Dao, V. L., Bothorel, C., & Lenca, P. (2018a). An empirical characterization of community structures in complex networks using a bivariate map of quality metrics. Arxiv e-prints, June.Google Scholar

Dao, V. L., Bothorel, C., & Lenca, P. (2018b). Estimating the similarity of community detection methods based on cluster size distribution. In The 7th international conference on complex networks and their applications. Springer International Publishing.CrossRef Google Scholar

Erdös, P., & Rényi, A. (1959). On random graphs, I. Publicationes mathematicae (debrecen), 6, 290–297.Google Scholar

Fortunato, S., & Barthelemy, M. (2006). Resolution limit in community detection. Proceedings of the National Academy of Sciences, 104(1), 36–41.CrossRef Google Scholar PubMed

Fortunato, S. (2010). Community detection in graphs. Physics Reports, 486(3–5), 75–174.CrossRef Google Scholar

Fortunato, S., & Hric, D. (2016). Community detection in networks: A user guide. Physics Reports, 659(Nov.), 1–44.CrossRef Google Scholar

Ghasemian, A., Hosseinmardi, H., & Clauset, A. (2018). Evaluating overfit and underfit in models of network community structure. Arxiv e-prints, Feb.Google Scholar

Girvan, M., & Newman, M. E. J. (2002). Community structure in social and biological networks. Proceedings of the National Academy of Sciences, 99(12), 7821–7826.CrossRef Google Scholar PubMed

Holland, P. W, Laskey, K. B., & Leinhardt, S. (1983). Stochastic blockmodels: First steps. Social Networks, 5(2), 109–137.CrossRef Google Scholar

Hric, D., Darst, R. K., & Fortunato, S. (2014). Community detection in networks: Structural communities versus ground truth. Physical Review E, 90(6).CrossRef Google Scholar PubMed

Hubert, L., & Arabie, P. (1985). Comparing partitions. Journal of Classification, 2(1), 193–218.CrossRef Google Scholar

Jebabli, M., Cherifi, H., Cherifi, C., & Hamouda, A. (2015). Overlapping community detection versus ground-truth in amazon co-purchasing network. In 2015 11th international conference on signal-image technology & internet-based systems (SITIS) (pp. 328–336). IEEE.CrossRef Google Scholar

Jebabli, M., Cherifi, H., Cherifi, C., & Hamouda, A. (2018). Community detection algorithm evaluation with ground-truth data. Physica A: Statistical Mechanics and Its Applications, 492, 651–706.CrossRef Google Scholar

Jerome, K. (2013). The koblenz network collection. In Proceedings conference on world wide web companion (pp. 1343–1350).Google Scholar

Jr.Ward, J. H. (1963). Hierarchical grouping to optimize an objective function. Journal of the american statistical association, 58(301), 236–244.CrossRef Google Scholar

Kullback, S., & Leibler, R. A. (1951). On information and sufficiency. The Annals of Mathematical Statistics, 22(1), 79–86.CrossRef Google Scholar

Kuncheva, L. I., & Hadjitodorov, S. T. (2004). Using diversity in cluster ensembles. In IEEE international conference on systems, man and cybernetics.CrossRef Google Scholar

Lambiotte, R. (2010). Multi-scale modularity in complex networks. In 8th international symposium on modeling and optimization in mobile, ad hoc, and wireless networks, May (pp. 546–553).Google Scholar

Lancichinetti, A., Kivelä, M., Saramäki, J., & Fortunato, S. (2010). Characterizing the community structure of complex networks. PLoS ONE, 5(8), e11976.CrossRef Google Scholar PubMed

Lancichinetti, A., Radicchi, F., Ramasco, J. J., & Fortunato, S. (2011). Finding statistically significant communities in networks. PLoS ONE, 6(4), e18961.CrossRef Google Scholar PubMed

Leskovec, J., & Krevl, A. (2014). SNAP Datasets: Stanford large network dataset collection. June.Google Scholar

Leskovec, J., Lang, K. J., Dasgupta, A., & Mahoney, M. W. (2008). Statistical properties of community structure in large social and information networks. Proceeding of the 17th international conference on world wide web – WWW 08.CrossRef Google Scholar

Li, Z., Zhang, S., Wang, R.-S., Zhang, X.-S., & Chen, L. (2008). Quantitative function for community detection. Physical Review E, 77(3).CrossRef Google Scholar PubMed

Meilă, M. (2003). Comparing clusterings by the variation of information. Learning Theory and Kernel Machines, 173–187.CrossRef Google Scholar

Meo, P. D., Ferrara, E., Fiumara, G., & Provetti, A. (2014). Mixing local and global information for community detection in large networks. Journal of Computer and System Sciences, 80(1), 72–87.CrossRef Google Scholar

Miyauchi, A., & Kawase, Y. (2016). Z-score-based modularity for community detection in networks. PLOS ONE, 11(1), e0147805.CrossRef Google Scholar PubMed

Newman, M. E. J. (2006a). Finding community structure in networks using the eigenvectors of matrices. Physical Review E, 74(Sep), 036104.CrossRef Google Scholar PubMed

Newman, M. E. J. (2006b). Modularity and community structure in networks. Proceedings of the National Academy of Sciences, 103(23), 8577–8582.CrossRef Google Scholar PubMed

Newman, M. E. J. (2010). Networks: An Introduction. Oxford University Press.CrossRef Google Scholar

Newman, M. E. J., & Girvan, M. (2004). Finding and evaluating community structure in networks. Physical Review E, 69(2).Google Scholar PubMed

Orman, G. K., Labatut, V., & Cherifi, H. (2012). Comparative evaluation of community detection algorithms: A topological approach. Journal of Statistical Mechanics: Theory and Experiment, 2012(08), P08001.CrossRef Google Scholar

Papadopoulos, S., Kompatsiaris, Y., Vakali, A., & Spyridonos, P. (2011). Community detection in social media. Data Mining and Knowledge Discovery, 24(3), 515–554.CrossRef Google Scholar

Peel, L., Larremore, D. B., & Clauset, A. (2017). The ground truth about metadata and community detection in networks. Science Advances, 3(5), e1602548.CrossRef Google Scholar PubMed

Pons, P., & Latapy, M. (2005). Computing communities in large networks using random walks. In Yolum, P., Güngör, T., Gürgen, F., & Özturan, C. (Eds.) Computer and information sciences – ISCIS 2005 (pp. 284–293). Berlin, Heidelberg: Springer.CrossRef Google Scholar

Pons, P., & Latapy, M. (2011). Post-processing hierarchical community structures: Quality improvements and multi-scale view. Theoretical Computer Science, 412(8–10), 892–900.CrossRef Google Scholar

Porter, M. A., Onnela, J.-P., & Mucha, P. J. (2009). Communities in Networks. Notices of the American Mathematical Society, Feb.Google Scholar

Radicchi, F., Castellano, C., Cecconi, F., Loreto, V., & Parisi, D. (2004). Defining and identifying communities in networks. Proceedings of the National Academy of Sciences, 101(9), 2658–2663.CrossRef Google Scholar PubMed

Raghavan, U. N., Albert, R., & Kumara, S. (2007). Near linear time algorithm to detect community structures in large-scale networks. Physical Review E, 76(3).CrossRef Google Scholar PubMed

Rand, W. M. (1971). Objective criteria for the evaluation of clustering methods. Journal of the American Statistical Association, 66(336), 846–850.CrossRef Google Scholar

Reichardt, J., & Bornholdt, S. (2006). Statistical mechanics of community detection. Physical Review E, 74(1).CrossRef Google Scholar PubMed

Riolo, M. A., Cantwell, G. T., Reinert, G., & Newman, M. E. J. (2017). Efficient method for estimating the number of communities in a network. Physical Review E, 96(3).CrossRef Google Scholar PubMed

Rossi, R. A., & Ahmed, N. K. (2015). The network data repository with interactive graph analytics and visualization. In Proceedings of the twenty-ninth aaai conference on artificial intelligence.Google Scholar

Rosvall, M., & Bergstrom, C. T. (2007). An information-theoretic framework for resolving community structure in complex networks. Proceedings of the National Academy of Sciences, 104(18), 7327–7331.CrossRef Google Scholar PubMed

Rosvall, M., Axelsson, D., & Bergstrom, C. T. (2009). The map equation. European Physical Journal Special Topics, 178(Nov.), 13–23.CrossRef Google Scholar

Schaub, M. T., Delvenne, J.-C., Rosvall, M., & Lambiotte, R. (2017). The many facets of community detection in complex networks. Applied Network Science, 2(1).CrossRef Google Scholar PubMed

Silverman, B. W. (1986). Density estimation for statistics and data analysis. London, New York: Chapman and Hall.CrossRef Google Scholar

Traag, V. A., Dooren, P. Van, & Nesterov, Y. (2011). Narrow scope for resolution-limit-free community detection. Physical Review E, 84(1).CrossRef Google Scholar PubMed

Traag, V. A., Aldecoa, R., & Delvenne, J.-C. (2015). Detecting communities using asymptotical surprise. Physical Review E, 92(2).CrossRef Google Scholar PubMed

Vinh, N. X., Epps, J., & Bailey, J. (2010). Information theoretic measures for clusterings comparison: Variants, properties, normalization and correction for chance. J. Mach. Learn. Res., 11(Dec.), 2837–2854.Google Scholar

Wasserman, S. (1994). Social network analysis: Methods and applications. Cambridge, New York: Cambridge University Press.CrossRef Google Scholar

Xie, J., & Szymanski, B. K. (2012). Towards linear time overlapping community detection in social networks. In Advances in knowledge discovery and data mining (pp. 25–36). Berlin, Heidelberg: Springer.CrossRef Google Scholar

Yang, J., & Leskovec, J. (2013). Defining and evaluating network communities based on ground-truth. Knowledge and Information Systems, 42(1), 181–213.CrossRef Google Scholar

Article contents

Community structure: A comparative evaluation of community detection methods

Abstract

Keywords

Information

Access options

Article purchase

Temporarily unavailable

References

Save article to Kindle

Save article to Dropbox

Save article to Google Drive

Reply to: Submit a response

Your details

You have entered the maximum number of contributors

Conflicting interests