Uncovering the structure and temporal dynamics of information propagation

MANUEL GOMEZ RODRIGUEZ; JURE LESKOVEC; DAVID BALDUZZI; BERNHARD SCHÖLKOPF

doi:10.1017/nws.2014.3

Uncovering the structure and temporal dynamics of information propagation

Published online by Cambridge University Press: 03 April 2014

MANUEL GOMEZ RODRIGUEZ ,

JURE LESKOVEC ,

DAVID BALDUZZI and

BERNHARD SCHÖLKOPF

Show author details

MANUEL GOMEZ RODRIGUEZ: Affiliation:
Department of Empirical Inference, MPI for Intelligent Systems, Tübingen, Baden-Wurttemberg, Germany (e-mail: manuelgr@tuebingen.mpg.de)
JURE LESKOVEC: Affiliation:
Department of Computer Science, Stanford University, Stanford, CA, USA (e-mail: jure@cs.stanford.edu)
DAVID BALDUZZI: Affiliation:
Machine Learning Laboratory, ETH Zürich, Zürich, Switzerland (e-mail: david.balduzzi@inf.ethz.ch)
BERNHARD SCHÖLKOPF: Affiliation:
Department of Empirical Inference, MPI for Intelligent Systems, Tübingen, Baden-Wurttemberg, Germany (e-mail: bs@tuebingen.mpg.de)

Article contents

Abstract
References

Get access

Rights & Permissions

Abstract

Time plays an essential role in the diffusion of information, influence, and disease over networks. In many cases we can only observe when a node is activated by a contagion—when a node learns about a piece of information, makes a decision, adopts a new behavior, or becomes infected with a disease. However, the underlying network connectivity and transmission rates between nodes are unknown. Inferring the underlying diffusion dynamics is important because it leads to new insights and enables forecasting, as well as influencing or containing information propagation. In this paper we model diffusion as a continuous temporal process occurring at different rates over a latent, unobserved network that may change over time. Given information diffusion data, we infer the edges and dynamics of the underlying network. Our model naturally imposes sparse solutions and requires no parameter tuning. We develop an efficient inference algorithm that uses stochastic convex optimization to compute online estimates of the edges and transmission rates. We evaluate our method by tracking information diffusion among 3.3 million mainstream media sites and blogs, and experiment with more than 179 million different instances of information spreading over the network in a one-year period. We apply our network inference algorithm to the top 5,000 media sites and blogs and report several interesting observations. First, information pathways for general recurrent topics are more stable across time than for on-going news events. Second, clusters of news media sites and blogs often emerge and vanish in a matter of days for on-going news events. Finally, major events, for example, large scale civil unrest as in the Libyan civil war or Syrian uprising, increase the number of information pathways among blogs, and also increase the network centrality of blogs and social media sites.

Keywords

diffusion networks information cascades information propagation meme tracking information networks social networks news media blogs

Information

Type: Research Article
Information: Network Science , Volume 2 , Issue 1 , April 2014 , pp. 26 - 65

DOI: https://doi.org/10.1017/nws.2014.3 [Opens in a new window]
Copyright: Copyright © Cambridge University Press 2014

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

Article purchase

Temporarily unavailable

References

Aalen, O. O., Borgan, Ø., & Gjessing, H. K. (2008). Survival and event history analysis: A process point of view. New York, NY: Springer-Verlag.CrossRef Google Scholar

Adar, E., & Adamic, L. A. (2005). Tracking information epidemics in blogspace. In Proceedings of The 2005 IEEE/WIC/ACM International Conference on Web Intelligence (pp. 207–214). Washington, DC: IEEE.Google Scholar

Agarwal, A., & Duchi, J. C. (2011). Distributed delayed stochastic optimization. In Shawe-Taylor, J., Zemel, R., Bartlett, P., Pereira, F., & Weinberger, K. (Eds.), Advances in Neural Information Processing Systems 24 (NIPS-24) (pp. 451–459). NIPS Foundation.Google Scholar

Aral, S., & Walker, D. (2012). Identifying influential and susceptible members of social networks. Science, 337 (6092), 337–341.CrossRef Google Scholar PubMed

Bach, F., & Moulines, E. (2011). Non-asymptotic analysis of stochastic approximation algorithms for machine learning. In Shawe-Taylor, J., Zemel, R., Bartlett, P., Pereira, F., & Weinberger, K. (Eds.), Advances in Neural Information Processing Systems 24 (NIPS-24) (pp. 451–459). NIPS Foundation.Google Scholar

Barabási, A.-L., & Albert, R. (1999). Emergence of scaling in random networks. Science, 286, 509–512.CrossRef Google Scholar PubMed

Blatt, D., Hero, A. O., & Gauchman, H. (2008). A convergent incremental gradient method with a constant step size. SIAM Journal on Optimization, 18 (1), 29–51.CrossRef Google Scholar

Boyd, S. P., & Vandenberghe, L. (2004). Convex optimization. Cambridge, UK: Cambridge University Press.CrossRef Google Scholar

Brockmann, D., Hufnagel, L., & Geisel, T. (2006). The scaling laws of human travel. Nature, 439 (7075), 462–465.CrossRef Google Scholar PubMed

Chierichetti, F., Kleinberg, J., & Liben-Nowell, D. (2011). Reconstructing patterns of information diffusion from incomplete observations. In Shawe-Taylor, J., Zemel, R., Bartlett, P., Pereira, F., & Weinberger, K. (Eds.), Advances in Neural Information Processing Systems 24 (NIPS-24) (pp. 792–800). NIPS Foundation.Google Scholar

Clauset, A., Moore, C., & Newman, M. E. J. (2008). Hierarchical structure and the prediction of missing links in networks. Nature, 453 (7191), 98–101.CrossRef Google Scholar PubMed

Du, N., Song, L., Gomez-Rodriguez, M., & Zha, H. (2013). Scalable influence estimation in continuous-time diffusion networks. In Burges, C. J. C., Bottou, L., Welling, M., Ghahramani, Z., & Weinberger, K. Q. (Eds.), Advances in Neural Information Processing Systems 25 (NIPS-25) (pp. 2789–2797). NIPS Foundation.Google Scholar

Du, N., Song, L., Smola, A., & Yuan, M. (2012). Learning networks of heterogeneous influence. In Barlett, P., Pereira, F. C. N., Burges, C. J. C., Bottou, L., & Weinberger, K. Q. (Eds.), Advances in Neural Information Processing Systems 25 (NIPS-25) (pp. 2789–2797). NIPS Foundation.Google Scholar

Duchi, J. C., Agarwal, A., Johansson, M., & Jordan, M. I. (2011). Ergodic subgradient descent. Proceedings of the 40th Annual Allerton Conference on Communication, Control, and Computing (pp. 701–706). IEEE.Google Scholar

Erdős, P., & Rényi, A. (1960). On the evolution of random graphs. Publication of the Mathematical Institute of the Hungarian Academy of Science Series B, 5, 17–67.Google Scholar

Gomez-Rodriguez, M., Leskovec, J., & Krause, A. (2010). Inferring networks of diffusion and influence. In Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery in Data Mining (pp. 1019–1028). ACM.Google Scholar

Gomez-Rodriguez, M., Leskovec, J., & Krause, A. (2012). Inferring networks of diffusion and influence. ACM Transactions on Knowledge Discovery from Data, 5 (4), 21:1–37, ACM.CrossRef Google Scholar

Gomez-Rodriguez, M., & Schölkopf, B. (2012a). Influence maximization in continuous time diffusion networks. In Langford, John & Pineau, Joelle (Eds.), Proceedings of the 29th International Conference on Machine Learning (pp. 313–320). Omnipress.Google Scholar

Gomez-Rodriguez, M., & Schölkopf, B. (2012b). Modeling information propagation with survival theory. In Bartlett, P., Pereira, F. C. N., Burges, C. J. C., Bottou, L., & Weinberger, K. Q. (Eds.), Advances in Neural Information Processing Systems 25 (NIPS-25): Workshop in Algorithmic and Statistical Approaches for Large Social Networks. NIPS Foundation.Google Scholar

Gomez-Rodriguez, M., & Schölkopf, B. (2012c). Submodular inference of diffusion networks from multiple trees. In Langford, John & Pineau, Joelle (Eds.), Proceedings of the 29th International Conference on Machine Learning (pp. 489–496). Omnipress.Google Scholar

Grant, M., & Boyd, S. (2010). CVX: Matlab software for disciplined convex programming, version 1.21. Retrieved from http://cvxr.com/cvx (2013).Google Scholar

Hufnagel, L., Brockmann, D., & Geisel, T. (2004). Forecast and control of epidemics in a globalized world. Proceedings of the National Academy of Sciences of the United States of America, 101 (42), 15124.CrossRef Google Scholar

InfoPath. (2013). InfoPath. Retrieved from http://snap.stanford.edu/infopath/ (2013).Google Scholar

Kaplan, E. H. (1989). What are the risks of risky sex? Modeling the AIDS epidemic. Operations Research, 37 (2), 198–209.CrossRef Google Scholar

Kempe, D., Kleinberg, J. M. & Tardos, É. (2003). Maximizing the spread of influence through a social network. In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 137–146). ACM.Google Scholar

Lappas, T., Terzi, E., Gunopulos, D., & Mannila, H. (2010). Finding effectors in social networks. In Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1059–1068). ACM.CrossRef Google Scholar

Lawless, J. F. (1982). Statistical models and methods for lifetime data. New York, NY: Wiley.Google Scholar

Leskovec, J., Adamic, L. A., & Huberman, B. A. (2006). The dynamics of viral marketing. In Proceedings of the 7th ACM Conference on Electronic Commerce (pp. 228–237). ACM.CrossRef Google Scholar

Leskovec, J., Backstrom, L., & Kleinberg, J. (2009). Meme-tracking and the dynamics of the news cycle. In Proceedings of the 15th ACM Sigkdd International Conference on Knowledge Discovery and Data Mining (pp. 497–506). New York, NY: ACM.CrossRef Google Scholar

Leskovec, J., Chakrabarti, D., Kleinberg, J., Faloutsos, C., & Ghahramani, Z. (2010). Kronecker graphs: An approach to modeling networks. Journal of Machine Learning Research, 11, 985–1042.Google Scholar

Leskovec, J., Krause, A., Guestrin, C., Faloutsos, C., VanBriesen, J., & Glance, N. (2007b). Cost-effective outbreak detection in networks. In Proceedings of the 13th ACM Sigkdd International Conference on Knowledge Discovery and Data Mining (pp. 420–429).CrossRef Google Scholar

Leskovec, J., Lang, K. J., Dasgupta, A., & Mahoney, M. W. (2008). Statistical properties of community structure in large social and information networks. In Proceedings of the 17th International Conference on World Wide Web (pp. 695–704). ACM.CrossRef Google Scholar

Leskovec, J., McGlohon, M., Faloutsos, C., Glance, N., & Hurst, M. (2007a). Cascading behavior in large blog graphs. In Proceedings of the SIAM Conference on Data Mining (pp. 551–556). SIAM.Google Scholar

Liben-Nowell, D., & Kleinberg, J. (2008). Tracing the flow of information on a global scale using Internet chain-letter data. Proceedings of the National Academy of Sciences, 105 (12), 4633–4638.CrossRef Google Scholar PubMed

Lipsitch, M., Cohen, T., Cooper, B., Robins, J. M., Ma, S., James, L., . . . Murray, M. (2003). Transmission dynamics and control of severe acute respiratory syndrome. Science, 300 (5627), 1966.CrossRef Google Scholar PubMed

Merriam-Webster's collegiate dictionary. 2004. Springfield, MA: Merriam-Webster.Google Scholar

Miller, M., Sathi, C., Wiesenthal, D., Leskovec, J., & Potts, C. (2011). Sentiment flow through hyperlink networks. In Proceedings of the 5th International AAAI Conference on Weblogs and Social Media. AAAI.Google Scholar

Myers, S., & Leskovec, J. (2010). On the convexity of latent social network inference. In Lafferty, J., Williams, C. K. I., Shawe-Taylor, J., Zemel, R. S., & Culotta, A. (Eds.), Advances in Neural Information Processing Systems 23 (NIPS-23) (pp. 1741–1749). NIPS Foundation.Google Scholar

Myers, S., & Leskovec, J. (2012). Clash of the contagions: Cooperation and competition in information diffusion. Proceedings of the IEEE International Conference on Data Mining (pp. 539–548). IEEE.Google Scholar

Myers, S., Leskovec, J., & Zhu, C. (2012). Information diffusion and external influence in networks. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 33–41). ACM.Google Scholar

Nemirovski, A., Juditsky, A., Lan, G., & Shapiro, A. (2009). Robust stochastic approximation approach to stochastic programming. SIAM Journal on Optimization, 19 (4), 1574.CrossRef Google Scholar

Netrapalli, P., & Sanghavi, S. (2012). Finding the graph of epidemic cascades. In Proceedings of the 12th ACM SIGMETRICS/Performance. PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems (pp. 211–222). ACM.Google Scholar

NetRate. (2011). NetRate. Retrieved from http://people.tue.mpg.de/manuelgr/netrate/ (2013).Google Scholar

Newey, W. K., & McFadden, D. L. (1994). Large sample estimation and hypothesis testing. In Engle, R. F. & McFadden, D. L. (Eds.), Handbook of econometrics, Vol. IV (pp. 2111–2245). Amsterdam, Netherlands: Elsevier Science B.V.Google Scholar

Prakash, B. A., Beutel, A., Rosenfeld, R., & Faloutsos, C. (2012). Winner takes all: Competing viruses or ideas on fair-play networks. In Proceedings of the 21st International Conference on World Wide Web (pp. 1037–1046). ACM.CrossRef Google Scholar

Robbins, H., & Monro, S. (1951). A stochastic approximation method. The Annals of Mathematical Statistics, 22 (3), 400–407.CrossRef Google Scholar

Rogers, E. M. (1995). Diffusion of innovations (4th ed.). New York, NY: Free Press.Google Scholar PubMed

Romero, D. M., Meeder, B., & Kleinberg, J. (2011). Differences in the mechanics of information diffusion across topics: Idioms, political hashtags, and complex contagion on twitter. In Proceedings of the 20th International Conference on World Wide Web (pp. 695–704). ACM.CrossRef Google Scholar

Roux, N. L., Schmidt, M., & Bach, F. (2012). A stochastic gradient method with an exponential convergence rate for strongly-convex optimization with finite training sets. In Bartlett, P., Pereira, F. C. N., Burges, C. J. C., Bottou, L., & Weinberger, K. Q. (Eds.), Advances in Neural Information Processing Systems 25 (NIPS-25) (pp. 2672–2680). NIPS Foundation.Google Scholar

Sadikov, S., Medina, M., Leskovec, J., & Garcia-Molina, H. (2011). Correcting for missing data in information cascades. In Proceedings of the 4th ACM International Conference on Web Search and Data Mining (pp. 55–64). ACM.CrossRef Google Scholar

SNAP. (2012). SNAP: Stanford network analysis platform. Retrieved from http://snap.stanford.edu (2013).Google Scholar

Snowsill, T. M., Fyson, N., De Bie, T., & Cristianini, N. (2011). Refining causality: Who copied from whom? In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 466–474). ACM.CrossRef Google Scholar

Vu, D. Q., Asuncion, A. U., Hunter, D. R., & Smyth, P. (2011). Continuous-time regression models for longitudinal networks. In Shawe-Taylor, J., Zemel, R., Bartlett, P., Pereira, F., & Weinberger, K. (Eds.), Advances in Neural Information Processing Systems 24 (NIPS-24) (pp. 2492–2500). NIPS Foundation.Google Scholar

Wallinga, J., & Teunis, P. (2004). Different epidemic curves for severe acute respiratory syndrome reveal similar impacts of control measures. American Journal of Epidemiology, 160 (6), 509–516.CrossRef Google Scholar PubMed

Wang, C., Knight, J. C., & Elder, M. C. (2000). On computer viral infection and the effect of immunization. In Proceedings of the 16th Annual Conference on Computer Security Applications (pp. 246–256). IEEE Computer Society.Google Scholar

Watts, D. J., & Dodds, P. S. (2007). Influentials, networks, and public opinion formation. Journal of Consumer Research, 34 (4), 441–458.CrossRef Google Scholar

Article contents

Uncovering the structure and temporal dynamics of information propagation

Abstract

Keywords

Information

Access options

Article purchase

Temporarily unavailable

References

Save article to Kindle

Save article to Dropbox

Save article to Google Drive

Reply to: Submit a response

Your details

You have entered the maximum number of contributors

Conflicting interests