Published online by Cambridge University Press: 03 April 2014
Time plays an essential role in the diffusion of information, influence, and disease over networks. In many cases we can only observe when a node is activated by a contagion—when a node learns about a piece of information, makes a decision, adopts a new behavior, or becomes infected with a disease. However, the underlying network connectivity and transmission rates between nodes are unknown. Inferring the underlying diffusion dynamics is important because it leads to new insights and enables forecasting, as well as influencing or containing information propagation. In this paper we model diffusion as a continuous temporal process occurring at different rates over a latent, unobserved network that may change over time. Given information diffusion data, we infer the edges and dynamics of the underlying network. Our model naturally imposes sparse solutions and requires no parameter tuning. We develop an efficient inference algorithm that uses stochastic convex optimization to compute online estimates of the edges and transmission rates. We evaluate our method by tracking information diffusion among 3.3 million mainstream media sites and blogs, and experiment with more than 179 million different instances of information spreading over the network in a one-year period. We apply our network inference algorithm to the top 5,000 media sites and blogs and report several interesting observations. First, information pathways for general recurrent topics are more stable across time than for on-going news events. Second, clusters of news media sites and blogs often emerge and vanish in a matter of days for on-going news events. Finally, major events, for example, large scale civil unrest as in the Libyan civil war or Syrian uprising, increase the number of information pathways among blogs, and also increase the network centrality of blogs and social media sites.