Data-driven traffic and diffusion modeling in peer-to-peer networks: A real case study

Published online by Cambridge University Press:  18 November 2014

ROMAIN HOLLANDERS
Affiliation:
UCLouvain – Université catholique de Louvain / INMA, Avenue G. Lemaître 4, 1348 Louvain-la-Neuve, Belgium (e-mail: romain.hollanders@uclouvain.be)
DANIEL F. BERNARDES
Affiliation:
LIP6 – CNRS and Université Pierre et Marie Curie / Paris 6, Place Jussieu, 75252 Paris cedex 05, France (e-mail: daniel.bernardes@lip6.fr)
BIVAS MITRA
Affiliation:
UCLouvain – Université catholique de Louvain / INMA, Avenue G. Lemaître 4, 1348 Louvain-la-Neuve, Belgium; Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur, 721302, India (e-mail: bivas@cse.iitkgp.ernet.in)
RAPHAËL M. JUNGERS
Affiliation:
UCLouvain – Université catholique de Louvain / INMA, Avenue G. Lemaître 4, 1348 Louvain-la-Neuve, Belgium; F.R.S./FNRS Research Associate, Rue d'Egmont 5, 1000 Bruxelles, Belgium (e-mail: raphael.jungers@uclouvain.be)
JEAN-CHARLES DELVENNE
Affiliation:
UCLouvain – Université catholique de Louvain / INMA, Avenue G. Lemaître 4, 1348 Louvain-la-Neuve, Belgium (e-mail: jean-charles.delvenne@uclouvain.be)
FABIEN TARISSAN
Affiliation:
LIP6 – CNRS and Université Pierre et Marie Curie / Paris 6, Place Jussieu, 75252 Paris cedex 05, France (e-mail: fabien.tarissan@lip6.fr)

Abstract

Peer-to-peer systems have attracted considerable attention over the past decade as they have become a major source of Internet traffic. The amount of data flowing through peer-to-peer networks is huge and hence challenging both to comprehend and to control. In this work, we take advantage of a new and rich dataset recording peer-to-peer activity at a remarkable scale to address these difficult problems. After extracting the relevant and measurable properties of the network from the data, we develop two models that link the low-level properties of the network, such as the proportion of peers that do not share content (i.e., free riders) or the distribution of files among the peers, to its high-level properties, such as the Quality of Service or the diffusion of content, which are of interest for supervision and control purposes. We observe significant agreement between the high-level properties measured on the real data and on the synthetic data generated by our models, which encourages the use of our models in practice as large-scale prediction tools. Relying on them, we demonstrate that efforts to reduce the number of free riders do indeed improve the availability of files on the network. We observe, however, that this effect saturates after 60% of free riders.

Type
Research Article
Copyright
Copyright © Cambridge University Press 2014 
