
Enhancement of Twitter event detection using news streams

Published online by Cambridge University Press:  24 January 2022

Samaneh Karimi
Affiliation:
School of Electrical and Computer Engineering, College of Engineering, University of Tehran, P.O. Box: 14395-515, Tehran, Iran; Computer Science Department, University of Houston, Houston, TX 77204-5008, USA
Azadeh Shakery*
Affiliation:
School of Electrical and Computer Engineering, College of Engineering, University of Tehran, P.O. Box: 14395-515, Tehran, Iran; Institute for Research in Fundamental Sciences (IPM), P.O. Box 19395-5746, Tehran, Iran
Rakesh M. Verma
Affiliation:
Computer Science Department, University of Houston, Houston, TX 77204-5008, USA
*Corresponding author. E-mail: shakery@ut.ac.ir

Abstract

A new framework for improving event detection is proposed that combines information from news media content and social networks such as Twitter, leveraging the detailed coverage of news media and the timeliness of social media. Specifically, a short text clustering method is first employed to detect events from tweets; the language model representations of the detected events are then expanded using a second set of events obtained from news articles published during the same period. The expanded event representations serve as a new initialization for the clustering method, which is run for another iteration to enhance the event detection results. The proposed framework is evaluated using two datasets: a tweet dataset with event labels and a news dataset containing news articles published during the same time interval as the tweets. Experimental results show that the proposed framework improves the event detection results in terms of F1 measure compared to the results obtained from tweets only.
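The expansion step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how tweet events are matched to news events or how the representations are combined, so everything here is an assumption made for illustration: unigram language models, matching by smallest Jensen-Shannon divergence, linear interpolation, and the hypothetical names `expand`, `lam` and `max_div`. It is not the authors' implementation.

```python
import math
from collections import Counter


def language_model(docs):
    """Maximum-likelihood unigram model over a list of tokenized documents."""
    counts = Counter(tok for doc in docs for tok in doc)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two sparse distributions."""
    div = 0.0
    for w in set(p) | set(q):
        pw, qw = p.get(w, 0.0), q.get(w, 0.0)
        m = 0.5 * (pw + qw)
        if pw > 0:
            div += 0.5 * pw * math.log2(pw / m)
        if qw > 0:
            div += 0.5 * qw * math.log2(qw / m)
    return div


def expand(tweet_lm, news_lms, lam=0.5, max_div=0.8):
    """Interpolate a tweet event model with its closest news event model.

    ASSUMPTION: lam (interpolation weight) and max_div (matching threshold)
    are hypothetical parameters, not values taken from the paper.
    """
    if not news_lms:
        return dict(tweet_lm)
    best = min(news_lms, key=lambda n: js_divergence(tweet_lm, n))
    if js_divergence(tweet_lm, best) > max_div:
        return dict(tweet_lm)  # no sufficiently similar news event
    vocab = set(tweet_lm) | set(best)
    return {w: (1 - lam) * tweet_lm.get(w, 0.0) + lam * best.get(w, 0.0)
            for w in vocab}


# Toy data: one event detected from tweets, two events detected from news.
tweet_event = language_model([["quake", "japan"], ["quake", "tsunami"]])
news_events = [
    language_model([["quake", "japan", "tsunami", "warning", "coast"]]),
    language_model([["election", "vote", "senate"]]),
]
expanded = expand(tweet_event, news_events)
```

In this toy run the tweet event gains the news terms `warning` and `coast` while remaining a probability distribution. In the framework as the abstract describes it, expanded representations like these would then be used to initialize a further iteration of the short text clustering method.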

Type
Article
Copyright
© The Author(s), 2022. Published by Cambridge University Press

