Cluster-based mention typing for named entity disambiguation

Arda Çelebi; Arzucan Özgür

doi:10.1017/S1351324920000443

Published online by Cambridge University Press: 20 August 2020

Arda Çelebi*, Department of Computer Engineering, Boğaziçi University, Bebek, 34342 İstanbul, Turkey
Arzucan Özgür*, Department of Computer Engineering, Boğaziçi University, Bebek, 34342 İstanbul, Turkey
* Corresponding authors. E-mails: arzucan.ozgur@boun.edu.tr; ardax.celebi@gmail.com

Abstract

An entity mention in text, such as “Washington,” may correspond to many different named entities, such as the city “Washington D.C.” or the newspaper “Washington Post.” The goal of named entity disambiguation (NED) is to identify the mentioned named entity correctly among all possible candidates. If the type (e.g., location or person) of a mentioned entity can be correctly predicted from the context, the chance of selecting the right candidate may increase, because unlikely candidates can be assigned low probability. This paper proposes cluster-based mention typing for NED. The aim of mention typing is to predict the type of a given mention based on its context. Generally, manually curated type taxonomies such as Wikipedia categories are used. We introduce cluster-based mention typing, where named entities are clustered based on their contextual similarities and the cluster ids are assigned as types. The hyperlinked mentions and their contexts in Wikipedia are used to obtain these cluster-based types. Mention typing models are then trained on these mentions, which have been labeled with their cluster-based types through distant supervision. At the NED phase, the cluster-based types of a given mention are first predicted, and these types are then used as features in a ranking model to select the best entity among the candidates. We represent entities at multiple contextual levels and obtain a different clustering (and thus a different typing model) for each level. As each clustering partitions the entity space differently, mention typing based on each clustering discriminates the mention differently. When predictions from all typing models are used together, our system achieves results that are better than or comparable to the state of the art, based on randomization tests, on four de facto test sets.
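The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 2-D entity vectors, the priors, the small k-means routine, the distance-based `type_probs` function (a stand-in for a trained typing model), and the product-of-scores ranking (a stand-in for the ranking model) are all assumptions made for illustration.

```python
import math

# Hypothetical toy "contextual" entity vectors (assumed, not from the paper).
ENTITY_VECS = {
    "Washington_D.C.": (0.9, 0.1),
    "Seattle": (0.8, 0.2),
    "Washington_Post": (0.1, 0.9),
    "New_York_Times": (0.2, 0.8),
}

def dist2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(len(vectors[0])))

def nearest(vec, centroids):
    """Index of the centroid closest to vec."""
    return min(range(len(centroids)), key=lambda i: dist2(vec, centroids[i]))

def kmeans(vecs, init_centroids, iters=10):
    """Plain k-means with fixed initial centroids, so the result is deterministic."""
    centroids = list(init_centroids)
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for v in vecs:
            groups[nearest(v, centroids)].append(v)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean(g) if g else c for g, c in zip(groups, centroids)]
    return centroids

def type_probs(context_vec, centroids):
    """Stand-in typing model: softmax over negative squared distances."""
    scores = [math.exp(-dist2(context_vec, c)) for c in centroids]
    total = sum(scores)
    return [s / total for s in scores]

def rank(candidates, priors, context_vec, centroids, entity_type):
    """Stand-in ranker: prior times predicted probability of the candidate's type."""
    probs = type_probs(context_vec, centroids)
    return max(candidates, key=lambda c: priors[c] * probs[entity_type[c]])

# 1) Cluster entities; 2) use cluster ids as entity types.
centroids = kmeans(
    list(ENTITY_VECS.values()),
    [ENTITY_VECS["Washington_D.C."], ENTITY_VECS["Washington_Post"]],
)
entity_type = {name: nearest(vec, centroids) for name, vec in ENTITY_VECS.items()}

# 3) Disambiguate the mention "Washington" given a hypothetical context vector.
candidates = ["Washington_D.C.", "Washington_Post"]
priors = {"Washington_D.C.": 0.6, "Washington_Post": 0.4}  # assumed priors
news_context = (0.2, 0.9)  # assumed vector for a newspaper-like context
print(rank(candidates, priors, news_context, centroids, entity_type))
```

In this toy setup the predicted cluster-based type outweighs the higher prior of the city, which mirrors the abstract's point that typing can assign low probability to unlikely candidates. The paper combines several clusterings, one per contextual level; the sketch uses only one.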

Keywords

Named entity disambiguation; Clustering; Mention typing; Information extraction

Information

Type: Article
Information: Natural Language Engineering, Volume 28, Issue 1, January 2022, pp. 1–37

DOI: https://doi.org/10.1017/S1351324920000443
Copyright: © The Author(s), 2020. Published by Cambridge University Press
