Data mining for building knowledge bases: techniques, architectures and applications

Published online by Cambridge University Press: 31 March 2016

Alfred Krzywicki, Wayne Wobcke, Michael Bain, John Calvo Martinez and Paul Compton
Affiliation:
School of Computer Science and Engineering, University of New South Wales, Sydney, NSW 2052, Australia
e-mail: alfredk@cse.unsw.edu.au, wobcke@cse.unsw.edu.au, mike@cse.unsw.edu.au, jcalvo@cse.unsw.edu.au, compton@cse.unsw.edu.au

Abstract

Data mining techniques for extracting knowledge from text have been applied extensively to tasks including question answering, document summarisation, event extraction and trend monitoring. However, current methods have mainly been tested on small-scale customised data sets for specific purposes. The availability of large volumes of data and high-velocity data streams (such as social media feeds) motivates the need to automatically extract knowledge from such data sources and to generalise existing approaches to more practical applications. Recently, several architectures have been proposed for what we call knowledge mining: integrating data mining for knowledge extraction from unstructured text (possibly making use of a knowledge base) and, at the same time, consistently incorporating this new information into the knowledge base. After describing a number of existing knowledge mining systems, we review the state-of-the-art literature on both current text mining methods (emphasising stream mining) and techniques for the construction and maintenance of knowledge bases. In particular, we focus on mining entities and relations from unstructured text data sources, entity disambiguation, entity linking and question answering. We conclude by highlighting general trends in knowledge mining research and identifying problems that require further research to enable more extensive use of knowledge bases.

Type
Articles
Copyright
© Cambridge University Press, 2016 
