InferPortOIE: A Portuguese Open Information Extraction system with inferences

Published online by Cambridge University Press: 14 December 2018

Cleiton Fernando Lima Sena
Affiliation:
Formalisms and Semantic Applications Research Group (FORMAS), LaSiD/DCC/IME—Federal University of Bahia (UFBA), Av. Adhemar de Barros, s/n, Campus de Ondina, Salvador, Bahia, Brazil
Daniela Barreiro Claro*
Affiliation:
Formalisms and Semantic Applications Research Group (FORMAS), LaSiD/DCC/IME—Federal University of Bahia (UFBA), Av. Adhemar de Barros, s/n, Campus de Ondina, Salvador, Bahia, Brazil
*Corresponding author. Email: dclaro@ufba.br

Abstract

The amount of digital data grows continuously. On the Web in particular, a vast and heterogeneous collection of data is generated daily, and a significant portion of it is available in natural language. Open Information Extraction (Open IE) enables the extraction of facts from large quantities of natural language text. In this work, we propose an Open IE method to extract facts from texts written in Portuguese. We developed two new rules that generalize inference by transitivity and by symmetry. Consequently, this approach increases the number of implicit facts extracted from a sentence. Our novel symmetric inference approach is based on a list of symmetric features. Our results confirm that our method outperforms closely related work in both precision and number of valid extractions. Considering the number of minimal facts, our approach is equivalent to the most relevant methods in the literature.
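
As a rough illustration of what inference by symmetry and by transitivity means for extracted facts, the following minimal Python sketch derives implicit triples from explicit ones. It is not the authors' implementation: the triple representation, the function names, and the two relation lists are assumptions made for this example, and the relation phrases are given in English rather than Portuguese.

```python
from typing import Set, Tuple

# A fact is modeled as an (arg1, relation, arg2) triple. This is an assumed
# representation for illustration, not the paper's data structure.
Fact = Tuple[str, str, str]

# Hypothetical relation lists; the paper's actual list of symmetric
# features is not reproduced here.
SYMMETRIC_RELATIONS = {"is married to", "is a neighbor of"}
TRANSITIVE_RELATIONS = {"is part of", "is located in"}


def infer_symmetric(facts: Set[Fact]) -> Set[Fact]:
    """From (a, r, b) with a symmetric r, derive (b, r, a) if it is new."""
    swapped = {(b, r, a) for (a, r, b) in facts if r in SYMMETRIC_RELATIONS}
    return swapped - facts


def infer_transitive(facts: Set[Fact]) -> Set[Fact]:
    """From (a, r, b) and (b, r, c) with a transitive r, derive (a, r, c).

    Repeats until no new fact appears, so chains longer than two are closed.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for (a, r1, b) in list(known):
            if r1 not in TRANSITIVE_RELATIONS:
                continue
            for (b2, r2, c) in list(known):
                if r2 == r1 and b2 == b and a != c and (a, r1, c) not in known:
                    known.add((a, r1, c))
                    changed = True
    return known - facts


explicit_facts = {
    ("Salvador", "is located in", "Bahia"),
    ("Bahia", "is located in", "Brazil"),
    ("Ana", "is married to", "Pedro"),
}
implicit_facts = infer_symmetric(explicit_facts) | infer_transitive(explicit_facts)
```

Here `implicit_facts` holds the two triples that are not stated explicitly in the input: the reversed marriage fact and the fact that Salvador is located in Brazil.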

Type
Article
Copyright
© Cambridge University Press 2018 
