
Matching biodiversity and ecology ontologies: challenges and evaluation results

Published online by Cambridge University Press: 09 March 2020

Naouel Karam
Affiliation:
Fraunhofer FOKUS, Berlin, Germany e-mail: naouel.karam@fokus.fraunhofer.de
Abderrahmane Khiat
Affiliation:
Fraunhofer IAIS, Sankt Augustin, Germany e-mail: abderrahmane.khiat@iais.fraunhofer.de
Alsayed Algergawy
Affiliation:
Friedrich-Schiller University of Jena, Germany e-mail: alsayed.algergawy@uni-jena.de
Melanie Sattler
Affiliation:
MARUM, University of Bremen, Germany e-mail: mbuss@marum.de
Claus Weiland
Affiliation:
Senckenberg Biodiversity and Climate Research Center, Frankfurt, Germany e-mail: claus.weiland@senckenberg.de
Marco Schmidt
Affiliation:
Senckenberg Biodiversity and Climate Research Center, Frankfurt, Germany; Palmengarten der Stadt Frankfurt, Germany e-mail: marco.schmidt@stadt-frankfurt.de

Abstract

Biodiversity research studies the variability and diversity of organisms, including variability within and between species, with a particular focus on the functional diversity of traits and their relationship to the environment. Managing biodiversity data means dealing with its heterogeneous nature using semantics and tailored ontologies. These ontologies are themselves conceived differently, and combining them in semantically enabled applications requires an effective alignment between their concepts. This paper describes the matching of biodiversity- and ecology-related ontologies. We illustrate the diverse challenges that this kind of ontology poses for ontology matching in general. We introduce real use cases requiring pairwise alignments between environment and trait ontologies. We describe our experience creating a new track at the Ontology Alignment Evaluation Initiative designed for this specific domain and report on the results obtained by state-of-the-art participating systems. The biodiversity and ecology use case turns out to be a strong one for ontology matching, introducing interesting new challenges. Although most of the matching systems perform relatively well in the proposed matching tasks, there is still room for improvement. We highlight possible directions for such improvement and describe our plans for developing the track further.

Type
Research Article
Copyright
© Cambridge University Press 2020

