Interpretable ontology extension in chemistry

IF 2.9 3区计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Semantic Web Pub Date : 2023-05-18 DOI:10.3233/sw-233183

Martin Glauer, A. Memariani, F. Neuhaus, T. Mossakowski, Janna Hastings

{"title":"Interpretable ontology extension in chemistry","authors":"Martin Glauer, A. Memariani, F. Neuhaus, T. Mossakowski, Janna Hastings","doi":"10.3233/sw-233183","DOIUrl":null,"url":null,"abstract":"Reference ontologies provide a shared vocabulary and knowledge resource for their domain. Manual construction and annotation enables them to maintain high quality, allowing them to be widely accepted across their community. However, the manual ontology development process does not scale for large domains. We present a new methodology for automatic ontology extension for domains in which the ontology classes have associated graph-structured annotations, and apply it to the ChEBI ontology, a prominent reference ontology for life sciences chemistry. We train Transformer-based deep learning models on the leaf node structures from the ChEBI ontology and the classes to which they belong. The models are then able to automatically classify previously unseen chemical structures, resulting in automated ontology extension. The proposed models achieved an overall F1 scores of 0.80 and above, improvements of at least 6 percentage points over our previous results on the same dataset. In addition, the models are interpretable: we illustrate that visualizing the model’s attention weights can help to explain the results by providing insight into how the model made its decisions. We also analyse the performance for molecules that have not been part of the ontology and evaluate the logical correctness of the resulting extension.","PeriodicalId":48694,"journal":{"name":"Semantic Web","volume":"4 1","pages":""},"PeriodicalIF":2.9000,"publicationDate":"2023-05-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Semantic Web","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.3233/sw-233183","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 6

Abstract

Reference ontologies provide a shared vocabulary and knowledge resource for their domain. Manual construction and annotation enables them to maintain high quality, allowing them to be widely accepted across their community. However, the manual ontology development process does not scale for large domains. We present a new methodology for automatic ontology extension for domains in which the ontology classes have associated graph-structured annotations, and apply it to the ChEBI ontology, a prominent reference ontology for life sciences chemistry. We train Transformer-based deep learning models on the leaf node structures from the ChEBI ontology and the classes to which they belong. The models are then able to automatically classify previously unseen chemical structures, resulting in automated ontology extension. The proposed models achieved an overall F1 scores of 0.80 and above, improvements of at least 6 percentage points over our previous results on the same dataset. In addition, the models are interpretable: we illustrate that visualizing the model’s attention weights can help to explain the results by providing insight into how the model made its decisions. We also analyse the performance for molecules that have not been part of the ontology and evaluate the logical correctness of the resulting extension.

查看原文本刊更多论文

化学中的可解释本体扩展

参考本体为其领域提供共享词汇表和知识资源。手工构建和注释使它们能够保持高质量，从而使它们在社区中被广泛接受。然而，手工本体开发过程不适合大型领域。我们提出了一种新的本体自动扩展方法，用于本体类具有关联图结构注释的领域，并将其应用于生命科学化学领域的重要参考本体ChEBI本体。我们在ChEBI本体及其所属类的叶节点结构上训练基于transformer的深度学习模型。然后，模型能够自动对以前未见过的化学结构进行分类，从而实现自动本体扩展。所提出的模型实现了0.80及以上的总体F1分数，比我们之前在相同数据集上的结果至少提高了6个百分点。此外，模型是可解释的:我们说明了可视化模型的注意力权重可以通过洞察模型如何做出决策来帮助解释结果。我们还分析了不属于本体的分子的性能，并评估了结果扩展的逻辑正确性。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Semantic Web COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCEC-COMPUTER SCIENCE, INFORMATION SYSTEMS

CiteScore

8.30

自引率

6.70%

发文量

期刊介绍： The journal Semantic Web – Interoperability, Usability, Applicability brings together researchers from various fields which share the vision and need for more effective and meaningful ways to share information across agents and services on the future internet and elsewhere. As such, Semantic Web technologies shall support the seamless integration of data, on-the-fly composition and interoperation of Web services, as well as more intuitive search engines. The semantics – or meaning – of information, however, cannot be defined without a context, which makes personalization, trust, and provenance core topics for Semantic Web research. New retrieval paradigms, user interfaces, and visualization techniques have to unleash the power of the Semantic Web and at the same time hide its complexity from the user. Based on this vision, the journal welcomes contributions ranging from theoretical and foundational research over methods and tools to descriptions of concrete ontologies and applications in all areas. We especially welcome papers which add a social, spatial, and temporal dimension to Semantic Web research, as well as application-oriented papers making use of formal semantics.