Learning Permutation-Invariant Embeddings for Description Logic Concepts

Advances in intelligent data analysis. International Symposium on Intelligent Data Analysis, pp. 103-115. Pub Date: 2023-03-03. DOI: 10.48550/arXiv.2303.01844

Caglar Demir, A. N. Ngomo


Citations: 1

Abstract

Concept learning deals with learning description logic concepts from background knowledge and input examples. The goal is to learn a concept that covers all positive examples while covering no negative examples. This non-trivial task is often formulated as a search problem within an infinite quasi-ordered concept space. Although state-of-the-art models have been applied successfully to this problem, their large-scale use has been severely hindered by excessive exploration, which incurs impractical runtimes. Here, we propose a remedy for this limitation. We reformulate the learning problem as a multi-label classification problem and propose a neural embedding model (NERO) that learns permutation-invariant embeddings for sets of examples, tailored towards predicting $F_1$ scores of pre-selected description logic concepts. By ranking such concepts in descending order of predicted scores, a possible goal concept can be detected within a few retrieval operations, i.e., without excessive exploration. Importantly, top-ranked concepts can be used to start the search procedure of state-of-the-art symbolic models in multiple advantageous regions of a concept space, rather than starting it at the most general concept $\top$. Our experiments on 5 benchmark datasets with 770 learning problems firmly suggest that NERO significantly (p-value < 1%) outperforms the state-of-the-art models in terms of $F_1$ score, the number of explored concepts, and the total runtime. We provide an open-source implementation of our approach.
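The two ideas the abstract relies on, scoring pre-selected concepts by $F_1$ against the positive and negative examples and embedding a set of examples so that order does not matter, can be illustrated with a toy sketch. This is not the NERO implementation: the individuals, the concept names, the two-dimensional vectors, and all function names below are hypothetical, the true $F_1$ score stands in for the model's predicted score, and sum pooling is just one common way to obtain permutation invariance.

```python
def f1_score(covered, positives, negatives):
    """F1 of a concept whose instance set is `covered`, given the examples."""
    tp = len(covered & positives)   # positives the concept covers
    fp = len(covered & negatives)   # negatives the concept wrongly covers
    fn = len(positives - covered)   # positives the concept misses
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def rank_concepts(concepts, positives, negatives):
    """Return (name, score) pairs in descending order of score."""
    scored = [(name, f1_score(instances, positives, negatives))
              for name, instances in concepts.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def set_embedding(examples, table):
    """Sum-pool per-individual vectors; a sum does not depend on input order."""
    dim = len(next(iter(table.values())))
    pooled = [0.0] * dim
    for individual in examples:
        for i, value in enumerate(table[individual]):
            pooled[i] += value
    return pooled


# Hypothetical toy data (assumption: none of this comes from the paper).
CONCEPTS = {
    "Person": {"a", "b", "c", "d"},
    "Parent": {"a", "b"},
    "Female": {"a", "c", "e"},
}
TABLE = {
    "a": [1.0, 0.0],
    "b": [0.0, 2.0],
    "c": [3.0, 1.0],
    "d": [1.0, 1.0],
    "e": [2.0, 2.0],
}
POSITIVES = {"a", "b"}
NEGATIVES = {"c", "d"}

if __name__ == "__main__":
    # Concepts ordered from best to worst fit for the examples.
    print(rank_concepts(CONCEPTS, POSITIVES, NEGATIVES))
    # Same set, different order, same pooled vector.
    print(set_embedding(["a", "b", "c"], TABLE) == set_embedding(["c", "a", "b"], TABLE))
```

In this toy setting the concept that covers both positives and neither negative ranks first, so a single retrieval already yields a goal concept. In the approach described above, the ranking would instead use scores predicted from the set embeddings.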

