An information content based partitioning method for the anatomical ontology matching task

Proceedings of the 3rd Symposium on Information and Communication Technology. Pub Date: 2012-08-23. DOI: 10.1145/2350716.2350757

Dac-Thanh Tran, Duy-Hoa Ngo, Phan-Thuan Do


Citations: 2

Abstract

Anatomy ontology matching has attracted considerable interest from researchers, especially biologists, medical practitioners, and geneticists. The task is very difficult because anatomy ontologies are huge. Although many ontology matching tools have been proposed, most achieve good results only on small ontologies. A recent survey [22] pointed out that large-scale ontology matching remains a real challenge because it is a time-consuming and memory-intensive process. Drawing on state-of-the-art work, the same authors state that partitioning large-scale ontologies is a promising way to deal with this issue. In this paper, we therefore propose a partitioning approach that breaks the large matching problem into smaller matching subproblems. First, we propose a method to semantically split an anatomy ontology into groups called clusters. It relies on both a specific method for computing semantic similarities between concepts, based on their information content in the anatomy ontology, and a scalable agglomerative hierarchical clustering algorithm. We then propose a filtering method that selects the partitions likely to be similar, in order to reduce computation time. The experimental analysis demonstrates that our approach is capable of solving the scalability problem in ontology matching and encourages future work.
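The abstract does not spell out the similarity measure or the clustering procedure, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes an intrinsic information content (one minus the log of the descendant count, normalized by ontology size), Lin's similarity, and a naive average-linkage agglomerative loop with a stopping threshold. The toy hierarchy, the function names (`ic`, `lin_sim`, `cluster`), and the threshold value are all hypothetical, and the loop is written for readability rather than scalability.

```python
import math
from itertools import combinations

# Hypothetical toy is-a hierarchy (child -> parent); not taken from the paper.
PARENT = {
    "organ": "anatomical_entity",
    "bone": "anatomical_entity",
    "heart": "organ",
    "lung": "organ",
    "femur": "bone",
    "tibia": "bone",
}
CONCEPTS = sorted(set(PARENT) | set(PARENT.values()))


def ancestors(concept):
    """Return the concept followed by all of its ancestors up to the root."""
    chain = [concept]
    while concept in PARENT:
        concept = PARENT[concept]
        chain.append(concept)
    return chain


def ic(concept):
    """Intrinsic information content (assumed measure): specific concepts
    with few descendants score near 1, the root scores 0."""
    n_desc = sum(1 for d in CONCEPTS if d != concept and concept in ancestors(d))
    return 1.0 - math.log(n_desc + 1) / math.log(len(CONCEPTS))


def lin_sim(a, b):
    """Lin similarity: 2 * IC(most informative common ancestor) / (IC(a) + IC(b))."""
    if a == b:
        return 1.0
    anc_b = set(ancestors(b))
    common = [c for c in ancestors(a) if c in anc_b]
    mica = max(common, key=ic)
    denom = ic(a) + ic(b)
    return 2.0 * ic(mica) / denom if denom > 0 else 0.0


def cluster(concepts, threshold):
    """Naive average-linkage agglomerative clustering: repeatedly merge the
    most similar pair of clusters until no pair reaches the threshold."""
    clusters = [[c] for c in concepts]

    def link(x, y):
        return sum(lin_sim(a, b) for a in x for b in y) / (len(x) * len(y))

    while len(clusters) > 1:
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: link(clusters[p[0]], clusters[p[1]]),
        )
        if link(clusters[i], clusters[j]) < threshold:
            break
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return clusters


result = cluster(CONCEPTS, threshold=0.3)
for group in result:
    print(sorted(group))
```

On this toy hierarchy the organ-related and bone-related concepts end up in separate clusters, while the root, whose information content is zero, stays on its own. A matcher would then only need to compare clusters that a filtering step judges likely to be similar, instead of all concept pairs across two ontologies.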


