Semantic content management and integration with JCR/CMIS compliant content repositories

International Conference on Semantic Systems · Pub Date: 2012-09-05 · DOI: 10.1145/2362499.2362527

Suat Gönül, Ali Anil Sinaci


Citations: 3

Abstract

Existing content management systems (CMSes) usually do not offer flexible, customizable means to create semantic, domain-specific indexing and search mechanisms. As a result, they either provide no semantic retrieval, search, or browsing functionality on the managed content, or the semantic search they do provide is limited because it depends on manual annotation of content by users. In this study we describe a semantic content management flow that extracts implicit knowledge from both the structure of CMSes and the actual content within them. Gathering additional semantic knowledge and providing semantic operations on the content is challenging and requires adopting several recent advances in information extraction (IE), information retrieval (IR), and the Semantic Web. We propose a new approach that automatically annotates content managed in CMSes with information retrieved from the Linked Open Data (LOD) cloud and provides several semantic operations on the content for storage and search. We use a simple RDF path language to create custom indexes and retrieve semantic knowledge from the LOD cloud suited to specific use cases. All additional knowledge is materialized along with the actual document content in dedicated indexes. This semantic indexing infrastructure supports semantically meaningful search facilities on top of it. We realize our approach within the Apache Stanbol project, a subproject developed in the scope of the IKS project, focusing on document storage and retrieval. We evaluate our approach in the healthcare domain, using different domain ontologies (SNOMED/CT, ART, RXNORM) in addition to DBpedia as parts of the LOD cloud to annotate documents and content obtained from different health portals.
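The path-based indexing idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and does not use the Apache Stanbol API or its actual path language. The toy triples, the `ex:` identifiers, the field names, and the functions `follow_path`, `index_document`, and `search` are all hypothetical, and entity detection in the document text is assumed to have happened already.

```python
from collections import defaultdict

# Toy triples standing in for data retrieved from the LOD cloud.
# All identifiers here are made up for illustration.
TRIPLES = [
    ("ex:Aspirin", "rdfs:label", "Aspirin"),
    ("ex:Aspirin", "ex:treats", "ex:Headache"),
    ("ex:Aspirin", "ex:treats", "ex:Fever"),
    ("ex:Headache", "rdfs:label", "Headache"),
    ("ex:Fever", "rdfs:label", "Fever"),
]

# Hypothetical custom index definition: field name -> predicate path.
INDEX_FIELDS = {
    "label": ["rdfs:label"],
    "treats_label": ["ex:treats", "rdfs:label"],
}


def build_graph(triples):
    """Group triples as subject -> predicate -> list of objects."""
    graph = defaultdict(lambda: defaultdict(list))
    for subject, predicate, obj in triples:
        graph[subject][predicate].append(obj)
    return graph


def follow_path(graph, start, path):
    """Return the values reached from `start` by following each predicate in order."""
    frontier = [start]
    for predicate in path:
        next_frontier = []
        for node in frontier:
            next_frontier.extend(graph.get(node, {}).get(predicate, []))
        frontier = next_frontier
    return frontier


def index_document(graph, text, entities, fields):
    """Materialize path results next to the document content, one list per field."""
    doc = {"content": text}
    for name, path in fields.items():
        values = []
        for entity in entities:
            for value in follow_path(graph, entity, path):
                if value not in values:  # keep order, drop duplicates
                    values.append(value)
        doc[name] = values
    return doc


def search(index, field, value):
    """Return documents whose materialized `field` contains `value`."""
    return [doc for doc in index if value in doc.get(field, [])]


graph = build_graph(TRIPLES)
# Assumption: an upstream annotation step linked this text to ex:Aspirin.
doc = index_document(graph, "Patient was given aspirin.", ["ex:Aspirin"], INDEX_FIELDS)
index = [doc]
hits = search(index, "treats_label", "Fever")
```

In this sketch, a search for "Fever" finds the document even though the word never appears in its text, because the knowledge reached through the `treats_label` path was stored alongside the content at indexing time.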

