Enabling RETE Algorithm for RDFS Reasoning on Apache Spark

H. Ju, Sangyoon Oh
Published in: 2018 IEEE 8th International Symposium on Cloud and Service Computing (SC2), 2018-11-01
DOI: 10.1109/SC2.2018.00028 (https://doi.org/10.1109/SC2.2018.00028)
Citations: 2

Abstract

Semantic web technology helps various kinds of software, including intelligent personal assistants, acquire new data or understand knowledge through the relations between data. However, current semantic web schemes such as RDFS reasoning are hard to apply to real-world data because of the huge volume of data that must be processed. In this study, we design and enable RDFS reasoning with the RETE algorithm on Apache Spark in a parallel fashion. In addition, we apply rule sequence ordering optimizations from existing studies to improve processing performance. Empirical experiments verify that the implementation of our design shows strong scalability. However, the current naive approach of deduplicating data with the distinct function provided by Spark should be improved to yield better processing performance. In future work, we will look for a new deduplication method.
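To make the ideas in the abstract concrete, the following is a minimal single-machine sketch in plain Python, not the authors' Spark implementation. It assumes two standard RDFS entailment rules (rdfs11, subclass transitivity, and rdfs9, type inheritance), a RETE-style split into alpha nodes (pattern filters) and beta nodes (joins), and one illustrative rule order (rdfs11 before rdfs9). The names alpha, beta_join, and rdfs_closure are hypothetical. A Python set stands in for the deduplication step that the abstract attributes to Spark's distinct function.

```python
from collections import defaultdict

TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def alpha(triples, predicate):
    """Alpha node: keep (subject, object) pairs whose predicate matches."""
    return {(s, o) for (s, p, o) in triples if p == predicate}


def beta_join(left, right):
    """Beta node: join left (a, b) with right (b, c) on b, yielding (a, c)."""
    index = defaultdict(set)
    for b, c in right:
        index[b].add(c)
    return {(a, c) for a, b in left for c in index.get(b, ())}


def rdfs_closure(triples):
    """Apply rdfs11 then rdfs9 repeatedly until no new triple is derived."""
    known = set(triples)  # a set removes duplicates (stand-in for distinct)
    while True:
        # rdfs11: (a subClassOf b), (b subClassOf c) => (a subClassOf c)
        sub = alpha(known, SUBCLASS)
        derived = {(a, SUBCLASS, c) for a, c in beta_join(sub, sub)}
        # rdfs9: (x type a), (a subClassOf b) => (x type b)
        # Assumed ordering: rdfs9 sees the subclass triples just derived.
        sub = alpha(known | derived, SUBCLASS)
        typ = alpha(known, TYPE)
        derived |= {(x, TYPE, b) for x, b in beta_join(typ, sub)}
        new = derived - known
        if not new:
            return known
        known |= new


# Hypothetical example data, not taken from the paper.
example = {
    ("ex:rex", TYPE, "ex:Dog"),
    ("ex:Dog", SUBCLASS, "ex:Mammal"),
    ("ex:Mammal", SUBCLASS, "ex:Animal"),
}
closure = rdfs_closure(example)
```

In a distributed setting, the alpha filters and beta joins would presumably map onto filter and join operations over partitioned data, and the set-based deduplication onto a distinct-style shuffle, which is the step the abstract identifies as the performance bottleneck.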