Ontology-based semantic data interestingness using BERT models

IF 3.2 · Zone 4, Computer Science · JCR Q2, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Abhilash C. Basavaraju, K. Mahesh, Nihar Sanda
DOI: 10.1080/09540091.2023.2190499 (https://doi.org/10.1080/09540091.2023.2190499)
Journal: Connection Science
Published: 2023-04-11
Publication type: Journal Article
Citations: 1

Abstract

The COVID-19 pandemic has generated massive amounts of healthcare data in recent years, encouraging researchers and scientists to uncover the underlying facts. Mining interesting patterns in large COVID-19 corpora is important and useful for decision makers. This paper presents a novel approach for uncovering interesting insights in large datasets using ontologies and BERT models. The research proposes a framework that extracts semantically rich facts from data by using ontologies to incorporate domain knowledge into the data mining process. An improved Apriori algorithm is employed to mine semantic association rules, and BERT models are used to evaluate the interestingness of those rules in terms of semantic richness. The results of the proposed framework are compared with state-of-the-art methods and evaluated using a combination of domain expert evaluation and statistical significance testing. The study offers a promising solution for finding meaningful relationships and facts in large datasets, particularly in the healthcare sector. © 2023 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group.
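For readers unfamiliar with association rule mining, the sketch below shows the classic Apriori procedure that the abstract builds on: frequent itemsets are grown level by level, and rules are kept when their confidence passes a threshold. It is a minimal illustration only. It is the textbook algorithm, not the authors' improved variant, and it includes neither the ontology step nor the BERT-based interestingness scoring. The function names, thresholds, and toy transactions are all hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    tx = [frozenset(t) for t in transactions]
    n = len(tx)
    # Level-1 candidates: every distinct single item.
    candidates = {frozenset([item]) for t in tx for item in t}
    found = {}
    k = 1
    while candidates:
        # Keep the candidates whose support reaches the threshold.
        level = {}
        for c in candidates:
            support = sum(1 for t in tx if c <= t) / n
            if support >= min_support:
                level[c] = support
        found.update(level)
        # Join frequent k-itemsets into (k+1)-candidates, then prune any
        # candidate with an infrequent k-subset (the Apriori property).
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return found


def association_rules(found, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, support in found.items():
        if len(itemset) < 2:
            continue
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                # confidence(A -> B) = support(A ∪ B) / support(A)
                confidence = support / found[antecedent]
                if confidence >= min_confidence:
                    rules.append(
                        (antecedent, itemset - antecedent, support, confidence)
                    )
    return rules


# Hypothetical toy data, invented for illustration only.
transactions = [
    {"fever", "cough", "covid_positive"},
    {"fever", "cough"},
    {"fever", "covid_positive"},
    {"cough", "covid_positive"},
    {"fever", "cough", "covid_positive"},
]
freq = frequent_itemsets(transactions, min_support=0.6)
rules = association_rules(freq, min_confidence=0.7)
for antecedent, consequent, support, confidence in rules:
    print(sorted(antecedent), "->", sorted(consequent),
          f"support={support:.2f} confidence={confidence:.2f}")
```

In a pipeline like the one the abstract describes, rules of this kind would presumably then be ranked by a semantic interestingness score rather than by support and confidence alone; that step is not shown here.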
Source journal
Connection Science
Category: Engineering and Technology, Computer Science: Theory and Methods
CiteScore: 6.50
Self-citation rate: 39.60%
Articles published: 94
Review time: 3 months
Journal description: Connection Science is an interdisciplinary journal dedicated to exploring the convergence of the analytic and synthetic sciences, including neuroscience, computational modelling, artificial intelligence, machine learning, deep learning, Database, Big Data, quantum computing, Blockchain, Zero-Knowledge, Internet of Things, Cybersecurity, and parallel and distributed computing. A strong focus is on the articles arising from connectionist, probabilistic, dynamical, or evolutionary approaches in aspects of Computer Science, applied applications, and systems-level computational subjects that seek to understand models in science and engineering.