Mining Biomedical Ontologies and Data Using RDF Hypergraphs

Haishan Liu, D. Dou, R. Jin, P. LePendu, N. Shah
Published in: 2013 12th International Conference on Machine Learning and Applications (2013-12-04)
DOI: 10.1109/ICMLA.2013.31 (https://doi.org/10.1109/ICMLA.2013.31)
Citations: 26

Abstract

As researchers analyze huge amounts of data annotated by large biomedical ontologies, one of the major challenges for data mining and machine learning is to leverage both ontologies and data together in a systematic and scalable way. In this paper, we address two related problems in mining biomedical ontologies and data: (i) how to discover semantic associations with the help of formal ontologies, and (ii) how to identify potential errors in the ontologies with the help of data. By representing both ontologies and data as RDF hypergraphs, and subsequently transforming the hypergraphs into corresponding bipartite forms, we provide a generalized data mining method that scales beyond what existing ontology-based approaches can provide. By performing evaluations on a real-world electronic health dataset and NCBO ontologies, we show that the proposed method is indeed capable of capturing semantic associations while seamlessly incorporating domain knowledge in ontologies. We also show that our data mining methods can discover and suggest corrections for misinformation in biomedical ontologies.
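The abstract does not spell out how the hypergraph is built or transformed, so the following is only a minimal sketch under stated assumptions. It treats each RDF triple as a hyperedge over its subject, predicate, and object, and derives a bipartite form in which one node set holds statements and the other holds terms. This incidence-style construction is a common way to draw a hypergraph as a bipartite graph and may differ from the authors' method. The function names and the example triples are hypothetical.

```python
from collections import defaultdict


def triples_to_bipartite(triples):
    """Turn (subject, predicate, object) triples into a bipartite incidence graph.

    Assumption: each triple is one hyperedge; it becomes a statement node
    linked to its three term nodes, with the role kept as an edge label.
    """
    statement_nodes = []
    term_nodes = set()
    edges = []  # (statement_id, term, role)
    for i, (s, p, o) in enumerate(triples):
        statement_id = f"st{i}"
        statement_nodes.append(statement_id)
        for role, term in (("subject", s), ("predicate", p), ("object", o)):
            term_nodes.add(term)
            edges.append((statement_id, term, role))
    return statement_nodes, term_nodes, edges


def term_adjacency(edges):
    """Map each term to the set of statement nodes it participates in."""
    adjacency = defaultdict(set)
    for statement_id, term, _role in edges:
        adjacency[term].add(statement_id)
    return adjacency


# Hypothetical data: two annotation triples and one ontology triple.
triples = [
    ("patient1", "hasDiagnosis", "Hypertension"),
    ("patient1", "takesDrug", "Lisinopril"),
    ("Hypertension", "subClassOf", "CardiovascularDisease"),
]
statements, terms, edges = triples_to_bipartite(triples)
adjacency = term_adjacency(edges)
```

In this sketch, data triples and ontology triples share the same term nodes (here `Hypertension`), which is one way a single graph can carry both instance data and domain knowledge for a downstream mining step.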