SMAT: An attention-based deep learning solution to the automation of schema matching.

Jing Zhang, Bonggun Shin, Jinho D Choi, Joyce C Ho
Journal: Advances in databases and information systems. ADBIS, vol. 12843, pp. 260-274
Publication type: Journal Article
Published: 2021-08-01 (Epub 2021-08-16)
DOI: https://doi.org/10.1007/978-3-030-82472-3_19
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8487677/pdf/nihms-1722415.pdf
Citations: 9

Abstract

Schema matching aims to identify the correspondences among attributes of database schemas. It is frequently considered the most challenging and decisive stage in many contemporary web semantics and database systems. Low-quality algorithmic matchers fail to provide improvement, while manual annotation consumes extensive human effort. Further complications arise from data privacy in certain domains such as healthcare, where only schema-level matching should be used to prevent data leakage. To address this problem, we propose SMAT, a new deep learning model based on state-of-the-art natural language processing techniques that obtains semantic mappings between source and target schemas using only the attribute name and description. SMAT avoids directly encoding domain knowledge about the source and target systems, which allows it to be deployed more easily across different sites. We also introduce a new benchmark dataset, OMAP, based on real-world schema-level mappings from the healthcare domain. Our extensive evaluation on various benchmark datasets demonstrates the potential of SMAT to help automate schema-level matching tasks.

