THALIA: Test Harness for the Assessment of Legacy Information Integration Approaches

J. Hammer, M. Stonebraker, Oguzhan Topsakal
Published in: 21st International Conference on Data Engineering (ICDE'05), 2005-04-05
DOI: 10.1109/ICDE.2005.140 (https://doi.org/10.1109/ICDE.2005.140)
Citations: 58

Abstract

We introduce THALIA (Test Harness for the Assessment of Legacy Information Integration Approaches), a new, publicly available testbed and benchmark for testing and evaluating integration technologies. THALIA provides researchers with a collection of 40 downloadable data sources representing university course catalogs from computer science departments worldwide. In addition, THALIA currently provides a set of twelve challenge queries as well as a scoring function for ranking the performance of an integration system. A second contribution is a systematic classification of the types of syntactic and semantic heterogeneities, which leads directly to the twelve challenge queries. We chose course information as our domain of discourse because it is well known and easy to understand. Furthermore, the abundance of publicly available data sources allowed us to develop a testbed exhibiting all of the syntactic and semantic heterogeneities that we have identified.