A. Assaf, E. Louw, A. Senart, Corentin Follenfant, Raphael Troncy, David Trastour
{"title":"RUBIX: a framework for improving data integration with linked data","authors":"A. Assaf, E. Louw, A. Senart, Corentin Follenfant, Raphael Troncy, David Trastour","doi":"10.1145/2422604.2422607","DOIUrl":"https://doi.org/10.1145/2422604.2422607","url":null,"abstract":"With today's public data sets containing billions of data items, more and more companies are looking to integrate external data with their traditional enterprise data to improve business intelligence analysis. These distributed data sources however exhibit heterogeneous data formats and terminologies and may contain noisy data. In this paper, we present RUBIX, a novel framework that enables business users to semi-automatically perform data integration on potentially noisy tabular data. This framework offers an extension to Google Refine with novel schema matching algorithms leveraging Freebase rich types. First experiments show that using Linked Data to map cell values with instances and column headers with types improves significantly the quality of the matching results and therefore should lead to more informed decisions.","PeriodicalId":328711,"journal":{"name":"International Workshop on Open Data","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121616161","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
S. Auer, Theodore Dalamagas, H. Parkinson, F. Bancilhon, G. Flouris, Dimitris Sacharidis, P. Buneman, D. Kotzinos, Y. Stavrakas, V. Christophides, George Papastefanatos, Kostas Thiveos
{"title":"Diachronic linked data: towards long-term preservation of structured interrelated information","authors":"S. Auer, Theodore Dalamagas, H. Parkinson, F. Bancilhon, G. Flouris, Dimitris Sacharidis, P. Buneman, D. Kotzinos, Y. Stavrakas, V. Christophides, George Papastefanatos, Kostas Thiveos","doi":"10.1145/2422604.2422610","DOIUrl":"https://doi.org/10.1145/2422604.2422610","url":null,"abstract":"The Linked Data Paradigm is a promising technology for publishing, sharing, and connecting data on the Web, which provides new perspectives for data integration and interoperability. However, the proliferation of distributed, interconnected linked data sources on the Web poses significant new challenges for consistently managing the vast number of potentially large datasets and their interdependencies. In this article we focus on the key problem of preserving evolving structured interlinked data. We argue that a number of issues, which hinder applications and users, are related to the temporal aspect that is intrinsic in Linked Data. We present three use cases to motivate our approach, we discuss problems that occur, and propose a direction for a solution.","PeriodicalId":328711,"journal":{"name":"International Workshop on Open Data","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115902520","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Publishing and linking transport data on the web: extended version","authors":"Julien Plu, F. Scharffe","doi":"10.1145/2422604.2422614","DOIUrl":"https://doi.org/10.1145/2422604.2422614","url":null,"abstract":"Without Linked Data, transport data is limited to applications exclusively around transport. In this paper, we present a workflow for publishing and linking transport data on the Web. So we will be able to develop transport applications and to add other features which will be created from other datasets. This will be possible because transport data will be linked to these datasets. We apply this workflow to two datasets: NEPTUNE, a French standard describing a transport line, and Passim, a directory containing relevant information on transport services, in every French city.","PeriodicalId":328711,"journal":{"name":"International Workshop on Open Data","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131164396","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}