Improving Findability of Open Data Beyond Data Catalogs

T. Skopal, Jakub Klímek, M. Nečaský
Proceedings of the 21st International Conference on Information Integration and Web-based Applications & Services, published 2019-12-02
DOI: 10.1145/3366030.3366095 (https://doi.org/10.1145/3366030.3366095)
Citations: 3

Abstract

A vast number of datasets are available as Open Data on the Web. However, consumers find it challenging to locate datasets relevant to their goals, because the metadata available in catalogs is not descriptive enough. Datasets nevertheless exist in various types of contexts that the metadata does not express, such as information about the data publisher or the legislation related to dataset publication. In this paper we describe an idea of a data model that enables consumers to better understand the data. We propose to define a formal model for representing datasets and their contexts, and to apply existing similarity techniques, adjusting them to fit each identified dataset context type and combining them to measure the similarity of datasets in new ways, improving their findability.
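
The idea of combining per-context similarities can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `Dataset` fields, the choice of Jaccard overlap and exact publisher match as the per-context measures, the weights, and the sample records are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    """Hypothetical dataset record with three context types (assumed, not from the paper)."""
    title: str
    publisher: str
    keywords: frozenset = frozenset()
    legislation: frozenset = frozenset()


def jaccard(a, b):
    """Jaccard overlap of two sets; defined here as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# One similarity function per context type (illustrative choices only).
CONTEXT_SIMILARITY = {
    "keywords": lambda x, y: jaccard(x.keywords, y.keywords),
    "publisher": lambda x, y: 1.0 if x.publisher == y.publisher else 0.0,
    "legislation": lambda x, y: jaccard(x.legislation, y.legislation),
}

# Assumed weights; a real system would have to tune or learn these.
WEIGHTS = {"keywords": 0.5, "publisher": 0.25, "legislation": 0.25}


def combined_similarity(x, y, weights=WEIGHTS):
    """Weighted average of the per-context similarities, in the range [0, 1]."""
    total = sum(weights.values())
    score = sum(w * CONTEXT_SIMILARITY[name](x, y) for name, w in weights.items())
    return score / total


# Made-up sample records.
a = Dataset("Budget 2019", "Ministry A", frozenset({"budget", "finance"}), frozenset({"law-A"}))
b = Dataset("Budget 2020", "Ministry A", frozenset({"budget", "spending"}), frozenset({"law-A"}))
c = Dataset("Bus stops", "City B", frozenset({"transport"}), frozenset({"law-B"}))

print(combined_similarity(a, b))  # shares publisher and legislation, partial keyword overlap
print(combined_similarity(a, c))  # no shared context
```

In this sketch, two datasets with only partly overlapping keywords still score as similar because they share a publisher and a legal basis, which is the kind of signal that catalog metadata alone would miss.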