Linked Open Research Data for Social Science: A Concept Registry for Granular Data Documentation

Pascal Siegers, Antonia C. May, Claudia Saalbach, Jana Nebelin, Dagmar Kern, Andreas Daniel, Ben Zapilko, Fakhri Momeni, Knut Wenzig, Jan Goebel
Proceedings of the Conference on Research Data Infrastructure, published 2023-09-07
DOI: 10.52825/cordi.v1i.300 (https://doi.org/10.52825/cordi.v1i.300)

Abstract

The re-use of research data is an integral part of research practice in the social and economic sciences. To find relevant data, researchers need adequate search facilities. However, a comprehensive, thematic search for research data is hampered by inconsistent or missing semantic indexing of data at the level of social science concepts (e.g., representing the theory language). Either the data is not documented at a granular level, or primary investigators use their own ad hoc terminology to describe their data. Consequently, researchers have to make great efforts to find relevant or comparable data. From the user's perspective, the lack of theory language in data documentation impedes effective data searches and thus significantly limits the research potential of existing data collections. Because there is currently no semantic model for indexing the data content, the specific challenge for improving data search lies in establishing concept-based indexing of research data. Research infrastructures need technology for the harmonized semantic indexing of their research data. The LORD concept registry aims to close this gap by developing a registry of sociological and economic concepts and, following the FAIR principles, making this concept registry generally available to the scientific community. As a first step, we developed a basic data model for the Concept Registry using the Unified Modeling Language (UML). All links between entities are created and managed in the form of RDF triples. An annotation application allows for linking specific questions/variables to concepts. The application also includes the two SKOS-compliant thesauri "Thesaurus Social Sciences" (TheSoz) and "Standard Thesaurus Economics" (STW), but could be extended to other resources such as ELSST.
We illustrate the application of the LORD concept registry with examples from three large-scale survey programmes (German Socio-Economic Panel, German General Social Survey, National Academics Panel Study). The initial focus is on variables and questions with overlapping content in the three survey programmes, as they form a sound basis for cross-linking with concepts.
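
The idea of linking questions/variables to concepts through RDF triples can be illustrated with a minimal sketch. It is not the LORD data model: the `https://example.org/lord/` namespace, the `measures` predicate, the "life satisfaction" concept, and the variable identifiers are all hypothetical and chosen only for illustration. Only the RDF and SKOS namespace URIs are standard. Triples are kept as plain Python tuples so the example needs no external library.

```python
# Standard vocabulary URIs (RDF and SKOS).
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Hypothetical namespace and predicate, assumed for this sketch only.
EX = "https://example.org/lord/"
MEASURES = EX + "measures"

# Hypothetical concept shared by variables from three survey programmes.
CONCEPT = EX + "concept/life-satisfaction"

# Each triple is (subject, predicate, object).
TRIPLES = {
    (CONCEPT, RDF_TYPE, SKOS + "Concept"),
    (CONCEPT, SKOS + "prefLabel", "life satisfaction"),
    # Invented variable identifiers, one per survey programme.
    (EX + "variable/soep/var_a", MEASURES, CONCEPT),
    (EX + "variable/allbus/var_b", MEASURES, CONCEPT),
    (EX + "variable/nacaps/var_c", MEASURES, CONCEPT),
}


def variables_for_concept(triples, concept_uri):
    """Return all variables annotated with the given concept, sorted by URI."""
    return sorted(
        subject
        for subject, predicate, obj in triples
        if predicate == MEASURES and obj == concept_uri
    )


def label_of(triples, concept_uri):
    """Return the SKOS preferred label of a concept, or None if it has none."""
    for subject, predicate, obj in triples:
        if subject == concept_uri and predicate == SKOS + "prefLabel":
            return obj
    return None


if __name__ == "__main__":
    print(label_of(TRIPLES, CONCEPT))
    for variable in variables_for_concept(TRIPLES, CONCEPT):
        print(variable)
```

Because every annotation points to the same concept URI, a search for the concept returns comparable variables across survey programmes regardless of the terminology each programme uses in its own documentation.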