Research Data Management Tools and Workflows: Experimental Work at the University of Porto

IASSIST Quarterly. Pub Date: 2018-07-18. DOI: 10.29173/IQ925
Cristina Ribeiro, J. Silva, João Aguiar Castro, R. C. Amorim, J. C. Lopes, G. David
Citations: 6

Abstract

Research datasets include all kinds of objects, from web pages to sensor data, and originate in every domain. Concerns with data generated in large projects and well-funded research areas are centered on their exploration and analysis. For data in the long tail, the main issues are still how to get data visible, satisfactorily described, preserved, and searchable.

Our work aims to promote data publication in research institutions, considering that researchers are the core stakeholders and need straightforward workflows, and that multi-disciplinary tools can be designed and adapted to specific areas with a reasonable effort. For small groups with interesting datasets but not much time or funding for data curation, we have to focus on engaging researchers in the process of preparing data for publication, while providing them with measurable outputs. In larger groups, solutions have to be customized to satisfy the requirements of more specific research contexts.

We describe our experience at the University of Porto in two lines of enquiry. For the work with long-tail groups we propose general-purpose tools for data description and the interface to multi-disciplinary data repositories. For areas with larger projects and more specific requirements, namely wind infrastructure, sensor data from concrete structures and marine data, we define specialized workflows. In both cases, we present a preliminary evaluation of results and an estimate of the kind of effort required to keep the proposed infrastructures running.

The tools available to researchers can be decisive for their commitment. We focus on data preparation, namely on dataset organization and metadata creation. For groups in the long tail, we propose Dendro, an open-source research data management platform, and explore automatic metadata creation with LabTablet, an electronic laboratory notebook. For groups demanding a domain-specific approach, our analysis has resulted in the development of models and applications to organize the data and support some of their use cases. Overall, we have adopted ontologies for metadata modeling, keeping in sight metadata dissemination as Linked Open Data.