Swati Gehlot, Karsten Peters-von Gehlen, Andrea Lammert, Hannes Thiemann
Data Management for PalMod-II – A FAIR-Based Strategy for Data Handling in Large Climate Modeling Projects
Data Science Journal, 2023. DOI: 10.5334/dsj-2023-034 (https://doi.org/10.5334/dsj-2023-034)
Abstract
PalMod-II was a multi-institutional research project in Germany focused on enabling and performing global numerical climate simulations with state-of-the-art coupled Earth System Models spanning a full glacial cycle, from 130 000 years in the past to the present and beyond. The main project goal was to produce the dataset resulting from these simulations and to make it available for reuse by the climate science community in line with the FAIR data principles. In this paper, we present the research data management (RDM) approach developed and employed in PalMod-II to progress towards that project goal. The RDM approach was implemented by RDM professionals funded specifically by PalMod-II, which made it possible to provide RDM services tailored to the project's needs. The compilation and maintenance of a project-wide data management plan (DMP) proved essential for keeping the project on track and served as the central focal point for all data-related aspects. These include the specification of scientists responsible for data, the allocation of storage and computational resources on a high-performance computing system, the documentation of simulation output requirements, and the definition of data standardisation and publication workflows in line with the FAIR data principles. Since the RDM approach executed in PalMod-II was the first of its kind for all project partners, extensive communication on an equal footing with the scientists was required to create trust and a collaborative atmosphere within the project. Finally, the RDM approach implemented in PalMod-II facilitated the publication of a flagship dataset for global reuse and will also be implemented in the follow-up project, PalMod-III.
About the journal:
The Data Science Journal is a peer-reviewed electronic journal publishing papers on the management of data and databases in science and technology. Details can be found in the prospectus. The scope of the journal includes descriptions of data systems, their publication on the internet, applications, and legal issues. All of the sciences are covered, including the physical sciences, engineering, the geosciences, and the biosciences, along with agriculture and the medical sciences. The journal publishes papers about data and data systems; it does not publish data or data compilations. However, it may publish papers about methods of data compilation or analysis.