Why the relational data model matters for climate data management
Author: Ezequiel Cimadevilla
Journal: Computers & Geosciences, volume 201, Article 105931
Publication date: 2025-04-16
Publication type: Journal Article
DOI: 10.1016/j.cageo.2025.105931
URL: https://www.sciencedirect.com/science/article/pii/S0098300425000810
Impact factor: 4.2; JCR: Q1 (Computer Science, Interdisciplinary Applications); Region 2, category: Earth Sciences
Open access: no
Citations: 0
Abstract
Efficient management of climate data banks, in particular those generated by Global or Regional Climate Models, is an important requirement for a precise understanding of current changes in the climate system. Current data management practices in the climate community are based on the analysis of binary files that store multidimensional arrays and require ad hoc software libraries to access the data. Several approaches are being developed to facilitate climate data management and analysis. However, the theoretical foundations that cause the difficulties in manipulating climate data remain unchallenged. The Relational Data Model was proposed as a formal solution for database management based on mathematical logic. It has been widely accepted in industry and has stood the test of time. However, its foundational principles have been overlooked by the climate data management community, mostly due to a lack of emphasis on the relevance of mathematical logic for database management and to confusion between the physical and logical levels of abstraction. As a result, climate data management workflows lack the rigor and formality provided by the Relational Data Model. This work explains the Relational Data Model at the logical level of abstraction, presents the arguments for it, clarifies the misconceptions, and justifies its adoption for climate data management in the context of gridded data generated by climate models.
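As a rough illustration of what the logical level of abstraction can look like for gridded data, the sketch below represents a tiny gridded field as a relation and queries it declaratively. It is not taken from the paper: the table name `tas`, the column names, the grid coordinates, and the temperature values are all hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3

# Hypothetical 2 x 2 x 2 gridded field: one tuple per (time, lat, lon) cell.
# Values are made-up near-surface temperatures in kelvin.
rows = [
    ("2000-01-01", -45.0, 0.0, 280.0),
    ("2000-01-01", -45.0, 180.0, 282.0),
    ("2000-01-01", 45.0, 0.0, 270.0),
    ("2000-01-01", 45.0, 180.0, 272.0),
    ("2000-01-02", -45.0, 0.0, 281.0),
    ("2000-01-02", -45.0, 180.0, 283.0),
    ("2000-01-02", 45.0, 0.0, 271.0),
    ("2000-01-02", 45.0, 180.0, 273.0),
]

conn = sqlite3.connect(":memory:")

# The coordinates form the key of the relation; the physical layout
# (file format, chunking, ordering) is left entirely to the database.
conn.execute(
    """
    CREATE TABLE tas (
        time  TEXT NOT NULL,
        lat   REAL NOT NULL,
        lon   REAL NOT NULL,
        tas_k REAL NOT NULL,
        PRIMARY KEY (time, lat, lon)
    )
    """
)
conn.executemany("INSERT INTO tas VALUES (?, ?, ?, ?)", rows)


def mean_by_time(connection):
    """Return the unweighted (not area-weighted) mean per time step."""
    query = "SELECT time, AVG(tas_k) FROM tas GROUP BY time ORDER BY time"
    return connection.execute(query).fetchall()


daily_means = mean_by_time(conn)
for time_step, mean_k in daily_means:
    print(time_step, mean_k)
```

The query states what is wanted (a mean per time step) without referring to array indices or file layout, and the primary key makes the database reject a second value for the same grid cell. A real analysis would normally weight by grid-cell area, which this sketch omits.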
About the journal:
Computers & Geosciences publishes high-impact, original research at the interface between Computer Sciences and Geosciences. Publications should apply modern computer science paradigms, whether computational or informatics-based, to address problems in the geosciences.