Joanna Plenzler, Tomasz Budzik, Kornelia Anna Wójcik-Długoborska, Robert Józef Bialik
{"title":"Daily Weather Data From Central and Eastern King George Island (West Antarctica) for 2018–2023","authors":"Joanna Plenzler, Tomasz Budzik, Kornelia Anna Wójcik-Długoborska, Robert Józef Bialik","doi":"10.1002/gdj3.287","DOIUrl":"https://doi.org/10.1002/gdj3.287","url":null,"abstract":"<p>The dataset presented in the paper contains meteorological data from four automatic weather stations (AWS) located in the central and western parts of King George Island (near Arctowski Station and Cape Lions Rump). The dataset includes daily mean, maximum and minimum values of air temperature, relative air humidity, air pressure, wind speed and daily sum of solar radiation. The measurement period ran from 2018.01.01 to 2023.12.31, but it is shorter for two of the stations. Mean values were calculated from measurements taken every 10 min. Direct measurements were used to identify extreme values. The described dataset consists of four files, one for each AWS. It is available in the PANGAEA online repository under a non-restrictive CC BY 4.0 licence for anyone after registration. Despite a strong correlation between the daily mean values of the parameters measured at certain stations, some differences between them were also noticeable. These were due to their location at different altitudes, in a place open to the sea or in a shaded place. Generally, values of wind speed, air humidity, solar radiation and pressure are similar to those at Arctowski during 2013–2017. The only notable distinction is that the mean annual air temperature and the mean air temperature in the winter months were higher than during 1977–1999 and 2013–2017. The data presented can be used as background for other research projects on King George Island, as well as for analysis of the meteorological conditions themselves. They may also be useful for the evaluation of the management plans of the eight Antarctic Specially Protected Areas or Antarctic Specially Managed Area no. 1 that are located on King George Island.</p>","PeriodicalId":54351,"journal":{"name":"Geoscience Data Journal","volume":"12 1","pages":""},"PeriodicalIF":3.3,"publicationDate":"2024-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/gdj3.287","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142861623","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chen Wang, Justin E. Stopa, Doug Vandemark, Ralph Foster, Alex Ayet, Alexis Mouche, Bertrand Chapron, Peter Sadowski
{"title":"A multi-tagged SAR ocean image dataset identifying atmospheric boundary layer structure in winter tradewind conditions","authors":"Chen Wang, Justin E. Stopa, Doug Vandemark, Ralph Foster, Alex Ayet, Alexis Mouche, Bertrand Chapron, Peter Sadowski","doi":"10.1002/gdj3.282","DOIUrl":"https://doi.org/10.1002/gdj3.282","url":null,"abstract":"<p>A dataset of multi-tagged sea surface roughness synthetic aperture radar (SAR) satellite images was established near Barbados from January to June 2016 to 2019. It is an advancement of the Sentinel-1 Wave Mode TenGeoP-SARwv (a labelled SAR imagery dataset of 10 geophysical phenomena from Sentinel-1 wave mode) dataset that targets SAR marine atmospheric boundary layer (MABL) coherent structures. Twelve tags define roll vortices, convective cells, mixed rolls and convective cells, fronts, rain cells, cold pools and low winds. Examples are provided for each signature. The final dataset is comprised of 2100 Sentinel-1 wave mode SAR images acquired at a 36° incidence angle over an 8° × 8° region centered at 51° W, 15° N. Each image is tagged with one or multiple phenomena by five experts. This strategy extends the TenGeoP-SARwv by identifying coexisting phenomena within a single SAR image and by the addition of mixed roll/cell states and cold pools. The dataset includes PNG-formatted SAR image files along with two text files containing the file name, the central latitude/longitude, expert tags for each image, and all dataset metadata. There is a high degree of consensus among expert tags. The dataset complements existing hand-labelled ocean SAR image datasets and offers the potential for new deep-learning SAR image classification model developments. 
Future use is also expected to yield new insights into the tradewind MABL processes such as structure transitions and their relation to the stratification.</p>","PeriodicalId":54351,"journal":{"name":"Geoscience Data Journal","volume":"12 1","pages":"1-14"},"PeriodicalIF":3.3,"publicationDate":"2024-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/gdj3.282","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142860440","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
James Finnis, Helen C. Miles, Ariel Ladegaard, Matt Gunn
{"title":"PCOT: An open-source toolkit for multispectral image processing","authors":"James Finnis, Helen C. Miles, Ariel Ladegaard, Matt Gunn","doi":"10.1002/gdj3.283","DOIUrl":"https://doi.org/10.1002/gdj3.283","url":null,"abstract":"<p>PCOT is a Python program and library which allows users to manipulate multispectral images and associated data. It is in active development in support of the ExoMars mission and intended to be used on data from the Rosalind Franklin rover, but it has much greater potential for use beyond this specific context. PCOT operates on a graph model – the data are processed through a set of nodes which manipulate it in various ways (e.g. add regions of interest, perform maths, splice images together, merge image channels, plot spectra). A PCOT document describes this graph, and we intend that documents are distributed along with the data they generate to help reproducibility. PCOT is open-source, and contributions can be made to the core software, as plugins, or by using PCOT as a library in your own code.</p>","PeriodicalId":54351,"journal":{"name":"Geoscience Data Journal","volume":"12 1","pages":""},"PeriodicalIF":3.3,"publicationDate":"2024-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/gdj3.283","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142764322","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Time-domain spectra of ultrasonic wave transmitted through granite and gypsum samples containing artificial defects","authors":"Zhuoran Tian, Chunjiang Zou, Yun Wu","doi":"10.1002/gdj3.281","DOIUrl":"https://doi.org/10.1002/gdj3.281","url":null,"abstract":"<p>The internal defects in rock masses can significantly impact the quality and safety of geotechnical projects. Mechanical waves, as a common nondestructive testing (NDT) method, can reflect the external and internal structures of rock or rock masses. Analysis of the reflected and transmitted waves enables nondestructive identification and assessment of potential defects within rocks. Previous studies mainly focused on the variation of single or limited wave features like main frequency, amplitude and energy between the intact and non-intact samples. In fact, most information contained in the waveforms is neglected. Data mining techniques can provide a powerful tool to reveal this information and therefore enable a more accurate determination of the internal structures. In this study, 995,412 NDT data from 14 types of granite and gypsum samples with different cross-section shapes and different types of defects were recorded by an ultrasonic wave generation and collection system. This dataset can be used not only as the training data for defect classification in NDT but also as a good reference for conventional NDT analyses. 
Moreover, because time-series data analysis presents both opportunities and challenges, this dataset holds great potential for broader application in general time-series classification analysis.</p>","PeriodicalId":54351,"journal":{"name":"Geoscience Data Journal","volume":"12 1","pages":""},"PeriodicalIF":3.3,"publicationDate":"2024-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/gdj3.281","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143121523","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The performance of a high-resolution satellite-derived precipitation product over the topographically complex landscape of Eswatini","authors":"Wisdom M. D. Dlamini, Samkele S. Tfwala","doi":"10.1002/gdj3.278","DOIUrl":"https://doi.org/10.1002/gdj3.278","url":null,"abstract":"<p>The study evaluated the use of Climate Hazards Group InfraRed Precipitation with Stations (CHIRPS) data for monitoring rainfall in Eswatini. Various statistical metrics such as Bias, correlation coefficient (<i>r</i>), mean absolute error (MAE) and root mean square error (RMSE) were used to evaluate the CHIRPS 2.0 data against 14 rain gauge observations acquired during 1981–2020. CHIRPS 2.0 rainfall agrees well with rain gauge precipitation at monthly (<i>r</i> = 0.73, Bias = 1.02, RMSE = 50.44 and MAE = 31.44), seasonal (<i>r</i> = 0.77, Bias = 1.01, RMSE = 36.99 and MAE = 24.15) and annual scales (<i>r</i> = 0.65, Bias = 2.46, RMSE = 500.78 and MAE = 468.06). Moreover, areas characterized by complex topography and land use, and areas in transition zones (to a different agroecological zone) had generally poor correlations. Nonetheless, CHIRPS 2.0 captures well the spatial distribution of rainfall in the different agroecological zones of Eswatini, even in areas with no rain gauge data. 
In conclusion, CHIRPS 2.0 can be a very valuable tool in filling gaps created by poor spatial coverage of ground-based rain gauges, especially in the developing world where this is often the case.</p>","PeriodicalId":54351,"journal":{"name":"Geoscience Data Journal","volume":"12 1","pages":""},"PeriodicalIF":3.3,"publicationDate":"2024-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/gdj3.278","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143118716","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yifan Sun, Dongsheng Liu, Long Xie, Zheng Gao, Qi Zhang, Luqi Wang, Sen Li
{"title":"A georeferenced dataset of heavy metals occurrence in the soils of the Yangtze River Basin, China","authors":"Yifan Sun, Dongsheng Liu, Long Xie, Zheng Gao, Qi Zhang, Luqi Wang, Sen Li","doi":"10.1002/gdj3.280","DOIUrl":"https://doi.org/10.1002/gdj3.280","url":null,"abstract":"<p>Understanding the fine-scale spatial distribution of heavy metal contamination is crucial for effective environmental capacity control and targeted treatment of polluted areas. This article presents the latest dataset on the occurrence of common heavy metals in the soils of the Yangtze River Basin. The dataset was compiled by reviewing peer-reviewed literature published between 2000 and 2020. Rigorous quality control procedures were employed to ensure the accuracy of the data, including the extraction of detailed geographic locations and concentrations of heavy metals. The dataset includes 7867 records of heavy metal occurrences (Zn: 1045, Cu: 1140, Pb: 1261, Cr: 980, Cd: 1242, Ni: 649, As: 821, Hg: 729) in the soils of the Yangtze River Basin, distributed at four scale levels: province, prefecture, county, and township or finer. The results indicate that the distribution of heavy metal concentrations is relatively scattered, with higher concentrations in cities and regions with developed industry and agriculture. Cd has the highest exceedance rate (33.90%), indicating significant local contamination. Heavy metals, such as Zn at 11.96%, Ni at 12.63%, and As at 9.74%, also exceeded standard levels at certain sampling points. Cr had the lowest exceedance rate of 1.33%. This updated dataset provides essential information on the current status of heavy metals contamination in the soils of the Yangtze River Basin. 
It can be used for further ecological and health risk assessments and for developing strategies to remediate and prevent heavy metal contamination in the region.</p>","PeriodicalId":54351,"journal":{"name":"Geoscience Data Journal","volume":"12 1","pages":""},"PeriodicalIF":3.3,"publicationDate":"2024-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/gdj3.280","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143118669","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hongyi Zhao, Bin Hu, Chao Ma, Shijun Jiang, Yi Zhang, Xin Li, Lirong Chen, Can Cai, Longgang Ye, Shengjian Zhou, Chengshan Wang
{"title":"A practical approach to building a calcareous nannofossil knowledge graph","authors":"Hongyi Zhao, Bin Hu, Chao Ma, Shijun Jiang, Yi Zhang, Xin Li, Lirong Chen, Can Cai, Longgang Ye, Shengjian Zhou, Chengshan Wang","doi":"10.1002/gdj3.279","DOIUrl":"https://doi.org/10.1002/gdj3.279","url":null,"abstract":"<p>Following sustained development, numerous palaeontology databases and datasets of various types have been created. However, the lack of a unified standard language to describe knowledge and unclear sharing mechanisms between different databases and datasets have limited the large-scale integration and application of paleontological data. The knowledge graph, as a key technology for semantic translation and data fusion, offers a possible solution to these challenges. Given the potential of knowledge graphs to overcome these obstacles, this paper presents a practical approach to express paleontological knowledge in a knowledge graph via the resource description framework language. By delving into the structured data associated with calcareous nannofossil biozones (the UC zone, CC zone and NC zone), we propose an ontology to describe the semantic units and logical relationships of paleontological biozones and species and then integrate relevant species records from unstructured research reports to construct a knowledge graph for calcareous nannofossils that integrates multisource paleobiological data and knowledge reconstruction. Our focus lies in detailing the technical aspects of constructing a paleontological knowledge graph. The results demonstrate that knowledge graphs can integrate semistructured and unstructured paleontological data from various sources. 
This work aims to assist palaeontologists in building and utilizing knowledge graphs, serving as an initial effort for future paleontological knowledge reasoning.</p>","PeriodicalId":54351,"journal":{"name":"Geoscience Data Journal","volume":"12 1","pages":""},"PeriodicalIF":3.3,"publicationDate":"2024-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/gdj3.279","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143113856","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring Jalisco's water quality: A comprehensive web tool for limnological and phytoplankton data","authors":"Cristofer Camarena-Orozco, Eduardo Juárez Carrillo, Martha Alicia Lara González, Edlin Guerra-Castro","doi":"10.1002/gdj3.277","DOIUrl":"https://doi.org/10.1002/gdj3.277","url":null,"abstract":"<p>This study presents a comprehensive dataset of hydrological information gathered from five key eastern basins in Jalisco, Mexico. The dataset encompasses approximately 50 limnological variables and phytoplankton counts specifically for one of these basins. Water-quality data were collected by the State Water Commission of Jalisco, adhering to the methods outlined in the Official Mexican Norm ‘NOM-127’. Monthly samplings were conducted to assess environmental variables such as pH, temperature, oxygen, nutrients and heavy metals. Monitoring has been ongoing for three basins since 2009, while the remaining two basins have been monitored since 2015 and 2020. Phytoplankton data were obtained from monthly samples taken by the University of Guadalajara between 2014 and 2019 in Lake Cajititlán. The original data were cleaned and organized using tidy data principles, with codes accessible on GitHub. To facilitate data exploration and visualization, we developed a user-friendly web application with the Shiny package in R. This application enables users to explore the dataset through summary statistics tables, time series plots and phytoplankton community analysis. The dataset is accessible on Zenodo. The presented data hold significance for environmental and water-quality assessment and applications in machine learning, neural network models, community ecology and broader environmental research. Notably, the raw data, publicly accessible from the State Water Commission of Jalisco, have been previously utilized for these purposes. 
This dataset offers value due to its diverse limnological and phytoplankton variables, an extended time frame of availability, a curated and streamlined accessibility process and the inclusion of a web application for intuitive exploration and visualization.</p>","PeriodicalId":54351,"journal":{"name":"Geoscience Data Journal","volume":"11 4","pages":"495-503"},"PeriodicalIF":3.3,"publicationDate":"2024-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/gdj3.277","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142435110","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HSPEI: A 1-km spatial resolution SPEI dataset across the Chinese mainland from 2001 to 2022","authors":"Haoming Xia, Yintao Sha, Xiaoyang Zhao, Wenzhe Jiao, Hongquan Song, Jia Yang, Wei Zhao, Yaochen Qin","doi":"10.1002/gdj3.276","DOIUrl":"https://doi.org/10.1002/gdj3.276","url":null,"abstract":"<p>The Standardized Precipitation Evapotranspiration Index (SPEI) is a widely recognized and effective tool for monitoring meteorological droughts. However, existing SPEI datasets suffer from spatial discontinuity or coarse spatial resolution, which limits their application to local-level drought monitoring research. Therefore, we calculated the SPEI at meteorological stations and, by combining it with the Global Precipitation Measurement (GPM) Precipitation (Pre), Moderate Resolution Imaging Spectroradiometer (MODIS) Land Surface Temperature (LST), ERA5-Land Shortwave Radiation (SR) and Shuttle Radar Topography Mission (SRTM) Digital Elevation Model (DEM) datasets in a Random Forest Regression (RFR) model, developed high spatial resolution (1 km) SPEI (HSPEI) datasets with multiple time scales for mainland China from 2001 to 2022. Compared to other SPEI datasets, the HSPEI datasets have higher spatial resolution and can effectively identify the detailed characteristics of drought in mainland China from 2001 to 2022. 
Overall, the HSPEI datasets can be effectively applied to research on different droughts in China from 2001 to 2022.</p>","PeriodicalId":54351,"journal":{"name":"Geoscience Data Journal","volume":"11 4","pages":"479-494"},"PeriodicalIF":3.3,"publicationDate":"2024-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/gdj3.276","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142435086","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automation of historical weather data rescue","authors":"Y. Zhang, R. E. Sieber","doi":"10.1002/gdj3.261","DOIUrl":"https://doi.org/10.1002/gdj3.261","url":null,"abstract":"<p>Data rescuers worldwide have been trying to retrieve millions of valuable historical weather records so the observations contained in those records are preserved, searchable, analysable and machine readable. The majority of the records are written by hand, in print or cursive handwriting. Automatic transcriptions to date have not been reliable or sufficiently accurate on handwritten data, so most of the historical records are transcribed manually. Recent attempts have integrated artificial intelligence (AI) to automatically transcribe the historical records, but the results have not been promising. Currently there is no end-to-end workflow to automatically transcribe historical handwritten tabular records into digital datasets. We propose a workflow that uses AI to automate the handwriting transcription process. The workflow is tested using the historical climate records from the Data Rescue: Archives and Weather (DRAW) project. This workflow is composed of five steps: (1) image pre-processing, (2) text line segmentation, (3) bounding box detection, (4) AI-enabled optical character recognition (OCR) and (5) layout re-arrangement. These steps are modular to better accommodate future advances (e.g., new image training data, better layout detectors). 
We hope the workflow proposed can serve as a guideline that is easily replicable and can be utilized to transcribe other historical datasets.</p>","PeriodicalId":54351,"journal":{"name":"Geoscience Data Journal","volume":"12 1","pages":""},"PeriodicalIF":3.3,"publicationDate":"2024-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/gdj3.261","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143119799","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}