Scientific Data | Pub Date: 2024-11-16 | DOI: 10.1038/s41597-024-04059-5
Martha Dellar, Gertjan Geerling, Kasper Kok, Peter M van Bodegom, Gerard van der Schrier, Maarten Schrama, Eline Boelee
{"title":"Future land use maps for the Netherlands based on the Dutch One Health Shared Socio-economic Pathways.","authors":"Martha Dellar, Gertjan Geerling, Kasper Kok, Peter M van Bodegom, Gerard van der Schrier, Maarten Schrama, Eline Boelee","doi":"10.1038/s41597-024-04059-5","DOIUrl":"10.1038/s41597-024-04059-5","url":null,"abstract":"<p><p>To enable detailed study of a wide variety of future health challenges, we have created future land use maps for the Netherlands for 2050, based on the Dutch One Health Shared Socio-economic Pathways (SSPs). This was done using the DynaCLUE modelling framework. Future land use is based on altitude, soil properties, groundwater, salinity, flood risk, agricultural land price, distance to transport hubs and climate. We also account for anticipated demand for different land use types, historic land use changes and potential spatial restrictions. These land use maps can be used to model many different health risks to people, animals and the environment, such as disease, water quality and pollution. In addition, the Netherlands can serve as an example for other rapidly urbanising deltas where many of the health risks will be similar.</p>","PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":"11 1","pages":"1237"},"PeriodicalIF":5.8,"publicationDate":"2024-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11569152/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142644768","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Multidisciplinary journals","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scientific Data | Pub Date: 2024-11-16 | DOI: 10.1038/s41597-024-04079-1
Simonas Kecorius, Leizel Madueño, Mario Lovric, Nikolina Racic, Maximilian Schwarz, Josef Cyrys, Juan Andrés Casquero-Vera, Lucas Alados-Arboledas, Sébastien Conil, Jean Sciare, Jakub Ondracek, Anna Gannet Hallar, Francisco J Gómez-Moreno, Raymond Ellul, Adam Kristensson, Mar Sorribas, Nikolaos Kalivitis, Nikolaos Mihalopoulos, Annette Peters, Maria Gini, Konstantinos Eleftheriadis, Stergios Vratolis, Kim Jeongeun, Wolfram Birmili, Benjamin Bergmans, Nina Nikolova, Adelaide Dinoi, Daniele Contini, Angela Marinoni, Andres Alastuey, Tuukka Petäjä, Sergio Rodriguez, David Picard, Benjamin Brem, Max Priestman, David C Green, David C S Beddows, Roy M Harrison, Colin O'Dowd, Darius Ceburnis, Antti Hyvärinen, Bas Henzing, Suzanne Crumeyrolle, Jean-Philippe Putaud, Paolo Laj, Kay Weinhold, Kristina Plauškaitė, Steigvilė Byčenkienė
{"title":"Atmospheric new particle formation identifier using longitudinal global particle number size distribution data.","authors":"Simonas Kecorius, Leizel Madueño, Mario Lovric, Nikolina Racic, Maximilian Schwarz, Josef Cyrys, Juan Andrés Casquero-Vera, Lucas Alados-Arboledas, Sébastien Conil, Jean Sciare, Jakub Ondracek, Anna Gannet Hallar, Francisco J Gómez-Moreno, Raymond Ellul, Adam Kristensson, Mar Sorribas, Nikolaos Kalivitis, Nikolaos Mihalopoulos, Annette Peters, Maria Gini, Konstantinos Eleftheriadis, Stergios Vratolis, Kim Jeongeun, Wolfram Birmili, Benjamin Bergmans, Nina Nikolova, Adelaide Dinoi, Daniele Contini, Angela Marinoni, Andres Alastuey, Tuukka Petäjä, Sergio Rodriguez, David Picard, Benjamin Brem, Max Priestman, David C Green, David C S Beddows, Roy M Harrison, Colin O'Dowd, Darius Ceburnis, Antti Hyvärinen, Bas Henzing, Suzanne Crumeyrolle, Jean-Philippe Putaud, Paolo Laj, Kay Weinhold, Kristina Plauškaitė, Steigvilė Byčenkienė","doi":"10.1038/s41597-024-04079-1","DOIUrl":"10.1038/s41597-024-04079-1","url":null,"abstract":"<p><p>Atmospheric new particle formation (NPF) is a naturally occurring phenomenon, during which high concentrations of sub-10 nm particles are created through gas-to-particle conversion. NPF is observed in multiple environments around the world. Although it has an observable influence on annual total and ultrafine particle number concentrations (PNC and UFP, respectively), only limited epidemiological studies have investigated whether these particles are associated with adverse health effects. One plausible reason for this limitation may be the absence of NPF identifiers in UFP and PNC data sets. Until recently, regional NPF events were usually identified manually from particle number size distribution contour plots. Identification of NPF across multi-annual and multi-station data sets remained a tedious task. In this work, we introduce a regional NPF identifier, created using an automated, machine-learning-based algorithm. The regional NPF event tag was created for 65 measurement sites globally, covering the period from 1996 to 2023. The discussed data set can be used in future studies related to regional NPF.</p>","PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":"11 1","pages":"1239"},"PeriodicalIF":5.8,"publicationDate":"2024-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11569151/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142644765","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Multidisciplinary journals","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scientific Data | Pub Date: 2024-11-16 | DOI: 10.1038/s41597-024-04119-w
Seung Jae Lee, Minjoo Cho, Jinmu Kim, Eunkyung Choi, Soyun Choi, Sangdeok Chung, Jaebong Lee, Jeong-Hoon Kim, Hyun Park
{"title":"Chromosome-level genome assembly and annotation of the Patagonian toothfish Dissostichus eleginoides.","authors":"Seung Jae Lee, Minjoo Cho, Jinmu Kim, Eunkyung Choi, Soyun Choi, Sangdeok Chung, Jaebong Lee, Jeong-Hoon Kim, Hyun Park","doi":"10.1038/s41597-024-04119-w","DOIUrl":"10.1038/s41597-024-04119-w","url":null,"abstract":"<p><p>The Patagonian toothfish (Dissostichus eleginoides) belongs to the class Actinopterygii and the suborder Notothenioidei, which lives in cold waters in the Southern Hemisphere. We performed genome assembly and annotation, integrating Illumina short-read sequencing for polishing, PacBio long-read sequencing for contig-level assembly, and Hi-C sequencing technology to obtain a high-quality chromosome-level genome assembly. The final assembly comprised 495 scaffolds, with a genome size of 844.7 Mbp and an N50 length of 36 Mbp. Among these, we confirmed that 24 scaffolds exceeded 10 Mbp and were classified as chromosome-level. BUSCO completeness was over 97%. A total of 32,224 genes were identified. Furthermore, we analyzed the presence of AFGP genes, classified into Antarctic and sub-Antarctic categories through phylogenetic analysis. This study provides a useful resource for the genomic analysis of Patagonian toothfish and genetic insights into the comparison with Antarctic fishes.</p>","PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":"11 1","pages":"1240"},"PeriodicalIF":5.8,"publicationDate":"2024-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11569150/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142644766","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Multidisciplinary journals","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scientific Data | Pub Date: 2024-11-15 | DOI: 10.1038/s41597-024-04105-2
Yangyang Liang, Huijuan Liu, Wenxuan Lu, Jing Li, Ting Fang, Na Gao, Cheng Chen, Xiuxia Zhao, Kun Yang, Haiyang Liu
{"title":"Chromosome-level genome assembly of the smallscale yellowfin (Plagiognathops microlepis).","authors":"Yangyang Liang, Huijuan Liu, Wenxuan Lu, Jing Li, Ting Fang, Na Gao, Cheng Chen, Xiuxia Zhao, Kun Yang, Haiyang Liu","doi":"10.1038/s41597-024-04105-2","DOIUrl":"10.1038/s41597-024-04105-2","url":null,"abstract":"<p><p>The small-scale yellowfin (Plagiognathops microlepis) is a highly valued species in East Asian aquaculture due to its adaptability and high yield. However, the lack of genomic data has impeded genetic research and breeding efforts. In this study, we utilize PacBio HiFi long-read sequencing and Hi-C technologies to construct a highly detailed genome of P. microlepis at the chromosomal level. The assembly encompasses 976.41 Mb, with 99.84% of the sequence distributed across 24 chromosomes. Notably, the contig N50 was 34.41 Mb and the scaffold N50 was 38.38 Mb. The completeness of the P. microlepis genome assembly is underscored by a BUSCO score of 98.08%. A total of 25,389 protein-coding genes were identified, with a BUSCO score of 96.98%, and 99.85% of these genes were functionally annotated. Synteny relationships at the chromosome level with Danio rerio and Chanodichthys erythropterus genomes uncover small-scale chromosomal rearrangements. This high-fidelity genome assembly serves as a pivotal resource for future studies of the genome structure, functional elements, comparative genomics, and evolutionary characteristics of P. microlepis and its related species.</p>","PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":"11 1","pages":"1234"},"PeriodicalIF":5.8,"publicationDate":"2024-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11568295/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142639702","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Multidisciplinary journals","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scientific Data | Pub Date: 2024-11-15 | DOI: 10.1038/s41597-024-04041-1
Ivandro Sanches, Victor V Gomes, Carlos Caetano, Lizeth S B Cabrera, Vinicius H Cene, Thomas Beltrame, Wonkyu Lee, Sanghyun Baek, Otávio A B Penatti
{"title":"MIMIC-BP: A curated dataset for blood pressure estimation.","authors":"Ivandro Sanches, Victor V Gomes, Carlos Caetano, Lizeth S B Cabrera, Vinicius H Cene, Thomas Beltrame, Wonkyu Lee, Sanghyun Baek, Otávio A B Penatti","doi":"10.1038/s41597-024-04041-1","DOIUrl":"10.1038/s41597-024-04041-1","url":null,"abstract":"<p><p>Blood pressure (BP) is one of the most prominent indicators of potential cardiovascular disorders. Traditionally, BP measurement relies on inflatable cuffs, which are inconvenient and limit the acquisition of such important health-related information in the general population. Based on large amounts of well-collected and annotated data, deep-learning approaches, with their generalization potential, have emerged as an alternative that enables more pervasive measurement. However, most existing work in this area currently uses datasets with limitations, such as lack of subject identification and severe data imbalance, that can result in data leakage and algorithm bias. Thus, to offer a more properly curated source of information, we propose a derivative dataset composed of 380 hours of the most common biomedical signals, including arterial blood pressure, photoplethysmography, and electrocardiogram for 1,524 anonymized subjects, each having 30 segments of 30 seconds of those signals. We also validated the proposed dataset through experiments using state-of-the-art deep-learning methods, and we highlight the importance of standardized benchmarks for calibration-free blood pressure estimation scenarios.</p>","PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":"11 1","pages":"1233"},"PeriodicalIF":5.8,"publicationDate":"2024-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11568151/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142639703","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Multidisciplinary journals","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scientific Data | Pub Date: 2024-11-14 | DOI: 10.1038/s41597-024-04011-7
Reed Ferber, Allan Brett, Reginaldo K Fukuchi, Blayne Hettinga, Sean T Osis
{"title":"A Biomechanical Dataset of 1,798 Healthy and Injured Subjects During Treadmill Walking and Running.","authors":"Reed Ferber, Allan Brett, Reginaldo K Fukuchi, Blayne Hettinga, Sean T Osis","doi":"10.1038/s41597-024-04011-7","DOIUrl":"10.1038/s41597-024-04011-7","url":null,"abstract":"<p><p>Quantitative biomechanical gait analysis is an important clinical and research tool for injury and disease diagnosis and treatment. However, one major criticism is that gait analysis laboratories largely operate in isolation and there is a lack of benchmark datasets that can be used to advance research and statistical methodologies. To address this, we present an open biomechanics dataset of n = 1798 healthy and injured, young and older adults during treadmill walking and/or running at a range of gait speeds. The full dataset is available on Figshare+, and data files are contained within a series of zipped folders with folder names representing the subject ID. Each subject ID folder contains walking and/or running data, comprising raw marker trajectory data along with metadata for each participant. Five tutorials are also provided, demonstrating aspects such as loading data files, sample analyses of discrete variables, and calculating joint angles in code, along with more complex topics such as principal component analysis for dimensionality reduction, statistical parametric mapping, and unsupervised clustering.</p>","PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":"11 1","pages":"1232"},"PeriodicalIF":5.8,"publicationDate":"2024-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11564798/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142627209","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Multidisciplinary journals","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scientific Data | Pub Date: 2024-11-14 | DOI: 10.1038/s41597-024-04063-9
Kun Cai, Liuyin Guan, Shenshen Li, Shuo Zhang, Yang Liu, Yang Liu
{"title":"Full-coverage estimation of CO<sub>2</sub> concentrations in China via multisource satellite data and Deep Forest model.","authors":"Kun Cai, Liuyin Guan, Shenshen Li, Shuo Zhang, Yang Liu, Yang Liu","doi":"10.1038/s41597-024-04063-9","DOIUrl":"10.1038/s41597-024-04063-9","url":null,"abstract":"<p><p>Monitoring China's carbon dioxide (CO<sub>2</sub>) concentration is essential for formulating effective carbon cycle policies to achieve carbon peaking and neutrality. Despite insufficient satellite observation coverage, this study utilizes high-resolution spatiotemporal data from the Orbiting Carbon Observatory 2 (OCO-2), supplemented with various auxiliary datasets, to estimate full-coverage, monthly, column-averaged carbon dioxide (XCO<sub>2</sub>) values across China from 2015 to 2022 at a spatial resolution of 0.05° via the deep forest model. The 10-fold cross-validation results indicate a correlation coefficient (R) of 0.95 and a determination coefficient (R²) of 0.90. Validation against ground-based station data yielded an R value of 0.93 and an R² value of 0.81. Further validation against the Greenhouse Gases Observing Satellite (GOSAT) and the Copernicus Atmosphere Monitoring Service Reanalysis dataset (CAMS) produced R² values of 0.87 and 0.80, respectively. During the study period, CO<sub>2</sub> concentrations in China were higher in spring and winter than in summer and autumn, and showed a clear annual increase. The estimates generated by this study could potentially support CO<sub>2</sub> monitoring in China.</p>","PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":"11 1","pages":"1231"},"PeriodicalIF":5.8,"publicationDate":"2024-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11564725/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142627262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Multidisciplinary journals","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 4 km daily gridded meteorological dataset for China from 2000 to 2020.","authors":"Jielin Zhang, Bo Liu, Siqing Ren, Wenqi Han, Yongxia Ding, Shouzhang Peng","doi":"10.1038/s41597-024-04029-x","DOIUrl":"10.1038/s41597-024-04029-x","url":null,"abstract":"<p><p>Multi-variate gridded meteorological data with high spatial resolution play a key role in studies related to climate change. This study constructed a 4 km daily gridded meteorological dataset for mainland China (China Daily Meteorological Dataset; CDMet) from 2000 to 2020. The dataset includes nine meteorological variables: 2-meter air temperature (maximum, minimum, and mean temperatures), total precipitation, skin temperature, 10-meter wind speed, relative humidity, surface pressure, and sunshine duration. CDMet was generated using an adaptive interpolation scheme, which employed thin-plate spline and random forest methods to construct the interpolation model. Six combinations of location and terrain information were designed and used as covariates in the model together with reanalysis data. Validation with independent observation stations and existing datasets showed that CDMet has acceptable accuracy, reasonable seasonal variability, and precise spatial distribution, and its accuracy is comparable to that of other datasets. Due to its comprehensive variables and high resolution, CDMet can be used as input data for hydrological, agricultural, and ecological models.</p>","PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":"11 1","pages":"1230"},"PeriodicalIF":5.8,"publicationDate":"2024-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11564775/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142627203","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Multidisciplinary journals","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scientific Data | Pub Date: 2024-11-14 | DOI: 10.1038/s41597-024-04078-2
Xiao-Yan Meng, Lan Bu, Ling Shen, Kun-Ming Tao
{"title":"A transcriptome data set for comparing skin, muscle and dorsal root ganglion between acute and chronic postsurgical pain rats.","authors":"Xiao-Yan Meng, Lan Bu, Ling Shen, Kun-Ming Tao","doi":"10.1038/s41597-024-04078-2","DOIUrl":"10.1038/s41597-024-04078-2","url":null,"abstract":"<p><p>Chronic postsurgical pain (CPSP), with its high prevalence and the rising opioid crisis, typically develops from acute postoperative pain. Our knowledge of how chronic pain forms derives mostly from mechanistic studies of pain processing in brain and spinal cord circuits, yet most pharmacological interventions targeting the CNS have proved unhelpful in preventing CPSP. Revealing the peripheral mechanisms behind the transition from acute to chronic pain after surgery could shed light on novel analgesic regimens. Based on two recognized animal models simulating acute and chronic postsurgical pain, we provide a next-generation RNA sequencing (RNA-seq) data set to evaluate the time-course transcriptomic variation in skin, muscle and dorsal root ganglion (DRG) tissue in these two pain models. The aim of this study is to identify the potential origin and mechanism of persistent postoperative pain, and further to explore effective and safer analgesic regimens for surgical patients.</p>","PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":"11 1","pages":"1229"},"PeriodicalIF":5.8,"publicationDate":"2024-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11564764/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142627241","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Multidisciplinary journals","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}