Scientific Data | Pub Date: 2025-10-08 | DOI: 10.1038/s41597-025-05916-7
Jing Fang, Mei Wu, Xinyi Zhang, Yang Chen, Lingjuan Tou, Jinjing Xu, Xiangjun Kong, Yingxiong Qiu
{"title":"Chromosomal-level genome assembly of the autotetraploid Anoectochilus roxburghii (Jinxianlian, Orchidaceae).","authors":"Jing Fang, Mei Wu, Xinyi Zhang, Yang Chen, Lingjuan Tou, Jinjing Xu, Xiangjun Kong, Yingxiong Qiu","doi":"10.1038/s41597-025-05916-7","DOIUrl":"10.1038/s41597-025-05916-7","url":null,"abstract":"Anoectochilus roxburghii (Orchidaceae), commonly known as Jinxianlian, is a highly valued traditional Chinese herbal medicine. Here, we present a high-quality chromosome-level genome assembly of the cultivar 'Jinkang No.1'. Genome survey analyses suggest that the cultivar is an autotetraploid. The assembled genome spans 5.17 Gb, with a contig N50 of 24.90 Mb, and 93.4% (4.82 Gb) is anchored onto 80 pseudo-chromosomes across 20 homologous groups. Annotation of the genome assembly identifies 76.42% repetitive elements and 88,106 protein-coding genes, with 94.6% of these genes functionally annotated. Ks analysis of collinear gene pairs indicates a recent species-specific polyploidization event, likely resulting in autotetraploidization. This reference genome provides a valuable resource for functional genomics, evolutionary biology, and molecular breeding studies of A. roxburghii and its related species.","PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":"12 1","pages":"1623"},"PeriodicalIF":6.9,"publicationDate":"2025-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12508217/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145252408","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Multidisciplinary journals","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scientific Data | Pub Date: 2025-10-08 | DOI: 10.1038/s41597-025-06052-y
Boris Thome, Friederike Hertweck, Serife Yasar, Lukas Jonas, Stefan Conrad
{"title":"A dataset of study program availability in German higher education between 1971 and 1996.","authors":"Boris Thome, Friederike Hertweck, Serife Yasar, Lukas Jonas, Stefan Conrad","doi":"10.1038/s41597-025-06052-y","DOIUrl":"10.1038/s41597-025-06052-y","url":null,"abstract":"Educational systems are dynamic. They shape human capital, technological and societal progress, and also economic growth. Higher education, in particular, fosters innovation, with varying fields of study contributing differently to this process. Yet, despite its importance, no dataset has previously documented the evolution of academic fields across higher education institutions in a specific country. Addressing this gap, we present the RWI-UNI-SUBJECTS<sup>1</sup> dataset, the first extensive collection of study opportunities across German higher education institutions between 1971 and 1996. The dataset originates from annual study guides by the German Federal Employment Agency for high school students. To extract the data, a custom-developed computer vision algorithm was used. We further enriched the dataset with administrative codes for fields, institutions, and districts, enabling seamless integration with additional datasets, such as social security data, official student statistics, or the National Educational Panel Study (NEPS). Covering a total of 105,307 study programs between 1971 and 1996, RWI-UNI-SUBJECTS<sup>1</sup> offers a valuable foundation for interdisciplinary research on education, innovation, and economic development.","PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":"12 1","pages":"1626"},"PeriodicalIF":6.9,"publicationDate":"2025-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12508074/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145252445","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Multidisciplinary journals","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scientific Data | Pub Date: 2025-10-08 | DOI: 10.1038/s41597-025-05912-x
Bjarne Steffen, Florian Egli, Anurag Gumber, Mak Ðukan, Paul Waidelich
{"title":"A global dataset of the cost of capital for renewable energy projects.","authors":"Bjarne Steffen, Florian Egli, Anurag Gumber, Mak Ðukan, Paul Waidelich","doi":"10.1038/s41597-025-05912-x","DOIUrl":"10.1038/s41597-025-05912-x","url":null,"abstract":"The cost of capital (CoC) critically influences the levelized cost of renewable energy and, by extension, the global low-carbon transition. However, reliable and consistent CoC data remain scarce, limiting an appropriate reflection of CoC differences in energy system and integrated assessment models. We present a global dataset of CoC for renewable energy projects, covering 68 countries from 2010 to 2022 and focusing on three key technologies: utility-scale solar photovoltaics, onshore wind, and offshore wind. We systematically compile and standardize data from academic literature and international organizations, ensuring methodological comparability. Our dataset includes 1,429 data points, of which 366 provide nominal, after-tax weighted average cost of capital values. We conduct technical validation through cross-technology comparisons, temporal consistency checks, and source triangulation. By addressing a key data gap, this dataset aims to support evidence-based energy policy analysis and advance the understanding of how financing conditions impact renewable energy costs globally.","PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":"12 1","pages":"1624"},"PeriodicalIF":6.9,"publicationDate":"2025-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12508428/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145252423","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Multidisciplinary journals","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scientific Data | Pub Date: 2025-10-08 | DOI: 10.1038/s41597-025-05993-8
Julius Wörner, Jonas Eimler, Miriam Pein-Hackelbusch
{"title":"Long-term drift behavior in metal oxide gas sensor arrays: a one-year dataset from an electronic nose.","authors":"Julius Wörner, Jonas Eimler, Miriam Pein-Hackelbusch","doi":"10.1038/s41597-025-05993-8","DOIUrl":"10.1038/s41597-025-05993-8","url":null,"abstract":"Although electronic nose technology has been studied for years, drift effects remain one of the major challenges. While ongoing research focuses on effective correction methods, the evaluation of these methods requires reliable and well-documented datasets. However, only a few drift datasets are available, some of which lack sufficient experimental detail or are outdated. This motivated us to introduce a new long-term drift dataset. It has been collected over 12 months using a commercial electronic nose, which is based on 62 metal oxide sensors. The measurements were conducted under controlled experimental conditions with three analytes (diacetyl, 2-phenylethanol, and ethanol) in different concentrations. The dataset consists of 700 time-series recordings, for which we provide both the raw data and a set of pre-extracted features. The data can support the development, evaluation, and comparison of methods for feature extraction and selection, as well as drift detection and compensation. By providing a comprehensive, well-documented dataset, we aim to advance research on sensor drift in electronic nose systems.","PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":"12 1","pages":"1628"},"PeriodicalIF":6.9,"publicationDate":"2025-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12508210/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145252527","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Multidisciplinary journals","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scientific Data | Pub Date: 2025-10-08 | DOI: 10.1038/s41597-025-05911-y
Chi Hsiang Huang, Shuai Zhang, Deep Shah, Anshul Yadav, Yao Li, Gang Zhao, Huilin Gao
{"title":"3D-LAKES: Three-Dimensional Global Lake and Reservoir Bathymetry from ICESat-2 Altimetry and Landsat Imagery.","authors":"Chi Hsiang Huang, Shuai Zhang, Deep Shah, Anshul Yadav, Yao Li, Gang Zhao, Huilin Gao","doi":"10.1038/s41597-025-05911-y","DOIUrl":"10.1038/s41597-025-05911-y","url":null,"abstract":"Quantification of the water storage dynamics in global lakes and reservoirs is pivotal for understanding the roles of surface water in regional climatology, mitigating natural disasters, and preserving ecosystems. However, the ability to accurately comprehend these storage dynamics is significantly hindered by the lack of reliable and cost-effective global bathymetry information. This study introduces the 3D-LAKES dataset, which contains the area-elevation (A-E) relationship and three-dimensional (3D) bathymetry information for 510,530 global lakes and reservoirs, representing 98.9% of global surface water storage capacity. This dataset was validated using 214 A-E relationships and 12 bathymetry maps collected from in-situ measurements, showing strong agreement. The A-E relationships yield an RMSE of 0.60 m, an NRMSE of 0.14, and an R<sup>2</sup> of 0.61, while the 3D bathymetry maps have an RMSE of 1.37 m and an NRMSE of 0.26. This dataset has the potential to support many applications, from monitoring lake/reservoir storage variations to parameterizing hydraulic/hydrological models. Such integration provides essential information for global hydrological studies, water management programs, and disaster mitigation.","PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":"12 1","pages":"1625"},"PeriodicalIF":6.9,"publicationDate":"2025-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12508203/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145252418","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Multidisciplinary journals","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scientific Data | Pub Date: 2025-10-07 | DOI: 10.1038/s41597-025-05919-4
Jingdong Cao, Rentao Liao, Guorun Fu, Yunbin Li, Jiaxin Zhang, Yunxia Chai, Gang Chen, Zichao Mao
{"title":"Chromosome-level genome assembly of predatory Eocanthecona furcellata.","authors":"Jingdong Cao, Rentao Liao, Guorun Fu, Yunbin Li, Jiaxin Zhang, Yunxia Chai, Gang Chen, Zichao Mao","doi":"10.1038/s41597-025-05919-4","DOIUrl":"10.1038/s41597-025-05919-4","url":null,"abstract":"Eocanthecona furcellata (Wolff, 1811) (Hemiptera: Pentatomidae: Asopinae), a predatory stinkbug, preys on diverse notorious insect pests, positioning it as a critical biocontrol agent in agricultural and forestry ecosystems with substantial regional application and promotion potential. We present a chromosome-level genome assembly of E. furcellata using PacBio HiFi long-read sequencing and Hi-C scaffolding technologies. The final genome spans 1.03 Gb, with a contig N50 of 2.35 Mb and a scaffold N50 of 135.69 Mb, achieving 97.0% BUSCO completeness with the Hemiptera_odb10 lineage dataset. Hi-C data resolved the assembly into 7 pseudochromosomes, representing the full karyotype. Repeat elements constitute 52.07% of the genome (approximately 520.07 Mb). We annotated 15,802 protein-coding genes, of which 90.72% were informatively assigned. This high-quality genome provides a foundational resource for advancing research on predatory bug biology, including genetic adaptations for digestion, molecular mechanisms of prey detection, and biocontrol optimization through functional genomics.","PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":"12 1","pages":"1621"},"PeriodicalIF":6.9,"publicationDate":"2025-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12504673/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145245135","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Multidisciplinary journals","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scientific Data | Pub Date: 2025-10-07 | DOI: 10.1038/s41597-025-05895-9
Gulrez Chahal, Michael P Eichenlaub, Markus Tondl, Michał Pawlak, Monika Mohenska, Lin Grimm, Lauren Bottrell, Mark Drvodelic, Sara Alaei, Jeannette Hallab, Lisa N Waylen, Jose M Polo, Cédric Blanpain, Nathan Palpant, Fernando J Rossello, Minna-Liisa Änkö, Peter D Currie, Benjamin M Hogan, Cecilia Winata, Ekaterina Salimova, Hieu T Nim, Mirana Ramialison
{"title":"Epigenomics and transcriptomics profiles of developing zebrafish heart cells.","authors":"Gulrez Chahal, Michael P Eichenlaub, Markus Tondl, Michał Pawlak, Monika Mohenska, Lin Grimm, Lauren Bottrell, Mark Drvodelic, Sara Alaei, Jeannette Hallab, Lisa N Waylen, Jose M Polo, Cédric Blanpain, Nathan Palpant, Fernando J Rossello, Minna-Liisa Änkö, Peter D Currie, Benjamin M Hogan, Cecilia Winata, Ekaterina Salimova, Hieu T Nim, Mirana Ramialison","doi":"10.1038/s41597-025-05895-9","DOIUrl":"10.1038/s41597-025-05895-9","url":null,"abstract":"cis-Regulatory elements (cREs) are essential for the spatio-temporal control of gene expression during development and disease. However, cRE activity is highly dependent on cell and tissue type. The developing heart is composed of several cell types, predominantly cardiomyocytes. Therefore, cardiomyocyte-specific modelling is required to understand the cis-regulation of the developing heart. Zebrafish are an ideal model to study heart development, as they share several physiological features with the human heart during cardiogenesis. Here, we present a comprehensive cardiomyocyte-specific repertoire of cREs isolated from zebrafish larvae. This dataset combines in vivo transcriptomics and epigenetic profiling, providing insights into cREs and their associated genes involved in heart development. We further perform transgenic reporter assays for the identified cREs associated with popdc2 and bmp10 genes, validating these genomic regions as cardiac regulatory elements. We share this comprehensive, reproducible cardiomyocyte-specific cREs resource as an interrogable web tool for understanding the epigenetic and transcriptomic mechanisms underlying heart development and emergence of congenital heart defects.","PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":"12 1","pages":"1620"},"PeriodicalIF":6.9,"publicationDate":"2025-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12504427/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145245051","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Multidisciplinary journals","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scientific Data | Pub Date: 2025-10-07 | DOI: 10.1038/s41597-025-05917-6
Sparkle L Malone, Grace McLeod, Angel Chen, Mayavati Tupaj
{"title":"Contemporary Fire Regimes of the Subtropical Everglades.","authors":"Sparkle L Malone, Grace McLeod, Angel Chen, Mayavati Tupaj","doi":"10.1038/s41597-025-05917-6","DOIUrl":"10.1038/s41597-025-05917-6","url":null,"abstract":"Fire is a fundamental force that shapes ecosystems by influencing vegetation composition, succession, and structural diversity. Fire regimes, defined by fire frequency, intensity, and seasonality, vary across ecosystems and are critical in fire-dependent landscapes. In the Florida Everglades, fire is a key driver of ecological dynamics, interacting with hydrology and the structure of vegetation. This study defines contemporary fire regimes by describing fire patterns from 1978 to 2023, utilizing fire perimeter data from Everglades National Park and Big Cypress National Preserve. Our findings reveal a highly variable annual burned area with a strong increasing trend. Prescribed fires were the foundation of trends in fire activity, as wildfires remained stable over the study period. Across the Everglades, fire return intervals differed between ecosystems, with upland ecosystems experiencing more frequent fires than wetland ecosystems. Our findings highlight the role of fire management in shaping modern fire regimes and underscore the importance of prescribed burns in maintaining ecosystem function and resilience in the Everglades.","PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":"12 1","pages":"1622"},"PeriodicalIF":6.9,"publicationDate":"2025-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12504416/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145245069","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Multidisciplinary journals","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scientific Data | Pub Date: 2025-10-06 | DOI: 10.1038/s41597-025-05904-x
Rafael Barbizan Sühs, Sílvia R Ziller, Clarissa Alves da Rosa, Patricia B Puechagut, Beloni T P Marterer, Eduardo L H Giehl, Matheus Silva Asth, Carlos H Targino, José Renato Legracie-Jr, Tatiani E Chapla, Rafael Dudeque Zenni
{"title":"Georeferenced database of invasive non-native species occurrences in Brazil.","authors":"Rafael Barbizan Sühs, Sílvia R Ziller, Clarissa Alves da Rosa, Patricia B Puechagut, Beloni T P Marterer, Eduardo L H Giehl, Matheus Silva Asth, Carlos H Targino, José Renato Legracie-Jr, Tatiani E Chapla, Rafael Dudeque Zenni","doi":"10.1038/s41597-025-05904-x","DOIUrl":"10.1038/s41597-025-05904-x","url":null,"abstract":"This dataset presents a comprehensive and validated compilation of 187,160 georeferenced records of 489 invasive species of fauna (Animalia), flora (Plantae), and algae (Chromista) across Brazilian terrestrial, freshwater and marine territories, including islands. The data were obtained through consultations with federal environmental agencies, national and international databases, and scientific publications. All records were reviewed and validated by experts through national and state-level consultations conducted between 2021 and 2024. This effort was carried out within the framework of the project Pró-Espécies: Estratégia Nacional para a Conservação de Espécies Ameaçadas, which aimed to support the conservation of biodiversity and the management of invasive non-native species.","PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":"12 1","pages":"1619"},"PeriodicalIF":6.9,"publicationDate":"2025-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12500944/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145239386","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Multidisciplinary journals","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scientific Data | Pub Date: 2025-10-06 | DOI: 10.1038/s41597-025-05517-4
Mainak Chakraborty, Chandan, Sahil Anchal, Bodhibrata Mukhopadhyay, Subrat Kar
{"title":"A Structural Vibration-based Dataset for Human Gait Recognition.","authors":"Mainak Chakraborty, Chandan, Sahil Anchal, Bodhibrata Mukhopadhyay, Subrat Kar","doi":"10.1038/s41597-025-05517-4","DOIUrl":"10.1038/s41597-025-05517-4","url":null,"abstract":"We present a dataset designed to advance non-intrusive human gait recognition using structural vibration. Structural vibrations, resulting from the rhythmic impacts of toes and heels on the ground, offer a unique, privacy-preserving gait recognition modality. We curated the largest dataset consisting of structural vibration signals from 100 subjects. Existing datasets in this domain are limited in scope, typically involving around ten participants and offering minimal exploration. To comprehensively investigate this modality, we recorded vibration signals across three distinct floor types (wooden, carpet, and cement) and at three different distances from a geophone sensor (1.5 m, 2.5 m, and 4.0 m), involving 40 and 30 participants, respectively. The dataset also includes video recordings of 15 individuals in an outdoor setting. Moreover, we recorded structural vibration signals of 15 people walking at three different speeds. Alongside the vibration data, we provide physiological details such as participant age, gender, height, and weight. The dataset contains over 96 hours of raw structural vibration data, along with additional interim and processed data. This dataset aims to address long-standing challenges in non-intrusive and privacy-preserving gait recognition, with potential applications in clinical analysis, elderly care and rehabilitation engineering.","PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":"12 1","pages":"1617"},"PeriodicalIF":6.9,"publicationDate":"2025-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12501299/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145239332","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Multidisciplinary journals","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}