{"title":"Haplotype-resolved genome assembly of a cultivated variety of Zizania latifolia (Zhejiao7).","authors":"Jun-Cheng Wu, Fei Xu, Wan-Long Jiang, Hui-Shan Shao, Jing-Tian Tang, Ya-Fen Zhang, Zi-Hong Ye","doi":"10.1038/s41597-026-07389-8","DOIUrl":"https://doi.org/10.1038/s41597-026-07389-8","url":null,"abstract":"<p><p>Long-term reliance on vegetative propagation, combined with continuous infections by Ustilago esculenta, has resulted in substantial genetic alteration in cultivated varieties of Zizania latifolia. In this study, we focus on the 'Zhejiao7' variety, which is characterized by early maturation in both the summer and autumn seasons. We present the first haplotype-resolved genome assembly of the Jiaobai variety 'Zhejiao7', with the majority of chromosomes sequenced to the telomere-to-telomere (T2T) level. Using a combination of PacBio HiFi, Nanopore, Illumina, and Hi-C technologies, we assembled two haplotypes (Hap1: 610.06 Mb, Hap2: 585.52 Mb), each with 17 chromosomes. Functional annotation identified ~78% of the protein-coding genes. This high-quality assembly provides insights into domestication, agronomic traits, and stem expansion, facilitating research on Jiaobai's bioactive compounds and medicinal properties and supporting advancements in agriculture and biomedicine.</p>","PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":" ","pages":""},"PeriodicalIF":6.9,"publicationDate":"2026-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147857041","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"综合性期刊","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scientific Data | Pub Date: 2026-05-08 | DOI: 10.1038/s41597-026-07419-5
Arturo Villarroya-Carpio, Víctor Cazcarra-Bes, Alejandro Mestre-Quereda, Juan M Lopez-Sanchez
{"title":"A SAR and optical remote sensing dataset in Seville for scientific research in agriculture.","authors":"Arturo Villarroya-Carpio, Víctor Cazcarra-Bes, Alejandro Mestre-Quereda, Juan M Lopez-Sanchez","doi":"10.1038/s41597-026-07419-5","DOIUrl":"https://doi.org/10.1038/s41597-026-07419-5","url":null,"abstract":"<p><p>Radar remote sensing data is underused for agricultural applications due to the complexity of its pre-processing and to the non-obvious physical interpretation of the derived features. To address these challenges, this work presents the SAR and Optical Dataset for Agriculture in Seville (SODAS), which integrates time series of radar images (Sentinel-1), optical images (Sentinel-2), precipitation records, and crop-type maps. The georeferenced images cover an agricultural area in Seville, Spain, from 2017 to 2021. The SAR images are provided in the form of dual-polarimetric covariance matrices, which include the backscattering coefficient and the correlation between channels, and repeat-pass interferometric products (coherence and phase) at VV and VH polarimetric channels. The optical images correspond to reflectivity at red, green, blue, and near infra-red bands, as well as NDVI products. This dataset has many potential uses, such as development of algorithms for crop-type mapping, retrieval of biophysical parameters, crop monitoring, and data fusion. 
Additionally, a Jupyter notebook to load the dataset, create and compare time series, and visualise images is included.</p>","PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":" ","pages":""},"PeriodicalIF":6.9,"publicationDate":"2026-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147856942","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"综合性期刊","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scientific Data | Pub Date: 2026-05-08 | DOI: 10.1038/s41597-026-07313-0
Valeria Velásquez-Zapata, Schuyler D Smith, Gregory Fuerst, Roger P Wise
{"title":"Epistatic chromatin remodeling during barley response to powdery mildew by ATAC-Seq.","authors":"Valeria Velásquez-Zapata, Schuyler D Smith, Gregory Fuerst, Roger P Wise","doi":"10.1038/s41597-026-07313-0","DOIUrl":"https://doi.org/10.1038/s41597-026-07313-0","url":null,"abstract":"<p><p>Understanding the molecular basis of plant-pathogen interactions is critical for advancing crop protection strategies. Powdery mildew, caused by the obligate fungal pathogen Blumeria hordei (Bh), is a threat to barley production worldwide. We exploited time-course ATAC-Seq of barley and derived immune mutants infected with Bh to infer chromatin accessibility influenced by the genetic interactions of mildew locus a6 (Mla6), encoding a nucleotide binding leucine-rich repeat (NLR) immune receptor, and Blufensin1 (Bln1), a basal defense regulator. Sampling at 0, 16, 20, and 32 hours after inoculation captured key pathogen developmental stages representing fungal penetration and haustorial development, respectively. Validation of the dataset was accomplished by calculating general ATAC-Seq peak metrics and comparison with paired RNA-Seq data. ATAC-Seq and RNA-Seq results were correlated, highlighting in particular chromatin-mediated epistatic interactions, demonstrating that the dataset could provide insight into regulatory chromatin architecture. 
These results offer a valuable dataset for dissecting transcriptional networks involved in barley immune responses.</p>","PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":" ","pages":""},"PeriodicalIF":6.9,"publicationDate":"2026-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147856979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"综合性期刊","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scientific Data | Pub Date: 2026-05-07 | DOI: 10.1038/s41597-026-07385-y
Yanfu Que, Weitao Li, Junjie Wu, Xingkun Hu, Ezhou Wang, Weiwei Dong, Bin Zhu
{"title":"A chromosome-level whole genome of Red mahseer (Tor sinensis).","authors":"Yanfu Que, Weitao Li, Junjie Wu, Xingkun Hu, Ezhou Wang, Weiwei Dong, Bin Zhu","doi":"10.1038/s41597-026-07385-y","DOIUrl":"https://doi.org/10.1038/s41597-026-07385-y","url":null,"abstract":"<p><p>Genomic resources for Tor species are currently limited, and their molecular systematics remain poorly resolved. In this study, we report the first chromosome-level genome assembly of Tor sinensis, generated using PacBio long-read sequencing and Hi-C scaffolding. The final genome assembly spans approximately 1.80 Gb and consists of 228 contigs, with a contig N50 of 31.67 Mb. A total of 97 contigs were successfully anchored and ordered into 50 chromosomes, covering 96.57% of the genome. Chromosome lengths range from 23.67 Mb to 62.88 Mb. Repeat analysis identified 807.40 Mb of repetitive sequences, representing 44.84% of the genome. In addition, 3,794 rRNA genes and 11,264 tRNA genes were annotated, along with 379 snRNAs and 35 snoRNAs. The average length of tRNA genes was 72 bp. A total of 62,000 protein-coding genes were predicted, yielding 263,453 protein sequences. This high-quality reference genome provides a valuable foundation for future studies, facilitates comparative genomic analyses across the Tor genus, and supports biodiversity conservation efforts in the Lancang-Mekong River basin.</p>","PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":" ","pages":""},"PeriodicalIF":6.9,"publicationDate":"2026-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147842343","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"综合性期刊","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scientific Data | Pub Date: 2026-05-07 | DOI: 10.1038/s41597-026-07391-0
Caleb Beckwith, Nikhil Gupta
{"title":"Rheological Measurement Dataset of Resin and Composite Mixtures for Digital Light Processing based Additive Manufacturing.","authors":"Caleb Beckwith, Nikhil Gupta","doi":"10.1038/s41597-026-07391-0","DOIUrl":"https://doi.org/10.1038/s41597-026-07391-0","url":null,"abstract":"<p><p>This work features a dataset intended to aid in the development of processing parameters for particulate composite materials for additive manufacturing (AM) using the digital light processing (DLP) method. Hollow glass microspheres (HGMs) are used as particulate fillers to manufacture syntactic foam composites. Standard thermosetting resins have a recommended set of 3D printing parameters. However, when particles are mixed into the resin to 3D print composites, the mixture properties change, and the processing parameters need to be adjusted according to the particle volume fraction, size and other parameters. This dataset provides rheological measurements on the neat resin and resin-HGM composite mixtures intended for the vat photopolymerization (VP) based DLP method. Three HGM density grades (0.13, 0.23, and 0.31 g/cm³ true particle densities) were examined at 10, 20, and 40 vol.% to capture the effects of particle density and volume fraction on resin flow behavior. Viscosity was measured using an Anton Paar MCR702 analyzer at a range of shear rates (1-50 s⁻¹) and temperatures (35-125 °C), generating 50 independent data files (5 trials for each of the 10 compositions). The dataset records the coupled influence of temperature and shear rate for each composition, providing a foundation for modeling viscosity-shear rate-temperature-composition relationships. Interpolations of various parameters at a benchmark viscosity of 0.40 Pa·s enable extraction of processing windows relevant to VP of the composite mixtures and significantly reduce the trial and error involved in processing parameter optimization for composite mixtures. 
The dataset also includes µCT image stacks for representative specimens, supporting qualitative assessment of particle dispersion and internal structure.</p>","PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":" ","pages":""},"PeriodicalIF":6.9,"publicationDate":"2026-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147842389","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"综合性期刊","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scientific Data | Pub Date: 2026-05-07 | DOI: 10.1038/s41597-026-07226-y
Aude Rocatcher, Xavier Dieu, Valérie Desquiret-Dumas, Benjamin Billiet, Nicolas Chassaing, Hélène Dollfus, Isabelle Meunier, Marie-Bénédicte Rougier, Yann Polfrit, Guy Lenaers, Dan Milea, Philippe Gohier, Delphine Mirebeau-Prunier, Patrizia Amati-Bonneau, Pascal Reynier, Marc Ferré
{"title":"A dataset of patients with isolated and syndromic optic neuropathies linked to RTN4IP1 genetic variants.","authors":"Aude Rocatcher, Xavier Dieu, Valérie Desquiret-Dumas, Benjamin Billiet, Nicolas Chassaing, Hélène Dollfus, Isabelle Meunier, Marie-Bénédicte Rougier, Yann Polfrit, Guy Lenaers, Dan Milea, Philippe Gohier, Delphine Mirebeau-Prunier, Patrizia Amati-Bonneau, Pascal Reynier, Marc Ferré","doi":"10.1038/s41597-026-07226-y","DOIUrl":"https://doi.org/10.1038/s41597-026-07226-y","url":null,"abstract":"<p><p>Biallelic pathogenic variants of the Reticulon 4 interacting protein 1 (RTN4IP1) gene are responsible for optic atrophy, either isolated or associated with ataxia, mental retardation, and seizures. They are identified as a cause of hereditary optic neuropathy in 7% of patients diagnosed before the age of 20. We have built a dataset for this gene by collating all the clinical cases available in the literature, and unpublished patients diagnosed at our centre, using standard nomenclature to describe both the molecular and phenotypic features. We performed a comprehensive data analysis, based on computational reasoning, to provide an overall picture of the dataset and validate its relevance. 
This new dataset provides an updated genetic map of the reported pathogenic variants, an ontological annotation of phenotypic abnormalities in a grid format showing clinical heterogeneity, and a full interoperability with the databases of other genetic forms of optic neuropathies.</p>","PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":" ","pages":""},"PeriodicalIF":6.9,"publicationDate":"2026-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147842294","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"综合性期刊","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scientific Data | Pub Date: 2026-05-07 | DOI: 10.1038/s41597-026-07362-5
Rafael Aguilar-Ortega, Jorge Zafra-Palma, Rafael Muñoz-Salinas, Manuel J Marin-Jimenez
{"title":"UCOPhyRehab++: A multi-modal and multi-view dataset for human rehabilitation analysis.","authors":"Rafael Aguilar-Ortega, Jorge Zafra-Palma, Rafael Muñoz-Salinas, Manuel J Marin-Jimenez","doi":"10.1038/s41597-026-07362-5","DOIUrl":"https://doi.org/10.1038/s41597-026-07362-5","url":null,"abstract":"<p><p>The rehabilitation of patients with musculoskeletal disorders is usually associated with the performance of prescribed exercises at home. Performing these exercises without medical supervision may lead to incorrect execution, resulting in secondary injuries or slower recovery rates for these patients. For this reason, research into assisted rehabilitation methodologies for patients of this type has been one of the most studied fields in recent years. The use of computer vision techniques has rapidly increased in recent literature. However, there is a significant lack of available data for training machine learning models or for testing these systems. In this paper, we extend our previous work, UCOPhyRehab (University of COrdoba Physical Rehabilitation), by adding multiple modalities to the original data, incorporating demographic metadata, and including performance scores assigned by an expert physical therapist. 
Our validation experiments demonstrate that this new release complements the original UCOPhyRehab data and enables new research directions, such as multi-modal fusion (e.g., combining silhouettes, optical flow, and semantic segmentation) and multi-view fusion across the five camera viewpoints, to improve the robustness and accuracy of rehabilitation-assistance methods.</p>","PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":" ","pages":""},"PeriodicalIF":6.9,"publicationDate":"2026-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147842534","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"综合性期刊","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scientific Data | Pub Date: 2026-05-07 | DOI: 10.1038/s41597-026-06974-1
Cory O Brant, Sofia Silvis, David H Bennion, Chris Castiglione, Kieran Tyrrell, Karissa Hannahs, Michael Slattery, David Bunnell, Andrew Honsey, Ralph Tingley, Katelyn King, Karen M Alofs, Amanda Ackiss, Charles R Bronte, Jason Smith, Matthew Herbert
{"title":"Two hundred years of historical spawning and nursery data for coregonine fishes in the Laurentian Great Lakes.","authors":"Cory O Brant, Sofia Silvis, David H Bennion, Chris Castiglione, Kieran Tyrrell, Karissa Hannahs, Michael Slattery, David Bunnell, Andrew Honsey, Ralph Tingley, Katelyn King, Karen M Alofs, Amanda Ackiss, Charles R Bronte, Jason Smith, Matthew Herbert","doi":"10.1038/s41597-026-06974-1","DOIUrl":"https://doi.org/10.1038/s41597-026-06974-1","url":null,"abstract":"<p><p>Historical data can provide critical ecological information for species across the globe, many of which are facing unprecedented rates of ecosystem change. Yet, historical information related to freshwater species, especially fishes, remains scattered, often in original formats, and underutilized for informing conservation and restoration activities. Here, we present a Data Descriptor called Coregonine Spawning History (CORHIST), a database designed to house diverse data related to past spawning and nursery areas for fishes in the family Salmonidae, subfamily Coregoninae (ciscoes and whitefishes), in the Laurentian Great Lakes and their tributaries. Data for 11 species of coregonines historically occurring in the Great Lakes are included in CORHIST. Over 3,400 occurrence records at the coordinate scale have been entered, over 2,200 of which are for Cisco (Coregonus artedi) and Lake Whitefish (C. clupeaformis)-two focal species for which there is either multinational conservation interest or restoration efforts underway in the Laurentian Great Lakes. 
CORHIST is already proving useful for several studies developing habitat suitability models and delineating spatial units for conservation or restoration planning.</p>","PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":"13 1","pages":""},"PeriodicalIF":6.9,"publicationDate":"2026-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147842539","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"综合性期刊","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Chromosome-level haplotype-resolved genome assembly of kiwifruit (Actinidia eriantha).","authors":"Quan Jiang, Sheng Zhang, Dongmei Tang, Weiming Zhong, Qianming Zheng, Qing Liu, Yong Qi, Binbin Shi, Xiaohong Yao, Jia Zhou","doi":"10.1038/s41597-026-07414-w","DOIUrl":"https://doi.org/10.1038/s41597-026-07414-w","url":null,"abstract":"<p><p>Kiwifruit (Actinidia spp.) is an economically important plant that has undergone rapid development in the 20th century. Among these species, A. eriantha holds considerable agronomic and evolutionary significance. Here, we present a haplotype-resolved, chromosome-level genome assembly for a wild A. eriantha individual. The assembled genome sizes were 663.76 Mb for HAP1 and 633.05 Mb for HAP2, with contig N50 values of 21.82 Mb and 21.39 Mb, respectively. A total of 95.9% of HAP1 (636.55 Mb) and 99.25% of HAP2 (628.28 Mb) sequences were successfully anchored to 29 pseudochromosomes, resulting in scaffold N50 values of 21.89 Mb and 21.50 Mb, respectively. We annotated a total of 39,983 and 40,099 high-confidence protein-coding genes for the two haplotypes, respectively. Genome evaluations showed high completeness and accuracy, with BUSCO completeness of 99.5% for both haplotypes and QV values of 43.11 for HAP1 and 43.42 for HAP2. 
This high-quality genome provides a valuable reference for comparative and population genomics studies, and will facilitate the identification of genes controlling key agronomic traits, thereby accelerating molecular breeding in kiwifruit.</p>","PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":" ","pages":""},"PeriodicalIF":6.9,"publicationDate":"2026-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147842334","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"综合性期刊","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scientific Data | Pub Date: 2026-05-07 | DOI: 10.1038/s41597-026-07357-2
Thalita Cirino, Giulia Caron, Giuseppe Ermondi, Larysa Charochkina, Igor V Tetko
{"title":"SangsterLogP - the largest publicly available dataset of logP values.","authors":"Thalita Cirino, Giulia Caron, Giuseppe Ermondi, Larysa Charochkina, Igor V Tetko","doi":"10.1038/s41597-026-07357-2","DOIUrl":"https://doi.org/10.1038/s41597-026-07357-2","url":null,"abstract":"<p><p>We present SangsterLogP, the largest publicly available curated dataset of experimental logP values, comprising more than 23k unique molecules, with experimental logP values ranging from -3.8 to 11.7 (about 15.9 log units). The dataset originated from Dr. James Sangster's comprehensive literature review of over 3k sources. We implemented a systematic curation workflow including a) logD-to-logP adjustment for ionised compounds and b) consensus-based residual analysis for outliers and duplicates removal. External validation using retrospective and prospective test sets demonstrated robust predictive performance (RMSE of 0.34 and 0.47 log units, respectively). SangsterLogP also substantially expands coverage of chemical space compared to the widely used legacy PHYSPROP database, including compounds in the beyond-Rule-of-5 domain. The fully annotated dataset, including experimental conditions and sources, is freely accessible via the Zenodo repository and on the Online Chemical database and Modelling Environment website.</p>","PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":" ","pages":""},"PeriodicalIF":6.9,"publicationDate":"2026-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147842529","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"综合性期刊","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}