GigaScience: Latest Articles

An analysis of performance bottlenecks in MRI preprocessing.
IF 11.8 Zone 2 Biology
GigaScience Pub Date : 2025-01-06 DOI: 10.1093/gigascience/giae098
Mathieu Dugré, Yohan Chatelain, Tristan Glatard
Abstract: Magnetic resonance imaging (MRI) preprocessing is a critical step for neuroimaging analysis. However, the computational cost of MRI preprocessing pipelines is a major bottleneck for large cohort studies and some clinical applications. While high-performance computing and, more recently, deep learning have been adopted to accelerate the computations, these techniques require costly hardware and are not accessible to all researchers. Therefore, it is important to understand the performance bottlenecks of MRI preprocessing pipelines to improve their performance. Using the Intel VTune profiler, we characterized the bottlenecks of several commonly used MRI preprocessing pipelines from the Advanced Normalization Tools (ANTs), FMRIB Software Library, and FreeSurfer toolboxes. We found few functions contributed to most of the CPU time and that linear interpolation was the largest contributor. Data access was also a substantial bottleneck. We identified a bug in the Insight Segmentation and Registration Toolkit library that impacts the performance of the ANTs pipeline in single precision and a potential issue with the OpenMP scaling in FreeSurfer recon-all. Our results provide a reference for future efforts to optimize MRI preprocessing pipelines.
Volume 14. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11899568/pdf/
Citations: 0
Similar, but not the same: multiomics comparison of human valve interstitial cells and osteoblast osteogenic differentiation expanded with an estimation of data-dependent and data-independent PASEF proteomics.
IF 11.8 Zone 2 Biology
GigaScience Pub Date : 2025-01-06 DOI: 10.1093/gigascience/giae110
Arseniy Lobov, Polina Kuchur, Nadezhda Boyarskaya, Daria Perepletchikova, Ivan Taraskin, Andrei Ivashkin, Daria Kostina, Irina Khvorova, Vladimir Uspensky, Egor Repkin, Evgeny Denisov, Tatiana Gerashchenko, Rashid Tikhilov, Svetlana Bozhkova, Vitaly Karelkin, Chunli Wang, Kang Xu, Anna Malashicheva
Abstract: Osteogenic differentiation is crucial in normal bone formation and pathological calcification, such as calcific aortic valve disease (CAVD). Understanding the proteomic and transcriptomic landscapes underlying this differentiation can unveil potential therapeutic targets for CAVD. In this study, we employed RNA sequencing transcriptomics and proteomics on a timsTOF Pro platform to explore the multiomics profiles of valve interstitial cells (VICs) and osteoblasts during osteogenic differentiation. For proteomics, we utilized 3 data acquisition/analysis techniques: data-dependent acquisition (DDA)-parallel accumulation serial fragmentation (PASEF) and data-independent acquisition (DIA)-PASEF with a classic library-based (DIA) and machine learning-based library-free search (DIA-ML). Using RNA sequencing data as a biological reference, we compared these 3 analytical techniques in the context of actual biological experiments. We use this comprehensive dataset to reveal distinct proteomic and transcriptomic profiles between VICs and osteoblasts, highlighting specific biological processes in their osteogenic differentiation pathways. The study identified potential therapeutic targets specific for VICs osteogenic differentiation in CAVD, including the MAOA and ERK1/2 pathway. From a technical perspective, we found that DIA-based methods demonstrate even higher superiority against DDA for more sophisticated human primary cell cultures than it was shown before on HeLa samples. While the classic library-based DIA approach has proved to be a gold standard for shotgun proteomics research, the DIA-ML offers significant advantages with a relatively minor compromise in data reliability, making it the method of choice for routine proteomics.
Volume 14. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11724719/pdf/
Citations: 0
How to select predictive models for decision-making or causal inference.
IF 11.8 Zone 2 Biology
GigaScience Pub Date : 2025-01-06 DOI: 10.1093/gigascience/giaf016
Matthieu Doutreligne, Gaël Varoquaux
Abstract:
Background: We investigate which procedure selects the most trustworthy predictive model to explain the effect of an intervention and support decision-making.
Methods: We study a large variety of model selection procedures in practical settings: finite samples settings and without a theoretical assumption of well-specified models. Beyond standard cross-validation or internal validation procedures, we also study elaborate causal risks. These build proxies of the causal error using "nuisance" reweighting to compute it on the observed data. We evaluate whether empirically estimated nuisances, which are necessarily noisy, add noise to model selection and compare different metrics for causal model selection in an extensive empirical study based on a simulation and 3 health care datasets based on real covariates.
Results: Among all metrics, the mean squared error, classically used to evaluate predictive models, is worse. Reweighting it with a propensity score does not bring much improvement in most cases. On average, the $R\text{-risk}$, which uses as nuisances a model of mean outcome and propensity scores, leads to the best performances. Nuisance corrections are best estimated with flexible estimators such as a super learner.
Conclusions: When predictive models are used to explain the effect of an intervention, they must be evaluated with different procedures than standard predictive settings, using the $R\text{-risk}$ from causal inference.
Volume 14. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11927402/pdf/
Citations: 0
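The R-risk named in the abstract above combines a mean-outcome model and a propensity score. The listing itself gives no formula, so the sketch below assumes the usual R-learner form, the mean of ((y - m(x)) - (a - e(x)) * tau(x))^2, and is not taken from the paper. The function name `r_risk` and all argument names are hypothetical, and the nuisances `m_hat` and `e_hat` are assumed to have been estimated beforehand.

```python
import numpy as np


def r_risk(y, a, m_hat, e_hat, tau_hat):
    """Empirical R-risk of a candidate treatment-effect model (illustrative sketch).

    y       : observed outcomes
    a       : binary treatment indicator (0 or 1)
    m_hat   : estimated mean outcome E[Y | X] (nuisance, assumed given)
    e_hat   : estimated propensity score P(A = 1 | X) (nuisance, assumed given)
    tau_hat : effect predicted by the candidate model
    """
    y, a, m_hat, e_hat, tau_hat = (
        np.asarray(v, dtype=float) for v in (y, a, m_hat, e_hat, tau_hat)
    )
    # Residual outcome minus residual treatment scaled by the candidate effect.
    residual = (y - m_hat) - (a - e_hat) * tau_hat
    return float(np.mean(residual ** 2))


# Toy data: baseline outcome 1, true effect 2, treatment assigned with probability 0.5.
a = np.array([0, 1, 0, 1])
y = 1.0 + 2.0 * a
m_hat = 1.0 + 0.5 * 2.0  # E[Y | X] = baseline + propensity * effect
e_hat = 0.5

risk_true_effect = r_risk(y, a, m_hat, e_hat, tau_hat=2.0)
risk_no_effect = r_risk(y, a, m_hat, e_hat, tau_hat=0.0)
```

Under these assumptions, candidate models would be ranked by this value, lower being better. On the toy data the candidate matching the true effect scores lower than the one predicting no effect.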
The haplotype-resolved T2T genome for Bauhinia × blakeana sheds light on the genetic basis of flower heterosis.
IF 11.8 Zone 2 Biology
GigaScience Pub Date : 2025-01-06 DOI: 10.1093/gigascience/giaf044
Weixue Mu, Joshua Casey Darian, Wing-Kin Sung, Xing Guo, Tuo Yang, Mandy Wai Man Tang, Ziqiang Chen, Steve Kwan Hok Tong, Irene Wing Shan Chik, Robert L Davidson, Scott C Edmunds, Tong Wei, Stephen Kwok-Wing Tsui
Abstract:
Background: The Hong Kong orchid tree Bauhinia × blakeana Dunn has long been proposed to be a sterile interspecific hybrid exhibiting flower heterosis when compared to its likely parental species, Bauhinia purpurea L. and Bauhinia variegata L. Here, we report comparative genomic and transcriptomic analyses of the 3 Bauhinia species.
Findings: We generated chromosome-level assemblies for the parental species and applied a trio-binning approach to construct a haplotype-resolved telomere-to-telomere (T2T) genome for B. blakeana. Comparative chloroplast genome analysis confirmed B. purpurea as the maternal parent. Transcriptome profiling of flower tissues highlighted a closer resemblance of B. blakeana to its maternal parent. Differential gene expression analyses revealed distinct expression patterns among the 3 species, particularly in biosynthetic and metabolic processes. To investigate the genetic basis of flower heterosis observed in B. blakeana, we focused on gene expression patterns within pigment biosynthesis-related pathways. High-parent dominance and overdominance expression patterns were observed, particularly in genes associated with carotenoid biosynthesis. Additionally, allele-specific expression analysis revealed a balanced contribution of maternal and paternal alleles in shaping the gene expression patterns in B. blakeana.
Conclusions: Our study offers valuable insights into the genome architecture of hybrid B. blakeana, establishing a comprehensive genomic and transcriptomic resource for future functional genetics research within the Bauhinia genus. It also serves as a model for exploring the characteristics of hybrid species using T2T haplotype-resolved genomes, providing a novel approach to understanding genetic interactions and evolutionary mechanisms in complex genomes with high heterozygosity.
Volume 14. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12012898/pdf/
Citations: 0
Overture: an open-source genomics data platform.
IF 11.8 Zone 2 Biology
GigaScience Pub Date : 2025-01-06 DOI: 10.1093/gigascience/giaf038
Mitchell Shiell, Rosi Bajari, Dusan Andric, Jon Eubank, Brandon F Chan, Anders J Richardsson, Azher Ali, Bashar Allabadi, Yelizar Alturmessov, Jared Baker, Ann Catton, Kim Cullion, Daniel DeMaria, Patrick Dos Santos, Henrich Feher, Francois Gerthoffert, Minh Ha, Robin A Haw, Atul Kachru, Alexandru Lepsa, Alexis Li, Rakesh N Mistry, Hardeep K Nahal-Bose, Aleksandra Pejovic, Samantha Rich, Leonardo Rivera, Ciarán Schütte, Edmund Su, Robert Tisma, Jaser Uddin, Chang Wang, Alex N Wilmer, Linda Xiang, Junjun Zhang, Lincoln D Stein, Vincent Ferretti, Mélanie Courtot, Christina K Yung
Abstract:
Background: Next-generation sequencing has created many new technological challenges in organizing and distributing genomics datasets, which now can routinely reach petabyte scales. Coupled with data-hungry artificial intelligence and machine learning applications, findable, accessible, interoperable, and reusable genomics datasets have never been more valuable. While major archives like the Genomics Data Commons, Sequence Reads Archive, and European Genome-Phenome Archive have improved researchers' ability to share and reuse data, and general-purpose repositories such as Zenodo and Figshare provide valuable platforms for research data publication, the diversity of genomics research precludes any one-size-fits-all approach. In many cases, bespoke solutions are required, and despite funding agencies and journals increasingly mandating reusable data practices, researchers still lack the technical support needed to meet the multifaceted challenges of data reuse.
Findings: Overture bridges this gap by providing open-source software for building and deploying customizable genomics data platforms. Its architecture consists of modular microservices, each of which is generalized with narrow responsibilities that together combine to create complete data management systems. These systems enable researchers to organize, share, and explore their genomics data at any scale. Through Overture, researchers can connect their data to both humans and machines, fostering reproducibility and enabling new insights through controlled data sharing and reuse.
Conclusions: By making these tools freely available, we can accelerate the development of reliable genomic data management across the research community quickly, flexibly, and at multiple scales. Overture is an open-source project licensed under AGPLv3.0 with all source code publicly available from https://github.com/overture-stack and documentation on development, deployment, and usage available from www.overture.bio.
Volume 14. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12020472/pdf/
Citations: 0
Telomere-to-telomere genome assembly of Electrophorus electricus provides insights into the evolution of electric eels.
IF 11.8 Zone 2 Biology
GigaScience Pub Date : 2025-01-06 DOI: 10.1093/gigascience/giaf024
Zan Qi, Qun Liu, Haorong Li, Yaolei Zhang, Ziwei Yu, Wenkai Luo, Kun Wang, Yuxin Zhang, Shoupeng Pan, Chao Wang, Hui Jiang, Qiang Qiu, Wen Wang, Guangyi Fan, Yongxin Li
Abstract:
Background: Electric eels evolved remarkable electric organs that enable them to instantaneously discharge hundreds of volts for predation, defense, and communication. However, the absence of a high-quality reference genome has extremely constrained the studies of electric eels in various aspects.
Results: Using high-depth, multiplatform sequencing data, we successfully assembled the first telomere-to-telomere high-quality reference genome of Electrophorus electricus, which has a genome size of 833.43 Mb and comprises 26 chromosomes. Multiple evaluations, including N50 statistics (30.38 Mb), BUSCO scores (97.30%), and mapping ratio of short-insert sequencing data (99.91%), demonstrate the high contiguity and completeness of the electric eel genome assembly we obtained. Genome annotation predicted 396.63 Mb repetitive sequences and 20,992 protein-coding genes. Furthermore, evolutionary analyses indicate that Gymnotiformes, which the electric eel belongs to, has a closer relationship with Characiformes than Siluriformes and diverged from Characiformes 95.00 million years ago. Pairwise sequentially Markovian coalescent analysis found a sharply decreased trend of the population size of E. electricus over the past few hundred thousand years. Furthermore, many regulatory factors related to neurotransmitters and classical signaling pathways during embryonic development were significantly expanded, potentially contributing to the generation of high-voltage electricity.
Conclusions: This study not only provided the first high-quality telomere-to-telomere reference genome of E. electricus but also greatly enhanced our understanding of electric eels.
Volume 14. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11959694/pdf/
Citations: 0
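The electric eel abstract above reports an N50 of 30.38 Mb as evidence of contiguity. As a reminder of what that statistic measures, the sketch below computes N50 under its standard definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. The function name `n50` and the toy lengths are hypothetical and are not data from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths (illustrative sketch)."""
    lengths = list(lengths)
    if not lengths:
        raise ValueError("n50 requires at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half of the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy assembly of five sequences totalling 20 units.
toy_n50 = n50([2, 3, 4, 5, 6])
```

A larger N50 for the same total size means that fewer, longer sequences make up most of the assembly.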
New implementation of data standards for AI in oncology: Experience from the EuCanImage project.
IF 11.8 Zone 2 Biology
GigaScience Pub Date : 2025-01-06 DOI: 10.1093/gigascience/giae101
Teresa García-Lezana, Maciej Bobowicz, Santiago Frid, Michael Rutherford, Mikel Recuero, Katrine Riklund, Aldar Cabrelles, Marlena Rygusik, Lauren Fromont, Roberto Francischello, Emanuele Neri, Salvador Capella, Arcadi Navarro, Fred Prior, Jonathan Bona, Pilar Nicolas, Martijn P A Starmans, Karim Lekadir, Jordi Rambla
Abstract:
Background: An unprecedented amount of personal health data, with the potential to revolutionize precision medicine, is generated at health care institutions worldwide. The exploitation of such data using artificial intelligence (AI) relies on the ability to combine heterogeneous, multicentric, multimodal, and multiparametric data, as well as thoughtful representation of knowledge and data availability. Despite these possibilities, significant methodological challenges and ethicolegal constraints still impede the real-world implementation of data models.
Technical details: The EuCanImage is an international consortium aimed at developing AI algorithms for precision medicine in oncology and enabling secondary use of the data based on necessary ethical approvals. The use of well-defined clinical data standards to allow interoperability was a central element within the initiative. The consortium is focused on 3 different cancer types and addresses 7 unmet clinical needs. We have conceived and implemented an innovative process to capture clinical data from hospitals, transform it into the newly developed EuCanImage data models, and then store the standardized data in permanent repositories. This new workflow combines recognized software (REDCap for data capture), data standards (FHIR for data structuring), and an existing repository (EGA for permanent data storage and sharing), with newly developed custom tools for data transformation and quality control purposes (ETL pipeline, QC scripts) to complement the gaps.
Conclusion: This article synthesizes our experience and procedures for health care data interoperability, standardization, and reproducibility.
Volume 14. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12071370/pdf/
Citations: 0
Spatial integration of multi-omics data from serial sections using the novel Multi-Omics Imaging Integration Toolset.
IF 11.8 Zone 2 Biology
GigaScience Pub Date : 2025-01-06 DOI: 10.1093/gigascience/giaf035
Maximilian Wess, Maria K Andersen, Elise Midtbust, Juan Carlos Cabellos Guillem, Trond Viset, Øystein Størkersen, Sebastian Krossa, Morten Beck Rye, May-Britt Tessem
Abstract:
Background: Truly understanding the cancer biology of heterogeneous tumors in precision medicine requires capturing the complexities of multiple omics levels and the spatial heterogeneity of cancer tissue. Techniques like mass spectrometry imaging (MSI) and spatial transcriptomics (ST) achieve this by spatially detecting metabolites and RNA but are often applied to serial sections. To fully leverage the advantage of such multi-omics data, the individual measurements need to be integrated into 1 dataset.
Results: We present the Multi-Omics Imaging Integration Toolset (MIIT), a Python framework for integrating spatially resolved multi-omics data. A key component of MIIT's integration is the registration of serial sections for which we developed a nonrigid registration algorithm, GreedyFHist. We validated GreedyFHist on 244 images from fresh-frozen serial sections, achieving state-of-the-art performance. As a proof of concept, we used MIIT to integrate ST and MSI data from prostate tissue samples and assessed the correlation of a gene signature for citrate-spermine secretion derived from ST with metabolic measurements from MSI.
Conclusion: MIIT is a highly accurate, customizable, open-source framework for integrating spatial omics technologies performed on different serial sections.
Volume 14. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12077394/pdf/
Citations: 0
An ecosystem for producing and sharing metadata within the web of FAIR Data.
IF 11.8 Zone 2 Biology
GigaScience Pub Date : 2025-01-06 DOI: 10.1093/gigascience/giae111
Daniel Jacob, François Ehrenmann, Romain David, Joseph Tran, Cathleen Mirande-Ney, Philippe Chaumeil
Abstract:
Background: Descriptive metadata are vital for reporting, discovering, leveraging, and mobilizing research datasets. However, resolving metadata issues as part of a data management plan can be complex for data producers. To organize and document data, various descriptive metadata must be created. Furthermore, when sharing data, it is important to ensure metadata interoperability in line with FAIR (Findable, Accessible, Interoperable, Reusable) principles. Given the practical nature of these challenges, there is a need for management tools that can assist data managers effectively. Additionally, these tools should meet the needs of data producers and be user-friendly, requiring minimal training.
Results: We developed Maggot (Metadata Aggregation on Data Storage), a web-based tool to locally manage a data catalog using high-level metadata. The main goal was to facilitate easy data dissemination and deposition in data repositories. With Maggot, users can easily generate and attach high-level metadata to datasets, allowing for seamless sharing in a collaborative environment. This approach aligns with many data management plans as it effectively addresses challenges related to data organization, documentation, storage, and the sharing of metadata based on FAIR principles within and beyond the collaborative group. Furthermore, Maggot enables metadata crosswalks (i.e., generated metadata can be converted to the schema used by a specific data repository or be exported using a format suitable for data collection by third-party applications).
Conclusion: The primary purpose of Maggot is to streamline the collection of high-level metadata using carefully chosen schemas and standards. Additionally, it simplifies data accessibility via metadata, typically a requirement for publicly funded projects. As a result, Maggot can be utilized to promote effective local management with the goal of facilitating data sharing while adhering to the FAIR principles. Furthermore, it can contribute to the preparation of the future EOSC FAIR Web of Data within the European Open Science Cloud framework.
Volume 14. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11707607/pdf/
Citations: 0
A near telomere-to-telomere genome assembly of the Jinhua pig: enabling more accurate genetic research.
IF 11.8 Zone 2 Biology
GigaScience Pub Date : 2025-01-06 DOI: 10.1093/gigascience/giaf048
Caiyun Cao, Jian Miao, Qinqin Xie, Jiabao Sun, Hong Cheng, Zhenyang Zhang, Fen Wu, Shuang Liu, Xiaowei Ye, Huanfa Gong, Zhe Zhang, Qishan Wang, Yuchun Pan, Zhen Wang
Abstract:
Background: Pigs are crucial sources of meat and protein, valuable animal models, and potential donors for xenotransplantation. However, the existing reference genome for pigs is incomplete, with thousands of segments and centromeres and telomeres missing, which limits our understanding of the important traits in these genomic regions.
Findings: We present a near-complete genome assembly for the Jinhua pig (JH-T2T) and provide a set of diploid Jinhua reference genomes, constructed using PacBio HiFi, ONT long reads, and Hi-C reads. This assembly includes all 18 autosomes and the X and Y sex chromosomes, with only 6 gaps. It features annotations of 46.90% repetitive sequences, 33 telomeres, 17 centromeres, and 23,924 high-confident genes. Compared to the Sscrofa11.1, JH-T2T closes nearly all gaps, extends sequences by 177 Mb, predicts more intact telomeres and centromeres, and gains 799 more genes and loses 114 genes. Moreover, it enhances the mapping rate for both Western and Chinese local pigs, outperforming Sscrofa11.1 as a reference genome. Additionally, this comprehensive genome assembly will facilitate large-scale variant detection.
Conclusions: This study produced a near-gapless assembly of the pig genome and provides a set of haploid Jinhua reference genomes. Our findings represent a significant advance in pig genomics, providing a robust resource that enhances genetic research, breeding programs, and biomedical applications.
Volume 14. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12080228/pdf/
Citations: 0