GigaSciencePub Date : 2024-01-02DOI: 10.1093/gigascience/giae035
Jonas Bömer, Felix Esser, Elias Marks, Radu Alexandru Rosu, Sven Behnke, Lasse Klingbeil, Heiner Kuhlmann, Cyrill Stachniss, Anne-Katrin Mahlein, Stefan Paulus
{"title":"A 3D printed plant model for accurate and reliable 3D plant phenotyping.","authors":"Jonas Bömer, Felix Esser, Elias Marks, Radu Alexandru Rosu, Sven Behnke, Lasse Klingbeil, Heiner Kuhlmann, Cyrill Stachniss, Anne-Katrin Mahlein, Stefan Paulus","doi":"10.1093/gigascience/giae035","DOIUrl":"10.1093/gigascience/giae035","url":null,"abstract":"<p><strong>Background: </strong>This study addresses the importance of precise referencing in 3-dimensional (3D) plant phenotyping, which is crucial for advancing plant breeding and improving crop production. Traditionally, reference data in plant phenotyping rely on invasive methods. Recent advancements in 3D sensing technologies offer the possibility to collect parameters that cannot be referenced by manual measurements. This work focuses on evaluating a 3D printed sugar beet plant model as a referencing tool.</p><p><strong>Results: </strong>Fused deposition modeling has turned out to be a suitable 3D printing technique for creating reference objects in 3D plant phenotyping. Production deviations of the created reference model were in a low and acceptable range. We were able to achieve deviations ranging from -10 mm to +5 mm. In parallel, we demonstrated a high-dimensional stability of the reference model, reaching only ±4 mm deformation over the course of 1 year. Detailed print files, assembly descriptions, and benchmark parameters are provided, facilitating replication and benefiting the research community.</p><p><strong>Conclusion: </strong>Consumer-grade 3D printing was utilized to create a stable and reproducible 3D reference model of a sugar beet plant, addressing challenges in referencing morphological parameters in 3D plant phenotyping. The reference model is applicable in 3 demonstrated use cases: evaluating and comparing 3D sensor systems, investigating the potential accuracy of parameter extraction algorithms, and continuously monitoring these algorithms in practical experiments in greenhouse and field experiments. Using this approach, it is possible to monitor the extraction of a nonverifiable parameter and create reference data. The process serves as a model for developing reference models for other agricultural crops.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":null,"pages":null},"PeriodicalIF":11.8,"publicationDate":"2024-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11186670/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141426681","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GigaSciencePub Date : 2024-01-02DOI: 10.1093/gigascience/giae057
Thomas Roetzer-Pejrimovsky, Karl-Heinz Nenning, Barbara Kiesel, Johanna Klughammer, Martin Rajchl, Bernhard Baumann, Georg Langs, Adelheid Woehrer
{"title":"Deep learning links localized digital pathology phenotypes with transcriptional subtype and patient outcome in glioblastoma.","authors":"Thomas Roetzer-Pejrimovsky, Karl-Heinz Nenning, Barbara Kiesel, Johanna Klughammer, Martin Rajchl, Bernhard Baumann, Georg Langs, Adelheid Woehrer","doi":"10.1093/gigascience/giae057","DOIUrl":"10.1093/gigascience/giae057","url":null,"abstract":"<p><strong>Background: </strong>Deep learning has revolutionized medical image analysis in cancer pathology, where it had a substantial clinical impact by supporting the diagnosis and prognostic rating of cancer. Among the first available digital resources in the field of brain cancer is glioblastoma, the most common and fatal brain cancer. At the histologic level, glioblastoma is characterized by abundant phenotypic variability that is poorly linked with patient prognosis. At the transcriptional level, 3 molecular subtypes are distinguished with mesenchymal-subtype tumors being associated with increased immune cell infiltration and worse outcome.</p><p><strong>Results: </strong>We address genotype-phenotype correlations by applying an Xception convolutional neural network to a discovery set of 276 digital hematozylin and eosin (H&E) slides with molecular subtype annotation and an independent The Cancer Genome Atlas-based validation cohort of 178 cases. Using this approach, we achieve high accuracy in H&E-based mapping of molecular subtypes (area under the curve for classical, mesenchymal, and proneural = 0.84, 0.81, and 0.71, respectively; P < 0.001) and regions associated with worse outcome (univariable survival model P < 0.001, multivariable P = 0.01). The latter were characterized by higher tumor cell density (P < 0.001), phenotypic variability of tumor cells (P < 0.001), and decreased T-cell infiltration (P = 0.017).</p><p><strong>Conclusions: </strong>We modify a well-known convolutional neural network architecture for glioblastoma digital slides to accurately map the spatial distribution of transcriptional subtypes and regions predictive of worse outcome, thereby showcasing the relevance of artificial intelligence-enabled image mining in brain cancer.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":null,"pages":null},"PeriodicalIF":11.8,"publicationDate":"2024-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11345537/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142055362","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GigaSciencePub Date : 2024-01-02DOI: 10.1093/gigascience/giae066
Lauren M Sanders,Danielle K Lopez,Alan E Wood,Ryan T Scott,Samrawit G Gebre,Amanda M Saravia-Butler,Sylvain V Costes
{"title":"Celebrating 30 years of access to NASA Space Life Sciences data.","authors":"Lauren M Sanders,Danielle K Lopez,Alan E Wood,Ryan T Scott,Samrawit G Gebre,Amanda M Saravia-Butler,Sylvain V Costes","doi":"10.1093/gigascience/giae066","DOIUrl":"https://doi.org/10.1093/gigascience/giae066","url":null,"abstract":"NASA's space life sciences research programs established a decades-long legacy of enhancing our ability to safely explore the cosmos. From Skylab and the Space Shuttle Program to the NASA Balloon Program and the International Space Station National Lab, these programs generated priceless data that continue to paint a vibrant picture of life in space. These data are available to the scientific community in various data repositories, including the NASA Ames Life Sciences Data Archive (ALSDA) and NASA GeneLab. Here we recognize the 30-year anniversary of data access through ALSDA and the 10-year anniversary of GeneLab.","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":null,"pages":null},"PeriodicalIF":9.2,"publicationDate":"2024-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142259659","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GigaSciencePub Date : 2024-01-02DOI: 10.1093/gigascience/giad112
Nicolás Gaitán, Jorge Duitama
{"title":"A graph clustering algorithm for detection and genotyping of structural variants from long reads.","authors":"Nicolás Gaitán, Jorge Duitama","doi":"10.1093/gigascience/giad112","DOIUrl":"10.1093/gigascience/giad112","url":null,"abstract":"<p><strong>Background: </strong>Structural variants (SVs) are genomic polymorphisms defined by their length (>50 bp). The usual types of SVs are deletions, insertions, translocations, inversions, and copy number variants. SV detection and genotyping is fundamental given the role of SVs in phenomena such as phenotypic variation and evolutionary events. Thus, methods to identify SVs using long-read sequencing data have been recently developed.</p><p><strong>Findings: </strong>We present an accurate and efficient algorithm to predict germline SVs from long-read sequencing data. The algorithm starts collecting evidence (signatures) of SVs from read alignments. Then, signatures are clustered based on a Euclidean graph with coordinates calculated from lengths and genomic positions. Clustering is performed by the DBSCAN algorithm, which provides the advantage of delimiting clusters with high resolution. Clusters are transformed into SVs and a Bayesian model allows to precisely genotype SVs based on their supporting evidence. This algorithm is integrated into the single sample variants detector of the Next Generation Sequencing Experience Platform, which facilitates the integration with other functionalities for genomics analysis. We performed multiple benchmark experiments, including simulation and real data, representing different genome profiles, sequencing technologies (PacBio HiFi, ONT), and read depths.</p><p><strong>Conclusion: </strong>The results show that our approach outperformed state-of-the-art tools on germline SV calling and genotyping, especially at low depths, and in error-prone repetitive regions. We believe this work significantly contributes to the development of bioinformatic strategies to maximize the use of long-read sequencing technologies.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":null,"pages":null},"PeriodicalIF":11.8,"publicationDate":"2024-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10783151/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139416802","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GigaSciencePub Date : 2024-01-02DOI: 10.1093/gigascience/giad120
Justin Sonneck, Yu Zhou, Jianxu Chen
{"title":"MMV_Im2Im: an open-source microscopy machine vision toolbox for image-to-image transformation.","authors":"Justin Sonneck, Yu Zhou, Jianxu Chen","doi":"10.1093/gigascience/giad120","DOIUrl":"10.1093/gigascience/giad120","url":null,"abstract":"<p><p>Over the past decade, deep learning (DL) research in computer vision has been growing rapidly, with many advances in DL-based image analysis methods for biomedical problems. In this work, we introduce MMV_Im2Im, a new open-source Python package for image-to-image transformation in bioimaging applications. MMV_Im2Im is designed with a generic image-to-image transformation framework that can be used for a wide range of tasks, including semantic segmentation, instance segmentation, image restoration, image generation, and so on. Our implementation takes advantage of state-of-the-art machine learning engineering techniques, allowing researchers to focus on their research without worrying about engineering details. We demonstrate the effectiveness of MMV_Im2Im on more than 10 different biomedical problems, showcasing its general potentials and applicabilities. For computational biomedical researchers, MMV_Im2Im provides a starting point for developing new biomedical image analysis or machine learning algorithms, where they can either reuse the code in this package or fork and extend this package to facilitate the development of new methods. Experimental biomedical researchers can benefit from this work by gaining a comprehensive view of the image-to-image transformation concept through diversified examples and use cases. We hope this work can give the community inspirations on how DL-based image-to-image transformation can be integrated into the assay development process, enabling new biomedical studies that cannot be done only with traditional experimental assays. To help researchers get started, we have provided source code, documentation, and tutorials for MMV_Im2Im at [https://github.com/MMV-Lab/mmv_im2im] under MIT license.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":null,"pages":null},"PeriodicalIF":11.8,"publicationDate":"2024-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10821710/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139570421","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GigaSciencePub Date : 2024-01-02DOI: 10.1093/gigascience/giae003
Lei Cao, Chao Yang, Luni Hu, Wenjian Jiang, Yating Ren, Tianyi Xia, Mengyang Xu, Yishuai Ji, Mei Li, Xun Xu, Yuxiang Li, Yong Zhang, Shuangsang Fang
{"title":"Deciphering spatial domains from spatially resolved transcriptomics with Siamese graph autoencoder.","authors":"Lei Cao, Chao Yang, Luni Hu, Wenjian Jiang, Yating Ren, Tianyi Xia, Mengyang Xu, Yishuai Ji, Mei Li, Xun Xu, Yuxiang Li, Yong Zhang, Shuangsang Fang","doi":"10.1093/gigascience/giae003","DOIUrl":"10.1093/gigascience/giae003","url":null,"abstract":"<p><strong>Background: </strong>Cell clustering is a pivotal aspect of spatial transcriptomics (ST) data analysis as it forms the foundation for subsequent data mining. Recent advances in spatial domain identification have leveraged graph neural network (GNN) approaches in conjunction with spatial transcriptomics data. However, such GNN-based methods suffer from representation collapse, wherein all spatial spots are projected onto a singular representation. Consequently, the discriminative capability of individual representation feature is limited, leading to suboptimal clustering performance.</p><p><strong>Results: </strong>To address this issue, we proposed SGAE, a novel framework for spatial domain identification, incorporating the power of the Siamese graph autoencoder. SGAE mitigates the information correlation at both sample and feature levels, thus improving the representation discrimination. We adapted this framework to ST analysis by constructing a graph based on both gene expression and spatial information. SGAE outperformed alternative methods by its effectiveness in capturing spatial patterns and generating high-quality clusters, as evaluated by the Adjusted Rand Index, Normalized Mutual Information, and Fowlkes-Mallows Index. Moreover, the clustering results derived from SGAE can be further utilized in the identification of 3-dimensional (3D) Drosophila embryonic structure with enhanced accuracy.</p><p><strong>Conclusions: </strong>Benchmarking results from various ST datasets generated by diverse platforms demonstrate compelling evidence for the effectiveness of SGAE against other ST clustering methods. Specifically, SGAE exhibits potential for extension and application on multislice 3D reconstruction and tissue structure investigation. The source code and a collection of spatial clustering results can be accessed at https://github.com/STOmics/SGAE/.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":null,"pages":null},"PeriodicalIF":11.8,"publicationDate":"2024-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10939418/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139905504","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GigaSciencePub Date : 2024-01-02DOI: 10.1093/gigascience/giae029
Scott Ferguson, Ashley Jones, Kevin Murray, Rose L Andrew, Benjamin Schwessinger, Helen Bothwell, Justin Borevitz
{"title":"Exploring the role of polymorphic interspecies structural variants in reproductive isolation and adaptive divergence in Eucalyptus.","authors":"Scott Ferguson, Ashley Jones, Kevin Murray, Rose L Andrew, Benjamin Schwessinger, Helen Bothwell, Justin Borevitz","doi":"10.1093/gigascience/giae029","DOIUrl":"10.1093/gigascience/giae029","url":null,"abstract":"<p><p>Structural variations (SVs) play a significant role in speciation and adaptation in many species, yet few studies have explored the prevalence and impact of different categories of SVs. We conducted a comparative analysis of long-read assembled reference genomes of closely related Eucalyptus species to identify candidate SVs potentially influencing speciation and adaptation. Interspecies SVs can be either fixed differences or polymorphic in one or both species. To describe SV patterns, we employed short-read whole-genome sequencing on over 600 individuals of Eucalyptus melliodora and Eucalyptus sideroxylon, along with recent high-quality genome assemblies. We aligned reads and genotyped interspecies SVs predicted between species reference genomes. Our results revealed that 49,756 of 58,025 and 39,536 of 47,064 interspecies SVs could be typed with short reads in E. melliodora and E. sideroxylon, respectively. Focusing on inversions and translocations, symmetric SVs that are readily genotyped within both populations, 24 were found to be structural divergences, 2,623 structural polymorphisms, and 928 shared structural polymorphisms. We assessed the functional significance of fixed interspecies SVs by examining differences in estimated recombination rates and genetic differentiation between species, revealing a complex history of natural selection. Shared structural polymorphisms displayed enrichment of potentially adaptive genes. Understanding how different classes of genetic mutations contribute to genetic diversity and reproductive barriers is essential for understanding how organisms enhance fitness, adapt to changing environments, and diversify. Our findings reveal the prevalence of interspecies SVs and elucidate their role in genetic differentiation, adaptive evolution, and species divergence within and between populations.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":null,"pages":null},"PeriodicalIF":11.8,"publicationDate":"2024-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11170218/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141310458","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GigaSciencePub Date : 2024-01-02DOI: 10.1093/gigascience/giae026
Bryan Raubenolt, Daniel Blankenberg
{"title":"Generalized open-source workflows for atomistic molecular dynamics simulations of viral helicases.","authors":"Bryan Raubenolt, Daniel Blankenberg","doi":"10.1093/gigascience/giae026","DOIUrl":"10.1093/gigascience/giae026","url":null,"abstract":"<p><p>Viral helicases are promising targets for the development of antiviral therapies. Given their vital function of unwinding double-stranded nucleic acids, inhibiting them blocks the viral replication cycle. Previous studies have elucidated key structural details of these helicases, including the location of substrate binding sites, flexible domains, and the discovery of potential inhibitors. Here we present a series of new Galaxy tools and workflows for performing and analyzing molecular dynamics simulations of viral helicases. We first validate them by demonstrating recapitulation of data from previous simulations of Zika (NS3) and SARS-CoV-2 (NSP13) helicases in apo and complex with inhibitors. We further demonstrate the utility and generalizability of these Galaxy workflows by applying them to new cases, proving their usefulness as a widely accessible method for exploring antiviral activity.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":null,"pages":null},"PeriodicalIF":11.8,"publicationDate":"2024-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11170216/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141310459","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GigaSciencePub Date : 2024-01-02DOI: 10.1093/gigascience/giae039
Sokratis Kariotis, Pei Fang Tan, Haiping Lu, Christopher J Rhodes, Martin R Wilkins, Allan Lawrie, Dennis Wang
{"title":"Omada: robust clustering of transcriptomes through multiple testing.","authors":"Sokratis Kariotis, Pei Fang Tan, Haiping Lu, Christopher J Rhodes, Martin R Wilkins, Allan Lawrie, Dennis Wang","doi":"10.1093/gigascience/giae039","DOIUrl":"10.1093/gigascience/giae039","url":null,"abstract":"<p><strong>Background: </strong>Cohort studies increasingly collect biosamples for molecular profiling and are observing molecular heterogeneity. High-throughput RNA sequencing is providing large datasets capable of reflecting disease mechanisms. Clustering approaches have produced a number of tools to help dissect complex heterogeneous datasets, but selecting the appropriate method and parameters to perform exploratory clustering analysis of transcriptomic data requires deep understanding of machine learning and extensive computational experimentation. Tools that assist with such decisions without prior field knowledge are nonexistent. To address this, we have developed Omada, a suite of tools aiming to automate these processes and make robust unsupervised clustering of transcriptomic data more accessible through automated machine learning-based functions.</p><p><strong>Findings: </strong>The efficiency of each tool was tested with 7 datasets characterized by different expression signal strengths to capture a wide spectrum of RNA expression datasets. Our toolkit's decisions reflected the real number of stable partitions in datasets where the subgroups are discernible. Within datasets with less clear biological distinctions, our tools either formed stable subgroups with different expression profiles and robust clinical associations or revealed signs of problematic data such as biased measurements.</p><p><strong>Conclusions: </strong>In conclusion, Omada successfully automates the robust unsupervised clustering of transcriptomic data, making advanced analysis accessible and reliable even for those without extensive machine learning expertise. Implementation of Omada is available at http://bioconductor.org/packages/omada/.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":null,"pages":null},"PeriodicalIF":11.8,"publicationDate":"2024-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11238428/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141590107","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GigaSciencePub Date : 2024-01-02DOI: 10.1093/gigascience/giae044
Rui Tang, Cong Huang, Jun Yang, Zhong-Chen Rao, Li Cao, Peng-Hua Bai, Xin-Cheng Zhao, Jun-Feng Dong, Xi-Zhong Yan, Fang-Hao Wan, Nan-Ji Jiang, Ri-Chou Han
{"title":"A ghost moth olfactory prototype of the lepidopteran sex communication.","authors":"Rui Tang, Cong Huang, Jun Yang, Zhong-Chen Rao, Li Cao, Peng-Hua Bai, Xin-Cheng Zhao, Jun-Feng Dong, Xi-Zhong Yan, Fang-Hao Wan, Nan-Ji Jiang, Ri-Chou Han","doi":"10.1093/gigascience/giae044","DOIUrl":"10.1093/gigascience/giae044","url":null,"abstract":"<p><p>Sex role differentiation is a widespread phenomenon. Sex pheromones are often associated with sex roles and convey sex-specific information. In Lepidoptera, females release sex pheromones to attract males, which evolve sophisticated olfactory structures to relay pheromone signals. However, in some primitive moths, sex role differentiation becomes diverged. Here, we introduce the chromosome-level genome assembly from ancestral Himalaya ghost moths, revealing a unique olfactory evolution pattern and sex role parity among Lepidoptera. These olfactory structures of the ghost moths are characterized by a dense population of trichoid sensilla, both larger male and female antennal entry parts of brains, compared to the evolutionary later Lepidoptera. Furthermore, a unique tandem of 34 odorant receptor 19 homologs in Thitarodes xiaojinensis (TxiaOr19) has been identified, which presents overlapped motifs with pheromone receptors (PRs). Interestingly, the expanded TxiaOr19 was predicted to have unconventional tuning patterns compared to canonical PRs, with nonsexual dimorphic olfactory neuropils discovered, which contributes to the observed equal sex roles in Thitarodes adults. Additionally, transposable element activity bursts have provided traceable loci landscapes where parallel diversifications occurred between TxiaOr19 and PRs, indicating that the Or19 homolog expansions were diversified to PRs during evolution and thus established the classic sex roles in higher moths. This study elucidates an olfactory prototype of intermediate sex communication from Himalaya ghost moths.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":null,"pages":null},"PeriodicalIF":11.8,"publicationDate":"2024-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11258902/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141727097","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}