Genome Biology, Pub Date: 2024-09-04, DOI: 10.1186/s13059-024-03387-4
Alessandro Vinceti, Rafaele M. Iannuzzi, Isabella Boyle, Lucia Trastulla, Catarina D. Campbell, Francisca Vazquez, Joshua M. Dempster, Francesco Iorio
{"title":"Author Correction: A benchmark of computational methods for correcting biases of established and unknown origin in CRISPR-Cas9 screening data","authors":"Alessandro Vinceti, Rafaele M. Iannuzzi, Isabella Boyle, Lucia Trastulla, Catarina D. Campbell, Francisca Vazquez, Joshua M. Dempster, Francesco Iorio","doi":"10.1186/s13059-024-03387-4","DOIUrl":"https://doi.org/10.1186/s13059-024-03387-4","url":null,"abstract":"<p><b>Correction</b><b>: </b><b>Genome Biol 25, 192 (2024)</b></p><p><b>https://doi.org/10.1186/s13059-024-03336-1</b></p><br/><p>Following publication of the original article [1], the authors identified an omission in the competing interests section. The omitted text is given in bold below.</p><p><b>Competing interests</b></p><p>FI receives funding from Open Targets, a public-private initiative involving academia and industry, and performs consultancy for the joint CRUK-AstraZeneca Functional Genomics Centre and for Mosaic TX. JD is a consultant for and holds equity in Jumble Therapeutics. CDC performs consultancy for Droplet Biosciences and is a shareholder of Novartis. <b>FV receives research support from the Dependency Map Consortium, Riva Therapeutics, Bristol Myers Squibb, Merck, Illumina, and Deerfield Management. FV is on the scientific advisory board of GSK, is a consultant and holds equity in Riva Therapeutics and is a co-founder and holds equity in Jumble Therapeutics</b>. All other authors declare that they have no competing interests.</p><p>The original article [1] is corrected.</p><ol data-track-component=\"outbound reference\" data-track-context=\"references section\"><li data-counter=\"1.\"><p>Vinceti A, Iannuzzi RM, Boyle I, et al. A benchmark of computational methods for correcting biases of established and unknown origin in CRISPR-Cas9 screening data. Genome Biol. 2024;25:192. https://doi.org/10.1186/s13059-024-03336-1.</p></li></ol><h3>Authors and Affiliations</h3><ol><li><p>Computational Biology Research Centre, Human Technopole, Milan, Italy</p><p>Alessandro Vinceti, Rafaele M. Iannuzzi, Lucia Trastulla & Francesco Iorio</p></li><li><p>Broad Institute of Harvard and MIT, Cambridge, MA, USA</p><p>Isabella Boyle, Catarina D. Campbell, Francisca Vazquez & Joshua M. Dempster</p></li></ol>","PeriodicalId":12611,"journal":{"name":"Genome Biology","volume":"150 1","pages":""},"PeriodicalIF":12.3,"publicationDate":"2024-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142130839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Genome Biology, Pub Date: 2024-09-03, DOI: 10.1186/s13059-024-03377-6
Sean M. Flynn, Somdutta Dhir, Krzysztof Herka, Colm Doyle, Larry Melidis, Angela Simeone, Winnie W. I. Hui, Rafael de Cesaris Araujo Tavares, Stefan Schoenfelder, David Tannahill, Shankar Balasubramanian
{"title":"Improved simultaneous mapping of epigenetic features and 3D chromatin structure via ViCAR","authors":"Sean M. Flynn, Somdutta Dhir, Krzysztof Herka, Colm Doyle, Larry Melidis, Angela Simeone, Winnie W. I. Hui, Rafael de Cesaris Araujo Tavares, Stefan Schoenfelder, David Tannahill, Shankar Balasubramanian","doi":"10.1186/s13059-024-03377-6","DOIUrl":"https://doi.org/10.1186/s13059-024-03377-6","url":null,"abstract":"Methods to measure chromatin contacts at genomic regions bound by histone modifications or proteins are important tools to investigate chromatin organization. However, such methods do not capture the possible involvement of other epigenomic features such as G-quadruplex DNA secondary structures (G4s). To bridge this gap, we introduce ViCAR (viewpoint HiCAR), for the direct antibody-based capture of chromatin interactions at folded G4s. Through ViCAR, we showcase the first G4-3D interaction landscape. Using histone marks, we also demonstrate how ViCAR improves on earlier approaches yielding increased signal-to-noise. ViCAR is a practical and powerful tool to explore epigenetic marks and 3D genome interactomes.","PeriodicalId":12611,"journal":{"name":"Genome Biology","volume":"9 1","pages":""},"PeriodicalIF":12.3,"publicationDate":"2024-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142123681","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Genome Biology, Pub Date: 2024-09-03, DOI: 10.1186/s13059-024-03376-7
Brennan H. Baker, Sheela Sathyanarayana, Adam A. Szpiro, James W. MacDonald, Alison G. Paquette
{"title":"RNAseqCovarImpute: a multiple imputation procedure that outperforms complete case and single imputation differential expression analysis","authors":"Brennan H. Baker, Sheela Sathyanarayana, Adam A. Szpiro, James W. MacDonald, Alison G. Paquette","doi":"10.1186/s13059-024-03376-7","DOIUrl":"https://doi.org/10.1186/s13059-024-03376-7","url":null,"abstract":"Missing covariate data is a common problem that has not been addressed in observational studies of gene expression. Here, we present a multiple imputation method that accommodates high dimensional gene expression data by incorporating principal component analysis of the transcriptome into the multiple imputation prediction models to avoid bias. Simulation studies using three datasets show that this method outperforms complete case and single imputation analyses at uncovering true positive differentially expressed genes, limiting false discovery rates, and minimizing bias. This method is easily implemented via an R Bioconductor package, RNAseqCovarImpute, which integrates with the limma-voom pipeline for differential expression analysis.","PeriodicalId":12611,"journal":{"name":"Genome Biology","volume":"15 1","pages":""},"PeriodicalIF":12.3,"publicationDate":"2024-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142123682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Genome Biology, Pub Date: 2024-09-02, DOI: 10.1186/s13059-024-03374-9
Olivier B. Poirion, Wulin Zuo, Catrina Spruce, Candice N. Baker, Sandra L. Daigle, Ashley Olson, Daniel A. Skelly, Elissa J. Chesler, Christopher L. Baker, Brian S. White
{"title":"Enhlink infers distal and context-specific enhancer–promoter linkages","authors":"Olivier B. Poirion, Wulin Zuo, Catrina Spruce, Candice N. Baker, Sandra L. Daigle, Ashley Olson, Daniel A. Skelly, Elissa J. Chesler, Christopher L. Baker, Brian S. White","doi":"10.1186/s13059-024-03374-9","DOIUrl":"https://doi.org/10.1186/s13059-024-03374-9","url":null,"abstract":"Enhlink is a computational tool for scATAC-seq data analysis, facilitating precise interrogation of enhancer function at the single-cell level. It employs an ensemble approach incorporating technical and biological covariates to infer condition-specific regulatory DNA linkages. Enhlink can integrate multi-omic data for enhanced specificity, when available. Evaluation with simulated and real data, including multi-omic datasets from the mouse striatum and novel promoter capture Hi-C data, demonstrates that Enhlink outperforms alternative methods. Coupled with eQTL analysis, it identified a putative super-enhancer in striatal neurons. Overall, Enhlink offers accuracy, power, and potential for revealing novel biological insights in gene regulation.","PeriodicalId":12611,"journal":{"name":"Genome Biology","volume":"8 1","pages":""},"PeriodicalIF":12.3,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142118216","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Genome Biology, Pub Date: 2024-08-29, DOI: 10.1186/s13059-024-03372-x
Feng Zhang, Chenkun Yang, Hao Guo, Yufei Li, Shuangqian Shen, Qianqian Zhou, Chun Li, Chao Wang, Ting Zhai, Lianghuan Qu, Cheng Zhang, Xianqing Liu, Jie Luo, Wei Chen, Shouchuang Wang, Jun Yang, Cui Yu, Yanyan Liu
{"title":"Dissecting the genetic basis of UV-B responsive metabolites in rice","authors":"Feng Zhang, Chenkun Yang, Hao Guo, Yufei Li, Shuangqian Shen, Qianqian Zhou, Chun Li, Chao Wang, Ting Zhai, Lianghuan Qu, Cheng Zhang, Xianqing Liu, Jie Luo, Wei Chen, Shouchuang Wang, Jun Yang, Cui Yu, Yanyan Liu","doi":"10.1186/s13059-024-03372-x","DOIUrl":"https://doi.org/10.1186/s13059-024-03372-x","url":null,"abstract":"UV-B, an important environmental factor, has been shown to affect the yield and quality of rice (Oryza sativa) worldwide. However, the molecular mechanisms underlying the response to UV-B stress remain elusive in rice. We perform comprehensive metabolic profiling of leaves from 160 diverse rice accessions under UV-B and normal light conditions using a widely targeted metabolomics approach. Our results reveal substantial differences in metabolite accumulation between the two major rice subspecies indica and japonica, especially after UV-B treatment, implying the possible role and mechanism of metabolome changes in subspecies differentiation and the stress response. We next conduct a transcriptome analysis from four representative rice varieties under UV-B stress, revealing genes from amino acid and flavonoid pathways involved in the UV-B response. We further perform a metabolite-based genome-wide association study (mGWAS), which reveals 3307 distinct loci under UV-B stress. Identification and functional validation of candidate genes show that OsMYB44 regulates tryptamine accumulation to mediate UV-B tolerance, while OsUVR8 interacts with OsMYB110 to promote flavonoid accumulation and UV-B tolerance in a coordinated manner. Additionally, haplotype analysis suggests that natural variation of OsUVR8groupA contributes to UV-B resistance in rice. Our study reveals the complex biochemical and genetic foundations that govern the metabolite dynamics underlying the response, tolerance, and adaptive strategies of rice to UV-B stress. 
These findings provide new insights into the biochemical and genetic basis of the metabolome underlying the crop response, tolerance, and adaptation to UV-B stress.","PeriodicalId":12611,"journal":{"name":"Genome Biology","volume":"17 1","pages":""},"PeriodicalIF":12.3,"publicationDate":"2024-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142090107","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Genome Biology, Pub Date: 2024-08-28, DOI: 10.1186/s13059-024-03375-8
Luke Saville, Li Wu, Jemaneh Habtewold, Yubo Cheng, Babita Gollen, Liam Mitchell, Matthew Stuart-Edwards, Travis Haight, Majid Mohajerani, Athanasios Zovoilis
{"title":"NERD-seq: a novel approach of Nanopore direct RNA sequencing that expands representation of non-coding RNAs","authors":"Luke Saville, Li Wu, Jemaneh Habtewold, Yubo Cheng, Babita Gollen, Liam Mitchell, Matthew Stuart-Edwards, Travis Haight, Majid Mohajerani, Athanasios Zovoilis","doi":"10.1186/s13059-024-03375-8","DOIUrl":"https://doi.org/10.1186/s13059-024-03375-8","url":null,"abstract":"Non-coding RNAs (ncRNAs) are frequently documented RNA modification substrates. Nanopore Technologies enables the direct sequencing of RNAs and the detection of modified nucleobases. Ordinarily, direct RNA sequencing uses polyadenylation selection, studying primarily mRNA gene expression. Here, we present NERD-seq, which enables detection of multiple non-coding RNAs, excluded by the standard approach, alongside natively polyadenylated transcripts. Using neural tissues as a proof of principle, we show that NERD-seq expands representation of frequently modified non-coding RNAs, such as snoRNAs, snRNAs, scRNAs, srpRNAs, tRNAs, and rRFs. NERD-seq represents an RNA-seq approach to simultaneously study mRNA and ncRNA epitranscriptomes in brain tissues and beyond.","PeriodicalId":12611,"journal":{"name":"Genome Biology","volume":"304 1","pages":""},"PeriodicalIF":12.3,"publicationDate":"2024-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142085504","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Gut microbiota contributes to high-altitude hypoxia acclimatization of human populations","authors":"Qian Su, Dao-Hua Zhuang, Yu-Chun Li, Yu Chen, Xia-Yan Wang, Ming-Xia Ge, Ting-Yue Xue, Qi-Yuan Zhang, Xin-Yuan Liu, Fan-Qian Yin, Yi-Ming Han, Zong-Liang Gao, Long Zhao, Yong-Xuan Li, Meng-Jiao Lv, Li-Qin Yang, Tian-Rui Xia, Yong-Jun Luo, Zhigang Zhang, Qing-Peng Kong","doi":"10.1186/s13059-024-03373-w","DOIUrl":"https://doi.org/10.1186/s13059-024-03373-w","url":null,"abstract":"The relationship between human gut microbiota and high-altitude hypoxia acclimatization remains highly controversial. This stems primarily from uncertainties regarding both the potential temporal changes in the microbiota under such conditions and the existence of any dominant or core bacteria that may assist in host acclimatization. To address these issues, and to control for variables commonly present in previous studies which significantly impact the results obtained, namely genetic background, ethnicity, lifestyle, and diet, we conducted a 108-day longitudinal study on the same cohort comprising 45 healthy Han adults who traveled from lowland Chongqing, 243 masl, to high-altitude plateau Lhasa, Xizang, 3658 masl, and back. Using shotgun metagenomic profiling, we study temporal changes in gut microbiota composition at different timepoints. The results show a significant reduction in the species and functional diversity of the gut microbiota, along with a marked increase in functional redundancy. These changes are primarily driven by the overgrowth of Blautia A, a genus that is also abundant in six independent Han cohorts with long-term residence in a lower-oxygen environment in Shigatse, Xizang, at 4700 masl. Further animal experiments indicate that Blautia A-fed mice exhibit enhanced intestinal health and a better acclimatization phenotype to sustained hypoxic stress. 
Our study underscores the importance of Blautia A species in the gut microbiota’s rapid response to high-altitude hypoxia and its potential role in maintaining intestinal health and aiding host adaptation to extreme environments, likely via anti-inflammation and intestinal barrier protection.","PeriodicalId":12611,"journal":{"name":"Genome Biology","volume":"50 7-8 1","pages":""},"PeriodicalIF":12.3,"publicationDate":"2024-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142085505","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Genome Biology, Pub Date: 2024-08-27, DOI: 10.1186/s13059-024-03370-z
Tianpeng Wang, Aalt D J van Dijk, Ranze Zhao, Guusje Bonnema, Xiaowu Wang
{"title":"Contribution of homoeologous exchange to domestication of polyploid Brassica.","authors":"Tianpeng Wang, Aalt D J van Dijk, Ranze Zhao, Guusje Bonnema, Xiaowu Wang","doi":"10.1186/s13059-024-03370-z","DOIUrl":"10.1186/s13059-024-03370-z","url":null,"abstract":"<p><strong>Background: </strong>Polyploidy is widely recognized as a significant evolutionary force in the plant kingdom, contributing to the diversification of plants. One of the notable features of allopolyploidy is the occurrence of homoeologous exchange (HE) events between the subgenomes, causing changes in genomic composition, gene expression, and phenotypic variations. However, the role of HE in plant adaptation and domestication remains unclear.</p><p><strong>Results: </strong>Here we analyze the whole-genome resequencing data from Brassica napus accessions representing the different morphotypes and ecotypes, to investigate the role of HE in domestication. Our findings demonstrate frequent occurrence of HEs in Brassica napus, with substantial HE patterns shared across populations, indicating their potential role in promoting crop domestication. HE events are asymmetric, with the A genome more frequently replacing C genome segments. These events show a preference for specific genomic regions and vary among populations. We also identify candidate genes in HE regions specific to certain populations, which likely contribute to flowering-time diversification across diverse morphotypes and ecotypes. In addition, we assemble a new genome of a swede accession, confirming the HE signals on the genome and their potential involvement in root tuber development. 
By analyzing HE in another allopolyploid species, Brassica juncea, we characterize a potential broader role of HE in allopolyploid crop domestication.</p><p><strong>Conclusions: </strong>Our results provide novel insights into the domestication of polyploid Brassica species and highlight homoeologous exchange as a crucial mechanism for generating variations that are selected for crop improvement in polyploid species.</p>","PeriodicalId":12611,"journal":{"name":"Genome Biology","volume":"25 1","pages":"231"},"PeriodicalIF":10.1,"publicationDate":"2024-08-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11350971/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142080066","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Genome Biology, Pub Date: 2024-08-22, DOI: 10.1186/s13059-024-03355-y
Gabriel Innocenti, Maureen Obara, Bibiana Costa, Henning Jacobsen, Maeva Katzmarzyk, Luka Cicin-Sain, Ulrich Kalinke, Marco Galardini
{"title":"Real-time identification of epistatic interactions in SARS-CoV-2 from large genome collections","authors":"Gabriel Innocenti, Maureen Obara, Bibiana Costa, Henning Jacobsen, Maeva Katzmarzyk, Luka Cicin-Sain, Ulrich Kalinke, Marco Galardini","doi":"10.1186/s13059-024-03355-y","DOIUrl":"https://doi.org/10.1186/s13059-024-03355-y","url":null,"abstract":"The emergence of the SARS-CoV-2 virus has highlighted the importance of genomic epidemiology in understanding the evolution of pathogens and guiding public health interventions. The Omicron variant in particular has underscored the role of epistasis in the evolution of lineages with both higher infectivity and immune escape, and therefore the necessity to update surveillance pipelines to detect them early on. In this study, we apply a method based on mutual information between positions in a multiple sequence alignment, which is capable of scaling up to millions of samples. We show how it can reliably predict known experimentally validated epistatic interactions, even when using as few as 10,000 sequences, which opens the possibility of making it a near real-time prediction system. We test this possibility by modifying the method to account for the sample collection date and apply it retrospectively to multiple sequence alignments for each month between March 2020 and March 2023. We detected a cornerstone epistatic interaction in the Spike protein between codons 498 and 501 as soon as seven samples with a double mutation were present in the dataset, thus demonstrating the method’s sensitivity. We test the ability of the method to make inferences about emerging interactions by testing candidates predicted after March 2023, which we validate experimentally. 
We show how known epistatic interactions in SARS-CoV-2 can be detected with high sensitivity, and how emerging ones can be quickly prioritized for experimental validation, an approach that could be implemented downstream of pandemic genome sequencing efforts.","PeriodicalId":12611,"journal":{"name":"Genome Biology","volume":"66 1","pages":""},"PeriodicalIF":12.3,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142022189","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Current limitations in predicting mRNA translation with deep learning models.","authors":"Niels Schlusser, Asier González, Muskan Pandey, Mihaela Zavolan","doi":"10.1186/s13059-024-03369-6","DOIUrl":"10.1186/s13059-024-03369-6","url":null,"abstract":"<p><strong>Background: </strong>The design of nucleotide sequences with defined properties is a long-standing problem in bioengineering. An important application is protein expression, be it in the context of research or the production of mRNA vaccines. The rate of protein synthesis depends on the 5' untranslated region (5'UTR) of the mRNAs, and recently, deep learning models were proposed to predict the translation output of mRNAs from the 5'UTR sequence. At the same time, large data sets of endogenous and reporter mRNA translation have become available.</p><p><strong>Results: </strong>In this study, we use complementary data obtained in two different cell types to assess the accuracy and generality of currently available models for predicting translational output. We find that while performing well on the data sets on which they were trained, deep learning models do not generalize well to other data sets, in particular of endogenous mRNAs, which differ in many properties from reporter constructs.</p><p><strong>Conclusions: </strong>These differences limit the ability of deep learning models to uncover mechanisms of translation control and to predict the impact of genetic variation. 
We suggest directions that combine high-throughput measurements and machine learning to unravel mechanisms of translation control and improve construct design.</p>","PeriodicalId":12611,"journal":{"name":"Genome Biology","volume":"25 1","pages":"227"},"PeriodicalIF":10.1,"publicationDate":"2024-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11337900/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142008589","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}