tRNA-DL: A Deep Learning Approach to Improve tRNAscan-SE Prediction Results
Xin Gao, Zhi Wei, Hakon Hakonarson
Human Heredity, vol. 83, no. 3, pp. 163-172 (2018; Epub 25 January 2019). DOI: 10.1159/000493215 (https://doi.org/10.1159/000493215)
Background: tRNAscan-SE is the leading tool for transfer RNA (tRNA) annotation and has been widely used in the field. However, tRNAscan-SE can return a significant number of false positives when applied to large sequences. Conventional machine learning methods have recently been proposed to address this issue, but their efficiency can still be limited by their dependence on handcrafted features. With the growing availability of large-scale genomic datasets, deep learning methods, especially convolutional neural networks, have demonstrated excellent power in characterizing patterns in genomic sequences. We therefore hypothesize that deep learning may further improve tRNA prediction.
Methods: We propose a new computational approach based on deep neural networks to predict tRNA gene sequences, and we designed and investigated various deep neural network architectures. We used tRNA sequences as positive samples and the false-positive tRNA sequences predicted by tRNAscan-SE in coding sequences as negative samples. With these data we trained the proposed models and evaluated them against conventional machine learning methods and popular tRNA prediction tools.
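The sample construction described in the Methods can be illustrated with a minimal sketch. It assumes the positive and negative sequences are already available as Python lists; the function name `build_dataset`, the 20% test fraction, and the per-class split are illustrative assumptions, not details taken from the paper.

```python
import random


def build_dataset(positives, negatives, test_fraction=0.2, seed=0):
    """Label sequences and split them into train and test sets per class.

    positives: true tRNA sequences (label 1).
    negatives: tRNAscan-SE false positives from coding sequences (label 0).
    The split is done separately for each class so both sets keep
    roughly the same class ratio. All parameters are hypothetical.
    """
    rng = random.Random(seed)
    train, test = [], []
    for label, seqs in ((1, positives), (0, negatives)):
        shuffled = list(seqs)
        rng.shuffle(shuffled)
        n_test = int(round(len(shuffled) * test_fraction))
        test += [(s, label) for s in shuffled[:n_test]]
        train += [(s, label) for s in shuffled[n_test:]]
    # Shuffle again so the classes are interleaved within each set.
    rng.shuffle(train)
    rng.shuffle(test)
    return train, test
```

Splitting each class separately is one simple way to avoid a test set that happens to contain almost no negatives; the abstract does not say how the authors partitioned their data.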
Results: Using one-hot encoding, our proposed models can extract features without extensive manual feature engineering. Our best model outperformed the existing methods under different performance metrics.
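As a rough illustration of one-hot encoding for nucleotide sequences, the sketch below maps each base to a four-element indicator vector. The column order `ACGT`, the conversion of U to T, the all-zero row for ambiguous bases, and the optional zero padding are assumptions made for this example; the abstract does not specify these details.

```python
ALPHABET = "ACGT"  # assumed column order


def one_hot(seq, length=None):
    """Encode a nucleotide sequence as a list of 4-element 0/1 rows.

    U is treated as T. Bases outside ACGT (for example N) become an
    all-zero row. If `length` is given, the sequence is truncated or
    padded with all-zero rows to exactly that many positions.
    """
    seq = seq.upper().replace("U", "T")
    if length is not None:
        seq = seq[:length]
    rows = [[1 if base == letter else 0 for letter in ALPHABET] for base in seq]
    if length is not None:
        rows += [[0, 0, 0, 0] for _ in range(length - len(rows))]
    return rows


print(one_hot("ACGU"))
# [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
```

The resulting matrix can be fed directly to a convolutional layer, which is why no handcrafted features are needed at the input stage.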
Conclusion: The proposed deep learning methods can substantially reduce the number of false positives output by the state-of-the-art tool tRNAscan-SE. Coupled with tRNAscan-SE, the approach can serve as a useful complementary tool for tRNA annotation. The application to tRNA prediction demonstrates the superiority of deep learning in automatic feature generation for characterizing sequence patterns.
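Using a classifier as a complement to tRNAscan-SE amounts to a post-filtering step over its candidate predictions. The sketch below shows only that control flow. The function `filter_candidates`, the 0.5 threshold, and the toy scorer `gc_fraction` are hypothetical stand-ins; in practice `score_fn` would be the trained model's predicted probability that a candidate is a true tRNA.

```python
def filter_candidates(candidates, score_fn, threshold=0.5):
    """Split candidate predictions into kept and rejected names.

    candidates: iterable of (name, sequence) pairs, e.g. parsed from
        tRNAscan-SE output (parsing is not shown here).
    score_fn: callable returning a score in [0, 1] for a sequence;
        a placeholder for a trained classifier.
    """
    kept, rejected = [], []
    for name, seq in candidates:
        if score_fn(seq) >= threshold:
            kept.append(name)
        else:
            rejected.append(name)
    return kept, rejected


def gc_fraction(seq):
    """Toy scorer used only to make the example runnable; not a tRNA model."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq) if seq else 0.0


kept, rejected = filter_candidates(
    [("cand1", "GGCC"), ("cand2", "ATAT")], gc_fraction
)
```

Keeping the scorer as a plain callable means the same filter works with any model, and the threshold can be tuned to trade false positives against missed tRNAs.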
About the journal:
Gathering original research reports and short communications from all over the world, Human Heredity is devoted to methodological and applied research on the genetics of human populations, association and linkage analysis, genetic mechanisms of disease, and new methods for statistical genetics, for example, analysis of rare variants and results from next generation sequencing. The value of this information to many branches of medicine is shown by the number of citations the journal receives in fields ranging from immunology and hematology to epidemiology and public health planning, and the fact that at least 50% of all Human Heredity papers are still cited more than 8 years after publication (according to ISI Journal Citation Reports). Special issues on methodological topics (such as ‘Consanguinity and Genomics’ in 2014; ‘Analyzing Rare Variants in Complex Diseases’ in 2012) or reviews of advances in particular fields (‘Genetic Diversity in European Populations: Evolutionary Evidence and Medical Implications’ in 2014; ‘Genes and the Environment in Obesity’ in 2013) are published every year. Renowned experts in the field are invited to contribute to these special issues.