Protein secondary structure prediction using support vector machines and a codon encoding scheme

Masood Zamani, S. C. Kremer
2012 IEEE International Conference on Bioinformatics and Biomedicine Workshops, published 2012-10-04. DOI: 10.1109/BIBMW.2012.6470326 (https://doi.org/10.1109/BIBMW.2012.6470326)
Citations: 6

Abstract

In this study, we evaluate the performance of a protein secondary structure prediction model that uses a new amino acid "codon" encoding inspired by genetic codon mappings. The binary codon encoding has a lower dimensionality than an orthogonal encoding and therefore requires fewer computations. Protein secondary structure prediction is an important step for machine learning techniques that are ultimately applied to protein 3D structure prediction. In the proposed model, one-stage binary support vector machines are employed, and the efficiency of the codon encoding is compared with that of a commonly used orthogonal encoding. Protein evolutionary and structural information is left out so that the comparison is unbiased. The performance of the classification model is measured by Q3 and segment overlap (SOV) scores. The scores are compared with those of prediction methods that use an orthogonal encoding and protein sequence profiles. The experimental results indicate higher prediction accuracy, in terms of Q3 and SOV scores, when sequence profiles are not used. Also, the relative classification scores of the proposed method are comparable with those of methods that incorporate protein global and evolutionary information. The experimental results imply that the encoding scheme is able to integrate evolutionary information into the prediction model, since the encoding is based on genetic codon mappings, which are the building blocks of amino acid formation at the primary level of biological processes. The codon encoding is worth investigating with more complex learning architectures that also use the profiles and structural properties of proteins.
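The abstract does not give the codon mapping itself, so the following Python sketch only illustrates the dimensionality argument and the Q3 measure. It assumes a hypothetical mapping in which each nucleotide is written as 2 bits and each amino acid is represented by one arbitrarily chosen codon from the standard genetic code, giving 6 bits per residue against 20 for an orthogonal (one-hot) encoding. The names `NUCLEOTIDE_BITS`, `REPRESENTATIVE_CODON`, `codon_encode`, `orthogonal_encode`, `window_dimension` and `q3_score` are illustrative and are not taken from the paper. Q3 is computed in its usual sense, as the percentage of residues whose predicted state (H, E or C) matches the observed state; SOV is not shown.

```python
# Sketch only: the 2-bit nucleotide code and the choice of one codon per
# amino acid are assumptions for illustration, not the paper's mapping.
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"

NUCLEOTIDE_BITS = {"A": (0, 0), "C": (0, 1), "G": (1, 0), "U": (1, 1)}

# One codon per amino acid, picked arbitrarily among the synonymous codons
# of the standard genetic code.
REPRESENTATIVE_CODON = {
    "A": "GCU", "R": "CGU", "N": "AAU", "D": "GAU", "C": "UGU",
    "Q": "CAA", "E": "GAA", "G": "GGU", "H": "CAU", "I": "AUU",
    "L": "CUU", "K": "AAG", "M": "AUG", "F": "UUU", "P": "CCU",
    "S": "UCU", "T": "ACU", "W": "UGG", "Y": "UAU", "V": "GUU",
}


def orthogonal_encode(residue):
    """Return the 20-dimensional one-hot vector for a residue."""
    vector = [0] * len(AMINO_ACIDS)
    vector[AMINO_ACIDS.index(residue)] = 1
    return vector


def codon_encode(residue):
    """Return a 6-bit vector: 2 bits for each nucleotide of the codon."""
    bits = []
    for nucleotide in REPRESENTATIVE_CODON[residue]:
        bits.extend(NUCLEOTIDE_BITS[nucleotide])
    return bits


def window_dimension(window_size, bits_per_residue):
    """Input dimension when a window of residues is concatenated."""
    return window_size * bits_per_residue


def q3_score(observed, predicted):
    """Percentage of residues whose predicted state matches the observed one."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(o == p for o, p in zip(observed, predicted))
    return 100.0 * matches / len(observed)


if __name__ == "__main__":
    print(len(orthogonal_encode("M")), len(codon_encode("M")))
    print(window_dimension(13, 20), window_dimension(13, 6))
    print(q3_score("HHHEECCC", "HHHECCCC"))
```

Under these assumptions, a sliding window of 13 residues (a window size chosen here only as an example) gives a 260-dimensional input with the orthogonal encoding and a 78-dimensional input with the 6-bit encoding, which is the kind of reduction the abstract refers to.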