Highly Accurate Protein Structure Classification and Prediction

Anirban Saha, Indranil Sarkar
Published in: 2023 2nd International Conference on Computational Systems and Communication (ICCSC), 2023-03-03
DOI: 10.1109/ICCSC56913.2023.10142975 (https://doi.org/10.1109/ICCSC56913.2023.10142975)
Citations: 0

Abstract

Proteins are the main building blocks of every known form of life and the actuators of the biophysical and chemical events that occur in living organisms. Their biological functions are enabled by their native structure, which plays a crucial role in the design of vaccines and drugs. This motivates predicting a protein's structure from its amino acid sequence, combined with other information, to obtain highly accurate prediction and classification, one of the fundamental problems in computational biology. So far, little attention has been given to including sidechain structure information alongside prediction of the protein backbone. This paper shows that SidechainNet, a new dataset that extends ProteinNet, can be used to predict and classify protein structures more accurately, because SidechainNet includes angle and atomic coordinate information that describes almost all the heavy atoms of every protein structure. The paper first discusses the background on the availability of protein structure data and the importance of ProteinNet. It then covers the additional information that SidechainNet provides, which helps predict protein structure more accurately. Finally, it shows how a machine learning model that uses SidechainNet as its dataset obtains a highly accurate protein structure.
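The abstract mentions that the dataset carries both angle and atomic coordinate information. As a rough illustration of how the two relate, the sketch below computes a dihedral (torsion) angle from four atomic positions using only the Python standard library. It is not taken from the paper and does not use the SidechainNet API; the function name `dihedral_degrees` and the toy coordinates are assumptions made for this example.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_degrees(p0, p1, p2, p3):
    """Torsion angle (degrees) defined by four points, e.g. four bonded atoms.

    Hypothetical helper for illustration; points are (x, y, z) tuples.
    """
    b0 = _sub(p0, p1)          # vector from the second atom back to the first
    b1 = _sub(p2, p1)          # central bond, the rotation axis
    b2 = _sub(p3, p2)          # vector from the third atom to the fourth
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    s0, s2 = _dot(b0, b1), _dot(b2, b1)
    v = (b0[0] - s0 * b1[0], b0[1] - s0 * b1[1], b0[2] - s0 * b1[2])
    w = (b2[0] - s2 * b1[0], b2[1] - s2 * b1[1], b2[2] - s2 * b1[2])
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))

# Toy coordinates (made up): first and fourth atoms on the same side of the bond.
cis = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
# Fourth atom moved to the opposite side.
trans = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
# Fourth atom rotated a quarter turn about the central bond.
quarter = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
```

A dataset that stores both representations lets a model predict angles while still being evaluated against atomic positions, since each torsion angle is a function of four atom coordinates as shown here.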