基于机器学习的极限梯度增强算法识别动力蛋白运动蛋白的分子功能

A. Ghulam, Rahu Sikander, Dhani Bux Talpur, Erum Saba, Mir Sajjad Hussain Talpur, Z. A. Maher, Saima Tunio
{"title":"基于机器学习的极限梯度增强算法识别动力蛋白运动蛋白的分子功能","authors":"A. Ghulam, Rahu Sikander, Dhani Bux Talpur, Erum Saba, Mir Sajjad Hussain Talpur, Z. A. Maher, Saima Tunio","doi":"10.53874/jmar.v8i0.166","DOIUrl":null,"url":null,"abstract":"The majority of cytoplasmic proteins and vesicles move actively primarily to dynein motor proteins, which are the cause of muscle contraction. Moreover, identifying how dynein are used in cells will rely on structural knowledge. Cytoskeletal motor proteins have different molecular roles and structures, and they belong to three superfamilies of dynamin, actin and myosin. Loss of function of specific molecular motor proteins can be attributed to a number of human diseases, such as Charcot-Charcot-Dystrophy and kidney disease.  It is crucial to create a precise model to identify dynein motor proteins in order to aid scientists in understanding their molecular role and designing therapeutic targets based on their influence on human disease. Therefore, we develop an accurate and efficient computational methodology is highly desired, especially when using cutting-edge machine learning methods. In this article, we proposed a machine learning-based superfamily of cytoskeletal motor protein locations prediction method called extreme gradient boosting (XGBoost). We get the initial feature set All by extraction the protein features from the sequence and evolutionary data of the amino acid residues named BLOUSM62. Through our successful eXtreme gradient boosting (XGBoost), accuracy score 0.8676%, Precision score 0.8768%, Sensitivity score 0.760%, Specificity score 0.9752% and MCC score 0.7536%.  Our method has demonstrated substantial improvements in the performance of many of the evaluation parameters compared to other state-of-the-art methods. This study offers an effective model for the classification of dynein proteins and lays a foundation for further research to improve the efficiency of protein functional classification.","PeriodicalId":31687,"journal":{"name":"Journal of Mountain Area Research","volume":" ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2022-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"IDENTIFYING MOLECULAR FUNCTIONS OF DYNEIN MOTOR PROTEINS USING EXTREME GRADIENT BOOSTING ALGORITHM WITH MACHINE LEARNING\",\"authors\":\"A. Ghulam, Rahu Sikander, Dhani Bux Talpur, Erum Saba, Mir Sajjad Hussain Talpur, Z. A. Maher, Saima Tunio\",\"doi\":\"10.53874/jmar.v8i0.166\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The majority of cytoplasmic proteins and vesicles move actively primarily to dynein motor proteins, which are the cause of muscle contraction. Moreover, identifying how dynein are used in cells will rely on structural knowledge. Cytoskeletal motor proteins have different molecular roles and structures, and they belong to three superfamilies of dynamin, actin and myosin. Loss of function of specific molecular motor proteins can be attributed to a number of human diseases, such as Charcot-Charcot-Dystrophy and kidney disease.  It is crucial to create a precise model to identify dynein motor proteins in order to aid scientists in understanding their molecular role and designing therapeutic targets based on their influence on human disease. Therefore, we develop an accurate and efficient computational methodology is highly desired, especially when using cutting-edge machine learning methods. In this article, we proposed a machine learning-based superfamily of cytoskeletal motor protein locations prediction method called extreme gradient boosting (XGBoost). We get the initial feature set All by extraction the protein features from the sequence and evolutionary data of the amino acid residues named BLOUSM62. Through our successful eXtreme gradient boosting (XGBoost), accuracy score 0.8676%, Precision score 0.8768%, Sensitivity score 0.760%, Specificity score 0.9752% and MCC score 0.7536%.  Our method has demonstrated substantial improvements in the performance of many of the evaluation parameters compared to other state-of-the-art methods. This study offers an effective model for the classification of dynein proteins and lays a foundation for further research to improve the efficiency of protein functional classification.\",\"PeriodicalId\":31687,\"journal\":{\"name\":\"Journal of Mountain Area Research\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-11-29\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Mountain Area Research\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.53874/jmar.v8i0.166\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Mountain Area Research","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.53874/jmar.v8i0.166","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

大多数细胞质蛋白和小泡主要活跃于动力蛋白运动蛋白,这是肌肉收缩的原因。此外,确定动力蛋白如何在细胞中使用将依赖于结构知识。细胞骨架运动蛋白具有不同的分子作用和结构,属于动力蛋白、肌动蛋白和肌球蛋白三个超家族。特定分子运动蛋白功能的丧失可归因于许多人类疾病,如夏氏营养不良和肾脏疾病。创建一个精确的模型来识别动力蛋白运动蛋白,以帮助科学家了解其分子作用,并根据其对人类疾病的影响设计治疗靶点,这一点至关重要。因此,我们迫切需要开发一种准确高效的计算方法,尤其是在使用尖端的机器学习方法时。在本文中,我们提出了一种基于机器学习的细胞骨架运动蛋白超家族位置预测方法,称为极限梯度增强(XGBoost)。我们通过从名为BLOUSM62的氨基酸残基的序列和进化数据中提取蛋白质特征来获得初始特征集All。通过我们成功的eXtreme梯度增强(XGBoost),准确率得分为0.8676%,精密度得分为0.8768%,灵敏度得分为0.760%,特异性得分为0.9752%,MCC得分为0.7536%。与其他最先进的方法相比,我们的方法在许多评估参数的性能上都有了显著改进。该研究为动力蛋白的分类提供了一个有效的模型,为进一步研究提高蛋白质功能分类的效率奠定了基础。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
IDENTIFYING MOLECULAR FUNCTIONS OF DYNEIN MOTOR PROTEINS USING EXTREME GRADIENT BOOSTING ALGORITHM WITH MACHINE LEARNING
The majority of cytoplasmic proteins and vesicles move actively primarily to dynein motor proteins, which are the cause of muscle contraction. Moreover, identifying how dynein are used in cells will rely on structural knowledge. Cytoskeletal motor proteins have different molecular roles and structures, and they belong to three superfamilies of dynamin, actin and myosin. Loss of function of specific molecular motor proteins can be attributed to a number of human diseases, such as Charcot-Charcot-Dystrophy and kidney disease.  It is crucial to create a precise model to identify dynein motor proteins in order to aid scientists in understanding their molecular role and designing therapeutic targets based on their influence on human disease. Therefore, we develop an accurate and efficient computational methodology is highly desired, especially when using cutting-edge machine learning methods. In this article, we proposed a machine learning-based superfamily of cytoskeletal motor protein locations prediction method called extreme gradient boosting (XGBoost). We get the initial feature set All by extraction the protein features from the sequence and evolutionary data of the amino acid residues named BLOUSM62. Through our successful eXtreme gradient boosting (XGBoost), accuracy score 0.8676%, Precision score 0.8768%, Sensitivity score 0.760%, Specificity score 0.9752% and MCC score 0.7536%.  Our method has demonstrated substantial improvements in the performance of many of the evaluation parameters compared to other state-of-the-art methods. This study offers an effective model for the classification of dynein proteins and lays a foundation for further research to improve the efficiency of protein functional classification.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
3
审稿时长
12 weeks
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信