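An Efficient Framework for Classifying Cancer Diseases Using Ensemble Machine Learning over Cancer Gene Expression and Sequence-Based Protein Interactions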

Prabhuraj Metipatil, P. Bhuvaneshwari, S. M. Basha, S. Patil
2023 2nd International Conference for Innovation in Technology (INOCON), published 2023-03-03
DOI: 10.1109/INOCON57975.2023.10101354 (https://doi.org/10.1109/INOCON57975.2023.10101354)
Citations: 1

Abstract

In recent years, cancer has accounted for a significant number of deaths worldwide. Analysis of microarray gene expression and protein interaction data facilitates early cancer identification. DNA microarray technology makes accurate prediction of information for thousands of genes possible. Protein-protein interactions (PPIs) are crucial protein activities involved in the cell cycle, which replicates DNA, and in cellular signaling. Determining whether a pair of proteins interacts is therefore important for diagnosing illness in molecular biology. Existing machine learning classifiers are often limited to binary (two-class) problems. They can also be prone to overfitting: the classification framework may become too specialized to the training data and fail to generalize to varied data. To overcome these problems, this paper proposes an ensemble machine learning technique; ensembling combines the strengths of two classifiers to give a more robust and accurate framework. The combination of a Support Vector Machine (SVM) and Naïve Bayes (NB) in an ensemble provides better performance across various performance parameters. The proposed SVM-NB ensemble classifier outperforms the existing classifiers by 15-20% on performance parameters such as classification accuracy, time taken for classification, precision, recall, and F-measure. The results were obtained by comparing the proposed ensemble (SVM+NB) classifier with widely applied existing classifiers such as Logistic Regression (LR), Support Vector Machine, and Naïve Bayes.
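
The abstract does not say how the SVM and Naïve Bayes outputs are combined, so the following is only a minimal sketch under one common assumption: soft voting, where the per-class probabilities of the two classifiers are averaged and the highest-scoring class wins. The probability tables, the `weight_a` parameter, and the function names are all hypothetical and are not taken from the paper; the numbers are made-up illustrations, not results. The sketch also computes macro-averaged precision, recall, and F-measure, the metrics the abstract lists, in a way that works for more than two classes.

```python
from typing import Dict, List, Sequence


def soft_vote(prob_a: Sequence[float], prob_b: Sequence[float],
              weight_a: float = 0.5) -> int:
    """Return the class index with the highest weighted-average probability."""
    if len(prob_a) != len(prob_b):
        raise ValueError("both classifiers must score the same classes")
    combined = [weight_a * pa + (1.0 - weight_a) * pb
                for pa, pb in zip(prob_a, prob_b)]
    return max(range(len(combined)), key=combined.__getitem__)


def macro_scores(y_true: Sequence[int], y_pred: Sequence[int],
                 n_classes: int) -> Dict[str, float]:
    """Accuracy plus macro-averaged precision, recall and F-measure."""
    pairs = list(zip(y_true, y_pred))
    precisions: List[float] = []
    recalls: List[float] = []
    f_measures: List[float] = []
    for c in range(n_classes):
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f_measures.append(f1)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "precision": sum(precisions) / n_classes,
        "recall": sum(recalls) / n_classes,
        "f_measure": sum(f_measures) / n_classes,
    }


# Hypothetical per-class probabilities for four samples and three classes,
# standing in for the outputs of a trained SVM and a trained Naive Bayes model.
svm_probs = [[0.7, 0.2, 0.1], [0.4, 0.5, 0.1], [0.2, 0.3, 0.5], [0.5, 0.4, 0.1]]
nb_probs = [[0.6, 0.3, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4], [0.2, 0.7, 0.1]]
y_true = [0, 1, 2, 0]

y_pred = [soft_vote(a, b) for a, b in zip(svm_probs, nb_probs)]
scores = macro_scores(y_true, y_pred, n_classes=3)
print(y_pred)
print(scores)
```

In practice the two probability tables would come from fitted models; scikit-learn's `VotingClassifier` with `voting="soft"` offers a comparable combination rule, though whether the paper used it is not stated.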