Multinomial Logistic Regression with Adaptive Regularization for Cancer Subtype Classification via Multi-omics Data

IF 2.4 3区 生物学 Q3 BIOCHEMICAL RESEARCH METHODS
Yingdi Wu, Fuzhen Cao, Juntao Li
{"title":"Multinomial Logistic Regression with Adaptive Regularization for Cancer Subtype Classification via Multi-omics Data","authors":"Yingdi Wu, Fuzhen Cao, Juntao Li","doi":"10.2174/0115748936308171240605075531","DOIUrl":null,"url":null,"abstract":"Background: Integrating multi-omics data for cancer classification brings complementary biological insights while also facing challenges such as data integration, gene grouping, and adaptive weight construction. Objective: This paper aims to address the challenges faced by the cancer subtype classification and gene screening based on multi-omics data. Methods: Multinomial logistic regression with adaptive regularization (MLRAR) was proposed by integrating DNA methylation, gene mutation, and RNA-seq information. A data preprocessing strategy that effectively utilizes multi-omics information was presented, and the local maximum quasiclique merging (lmQCM) algorithm was implemented to group genes. Biological pathway information was utilized to evaluate the significance of gene groups, while the significance of each gene within a group was evaluated by integrating mutation information, information theory, and methylation information. Results: Compared to MRlasso, MRGL, MSGL, MROGL, AMRSOGL, and AGLRMR, the proposed method yielded improvements in subtype classification accuracy of breast cancer by 2.6%, 2.9%, 3.5%, 2.3%, 2.0%, and 1.8%, respectively. In addition, MLRAR also achieved significant improvements in ovarian cancer by 8.2%, 5.0%, 6.8%, 5.2%, 12.7%, and 6.3%, respectively. Conclusion: The proposed method can effectively deal with data integration, gene grouping, and adaptive weight construction.","PeriodicalId":10801,"journal":{"name":"Current Bioinformatics","volume":"2016 1","pages":""},"PeriodicalIF":2.4000,"publicationDate":"2024-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Current Bioinformatics","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.2174/0115748936308171240605075531","RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"BIOCHEMICAL RESEARCH METHODS","Score":null,"Total":0}
引用次数: 0

Abstract

Background: Integrating multi-omics data for cancer classification brings complementary biological insights while also facing challenges such as data integration, gene grouping, and adaptive weight construction. Objective: This paper aims to address the challenges faced by the cancer subtype classification and gene screening based on multi-omics data. Methods: Multinomial logistic regression with adaptive regularization (MLRAR) was proposed by integrating DNA methylation, gene mutation, and RNA-seq information. A data preprocessing strategy that effectively utilizes multi-omics information was presented, and the local maximum quasiclique merging (lmQCM) algorithm was implemented to group genes. Biological pathway information was utilized to evaluate the significance of gene groups, while the significance of each gene within a group was evaluated by integrating mutation information, information theory, and methylation information. Results: Compared to MRlasso, MRGL, MSGL, MROGL, AMRSOGL, and AGLRMR, the proposed method yielded improvements in subtype classification accuracy of breast cancer by 2.6%, 2.9%, 3.5%, 2.3%, 2.0%, and 1.8%, respectively. In addition, MLRAR also achieved significant improvements in ovarian cancer by 8.2%, 5.0%, 6.8%, 5.2%, 12.7%, and 6.3%, respectively. Conclusion: The proposed method can effectively deal with data integration, gene grouping, and adaptive weight construction.
利用自适应正则化的多项式逻辑回归,通过多组学数据进行癌症亚型分类
背景:整合多组学数据用于癌症分类可带来互补的生物学见解,但同时也面临着数据整合、基因分组和自适应权重构建等挑战。目的:本文旨在解决癌症亚型分类和基因筛选所面临的挑战:本文旨在解决基于多组学数据的癌症亚型分类和基因筛选所面临的挑战。研究方法通过整合 DNA 甲基化、基因突变和 RNA-seq 信息,提出了自适应正则化多叉逻辑回归(MLRAR)。提出了一种有效利用多组学信息的数据预处理策略,并采用局部最大准斜率合并(lmQCM)算法对基因进行分组。利用生物通路信息来评估基因组的重要性,同时通过整合突变信息、信息论和甲基化信息来评估组内每个基因的重要性。结果:与 MRlasso、MRGL、MSGL、MROGL、AMRSOGL 和 AGLRMR 相比,所提出的方法在乳腺癌亚型分类准确率方面分别提高了 2.6%、2.9%、3.5%、2.3%、2.0% 和 1.8%。此外,MLRAR 对卵巢癌的分类准确率也有显著提高,分别提高了 8.2%、5.0%、6.8%、5.2%、12.7% 和 6.3%。结论所提出的方法能有效处理数据整合、基因分组和自适应权重构建等问题。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
Current Bioinformatics
Current Bioinformatics 生物-生化研究方法
CiteScore
6.60
自引率
2.50%
发文量
77
审稿时长
>12 weeks
期刊介绍: Current Bioinformatics aims to publish all the latest and outstanding developments in bioinformatics. Each issue contains a series of timely, in-depth/mini-reviews, research papers and guest edited thematic issues written by leaders in the field, covering a wide range of the integration of biology with computer and information science. The journal focuses on advances in computational molecular/structural biology, encompassing areas such as computing in biomedicine and genomics, computational proteomics and systems biology, and metabolic pathway engineering. Developments in these fields have direct implications on key issues related to health care, medicine, genetic disorders, development of agricultural products, renewable energy, environmental protection, etc.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信