Identification of Autoantigen Markers for SARS CoV-2 Infection with Machine Learning-based Feature Selection: An Insight into COVID Symptoms

Aruna Rajalingam, Chaitra Mallasandra Krishnappa, Shanker G, Anjali Ganjiwale
Journal: Coronaviruses
Published: 2024-03-25
DOI: 10.2174/0126667975296293240320041641 (https://doi.org/10.2174/0126667975296293240320041641)

Abstract

Severe acute respiratory syndrome coronavirus 2 (SARS CoV-2) infection has been shown to trigger autoimmunity, and this autoimmunity leads to several chronic human diseases such as type 1 diabetes, Crohn's disease, vasculitis, and Guillain-Barré syndrome. The mechanism underlying the SARS CoV-2-induced autoimmune response is unknown and is an active area of research.

The primary objective of this study is to use a machine learning approach to identify autoantigen markers that classify SARS CoV-2 samples (COVID-19 positive versus negative) and that trigger an immune response leading to autoimmunity, providing information that supports a more accurate diagnosis of COVID-induced diseases.

Our study reports the transcriptomic profile of whole blood samples from COVID patients, collected from day 0 to day 35 of acute infection, as described in the GSE215865 dataset. Two binary classification algorithms from the scikit-learn Python library, logistic regression and random forest, were applied to the processed data with 10-fold cross-validation. Recursive feature elimination was then used to select the 20 best gene features from a set of 10,719, yielding a classification accuracy of 87%.

Fidgetin, microtubule severing factor (FIGN), SH3 and cysteine-rich domain (STAC), cadherin-6 (CDH6), docking protein 6 (DOK6), nuclear RNA export factor 3 (NXF3), and maternally expressed 3 (MEG3) are the autoantigen markers identified for classifying COVID-positive and COVID-negative samples.

The autoantigen markers identified from transcriptomic datasets using machine learning techniques provide a deeper understanding of COVID-induced diseases and may play an important role as potential diagnostic and drug targets for COVID-19.
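The classification workflow summarized in the abstract can be illustrated with a minimal scikit-learn sketch. This is not the authors' code: the abstract does not give their preprocessing, hyperparameters, or the order in which feature selection and cross-validation were combined. The sketch assumes recursive feature elimination (RFE) is nested inside each cross-validation fold, and it uses a small synthetic matrix in place of the GSE215865 expression data (300 features rather than 10,719). The names `build_pipeline`, `N_SELECTED`, and all parameter values are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

N_SELECTED = 20  # number of gene features kept, as stated in the abstract

# Synthetic stand-in for a samples-by-genes expression matrix (assumption:
# the real study used processed GSE215865 data with 10,719 gene features).
X, y = make_classification(
    n_samples=200, n_features=300, n_informative=15,
    n_redundant=5, random_state=0,
)


def build_pipeline(final_estimator):
    """Scale features, keep N_SELECTED of them via RFE, then classify."""
    return Pipeline([
        ("scale", StandardScaler()),
        # RFE ranks features by logistic-regression coefficients and drops
        # 20% of the original feature count per iteration (illustrative).
        ("rfe", RFE(LogisticRegression(max_iter=1000),
                    n_features_to_select=N_SELECTED, step=0.2)),
        ("clf", final_estimator),
    ])


# 10-fold stratified cross-validation; RFE is refit inside every fold so
# the held-out samples never influence which features are selected.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
}
scores = {
    name: cross_val_score(build_pipeline(est), X, y, cv=cv, scoring="accuracy")
    for name, est in models.items()
}
for name, fold_scores in scores.items():
    print(f"{name}: mean accuracy {fold_scores.mean():.3f} over {len(fold_scores)} folds")

# Fit once on all samples to see which feature columns RFE retains.
fitted = build_pipeline(LogisticRegression(max_iter=1000)).fit(X, y)
selected_idx = np.flatnonzero(fitted.named_steps["rfe"].support_)
print("selected feature indices:", selected_idx.tolist())
```

With real expression data, `selected_idx` would be mapped back to gene symbols to obtain a marker list comparable to the one reported above. Nesting the selector in the pipeline is a design choice that avoids optimistic accuracy estimates; whether the original study did this is not stated.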