Predictions of COVID-19 Infection Severity Based on Co-associations between the SNPs of Co-morbid Diseases and COVID-19 through Machine Learning of Genetic Data

R-Y Wang, Tim Qinsong Guo, L. Li, Julia Yutian Jiao, Lena Yiqi Wang
Published in: 2020 IEEE 8th International Conference on Computer Science and Network Technology (ICCSNT), pp. 92-96
Publication date: 2020-11-20
DOI: 10.1109/ICCSNT50940.2020.9304990 (https://doi.org/10.1109/ICCSNT50940.2020.9304990)
Citations: 16

Abstract

This research builds a quantitative model to predict people's susceptibility to COVID-19 from their genomes. Identifying people vulnerable to COVID-19 infection is crucial to stopping the spread of the virus. Previous studies have found that individuals with comorbid diseases are more likely to be infected and to develop more severe COVID-19. However, these patterns have been observed only through correlational analyses between patient phenotypes and the severity of their COVID-19 infection. In this study, the genetic variants underlying the observed comorbidity patterns are analyzed through machine learning on COVID-19 data from genome-wide association studies (GWAS), which may reveal biological pathways underlying the contraction of COVID-19 that are essential to the development of effective, targeted therapeutics. Furthermore, by combining genetic variants with an individual's phenotypes, this study builds a neural network model and a Random Forest (RF) classifier to predict an individual's likelihood of COVID-19 infection. The RF classifier shows that ongoing symptoms are generally better predictors of COVID-19 condition (higher impurity-based feature importance) than diseases or medical histories. In addition, when trained with genomic data, the comorbid disease impact ranking deduced by the resulting RF model is highly consistent with phenotypic comorbidity patterns observed in past studies.
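The phrase "impurity-based feature importance" can be illustrated with a minimal sketch. It is not the authors' model: it computes the Gini impurity decrease of a single split per feature, which is the quantity a Random Forest accumulates and averages over many splits and trees. The toy labels and the feature names `symptom_fever` and `history_diabetes` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k^2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_decrease(feature, labels):
    """Weighted Gini decrease obtained by splitting on a binary (0/1) feature."""
    left = [y for x, y in zip(feature, labels) if x == 0]
    right = [y for x, y in zip(feature, labels) if x == 1]
    n = len(labels)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - weighted


# Hypothetical toy data (assumption): 8 individuals, label 1 = COVID-19 positive.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "symptom_fever": [1, 1, 1, 0, 0, 0, 0, 0],     # tracks the label closely
    "history_diabetes": [1, 0, 1, 0, 1, 0, 1, 0],  # unrelated to the label here
}

# Score each feature by its impurity decrease and rank from most to least important.
scores = {name: impurity_decrease(col, labels) for name, col in features.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

for name in ranking:
    print(f"{name}: {scores[name]:.3f}")
```

In this toy setup the symptom feature separates the classes better than the medical-history feature, so it receives the larger impurity decrease. A real Random Forest would repeat this over bootstrapped samples and random feature subsets, then normalize the accumulated decreases.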