Machine Learning Model for Early Detection of COVID-19 by Heart Rhythm Abnormalities

M. Mezhov, V. Kozitsin, I. Katser
DOI: 10.23947/2687-1653-2023-23-1-66-75 (https://doi.org/10.23947/2687-1653-2023-23-1-66-75)
Journal: International Journal of Advanced Engineering Research and Science
Published: 2023-04-17

Abstract

Introduction. Electronic devices that collect individual telemetry data have opened up prospects for preclinical detection of COVID-19 signs. Known solutions rely on information that is difficult to obtain at the moment, specifically blood parameters or a PCR test, which significantly limits the integration of such algorithms with wrist-worn gadgets. The cardiovascular system, by contrast, is an informative object of observation, and collection of its data is well developed. The article addresses the problem of detecting COVID-19-related anomalies in rhythm strips. The work aims to create a mathematical model based on machine learning algorithms that automates the detection of COVID-19-related abnormalities in the heart rhythm. The possibility of integrating the results with fitness bracelets and smart watches is shown.

Materials and Methods. The work used an open technology stack: Python, Scikit-learn, and LightGBM. The F1 metric was used to assess the quality of the binary classification models. A total of 229 cardiac rhythm strips (cardiointervalograms) of patients with COVID-19 were studied. The presence or absence of signs of an anomaly was determined from the time of the rhythm strip and the intervals between heartbeats. Deviations that could indicate infection were shown graphically. Based on the results of the exploratory analysis, a list of features indicating an anomaly was compiled.

Results. The resulting mathematical model detected heart rate abnormalities specific to COVID-19 with an accuracy of 83%. The basic features determining the predictive ability of the model were identified and ranked: the current value of the interval between heartbeats, the derivatives at the subsequent and previous measurement points of the heartbeat duration, the first derivative at the current point, and the deviation of the current RR-interval duration from the median. The first feature in this list was the most significant and the last the least. Five algorithms were evaluated: IsolationForest, LGBMClassifier, RandomForestClassifier, ExtraTreesClassifier, and SGDOneClassSVM. Normal and abnormal observations in the isolation trees were visualized. A parameter corresponding to the probability that a regular observation falls outside the norm was set to 0.11, and a graph for the SGDOneClassSVM model was constructed using this value. The quality metric was calculated on the data set using cross-validation, where one case was a rhythm strip: a time series of observations taken from one person over one continuous time interval. A step-by-step process for obtaining averaged metric values for each model was described. In the comparison, LGBMClassifier scored highest, and SGDOneClassSVM and IsolationForest scored lowest.

Discussion and Conclusions. The resulting model takes up little memory on a mobile device, i.e., it does not impose significant requirements on computing resources. The solution has acceptable detection quality for preclinical screening of COVID-19-related cardiovascular disorders. The algorithm detects anomalies in 83% of cases. Four minutes is enough to record a rhythm strip. The proposed scenario for using the integrated solution is concise and easy to implement. Widespread use of the development can contribute to the detection of COVID-19 at an early stage.
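
The feature list and the per-strip scoring described in the Results can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the "derivatives" are simple backward differences of consecutive RR intervals, that boundary beats get a derivative of 0.0, and that the 0.11 parameter can be read as the expected fraction of flagged beats. The function names (`rr_features`, `flag_top_fraction`, `mean_f1_per_strip`), the toy ranking rule that stands in for the five models named above, and the synthetic data are all hypothetical. Only the standard library is used so the sketch runs without dependencies.

```python
from statistics import mean, median


def rr_features(rr):
    """Return one feature dict per beat for a single strip of RR intervals."""
    med = median(rr)
    # Backward difference as a discrete first derivative.
    # Assumption: the first beat has no predecessor, so its derivative is 0.0.
    diff = [0.0] + [b - a for a, b in zip(rr, rr[1:])]
    n = len(rr)
    return [
        {
            "rr": rr[i],                                     # current RR interval
            "d_prev": diff[i - 1] if i > 0 else 0.0,         # derivative at previous point
            "d_curr": diff[i],                               # derivative at current point
            "d_next": diff[i + 1] if i + 1 < n else 0.0,     # derivative at next point
            "dev_median": rr[i] - med,                       # deviation from strip median
        }
        for i in range(n)
    ]


def flag_top_fraction(rows, fraction=0.11):
    """Toy detector (assumption): flag the beats with the largest |dev_median|."""
    k = max(1, round(fraction * len(rows)))
    ranked = sorted(
        range(len(rows)),
        key=lambda i: abs(rows[i]["dev_median"]),
        reverse=True,
    )
    flagged = set(ranked[:k])
    return [1 if i in flagged else 0 for i in range(len(rows))]


def f1(y_true, y_pred):
    """F1 score for binary labels, where 1 marks an anomalous beat."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mean_f1_per_strip(strips, labels, fraction=0.11):
    """Score each rhythm strip separately, then average the per-strip F1 values."""
    scores = [
        f1(y, flag_top_fraction(rr_features(rr), fraction))
        for rr, y in zip(strips, labels)
    ]
    return mean(scores)


# Synthetic example data (seconds), not taken from the study.
strips = [
    [0.80, 0.82, 0.81, 1.20, 0.79, 0.80, 0.81, 0.80, 0.82],
    [0.90, 0.91, 0.60, 0.92, 0.90, 0.89, 0.91, 0.90, 0.92],
]
labels = [
    [0, 0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0, 0],
]
print(mean_f1_per_strip(strips, labels))
```

Treating the whole strip as the unit of scoring keeps beats from one person and one recording together, which matches the abstract's description of a case as a single continuous time series. In a fuller pipeline, the feature rows would be fed to a trained classifier such as those listed in the Results rather than to the ranking rule used here.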