Serhii Reznichenko, MS; John Whitaker, MD, PhD; Zixuan Ni, PhD; Shijie Zhou, PhD
CJC Open, Volume 7, Issue 2, Pages 176-186. Published 2025-02-01. DOI: 10.1016/j.cjco.2024.10.012. https://www.sciencedirect.com/science/article/pii/S2589790X24005213
Comparing ECG Lead Subsets for Heart Arrhythmia/ECG Pattern Classification: Convolutional Neural Networks and Random Forest
Background
Despite the growing popularity of deep learning (DL), few studies have compared the performance of DL and conventional machine learning (CML) methods in heart arrhythmia/electrocardiography (ECG) pattern classification. In addition, accurate classification of heart arrhythmias/ECG patterns often depends on specific ECG leads, and it remains unknown how DL and CML methods perform on reduced subsets of ECG leads. In this study, we sought to assess the accuracy of convolutional neural network (CNN) and random forest (RF) models, representing DL and CML methods respectively, for classifying arrhythmias/ECG patterns using reduced ECG lead subsets.
Methods
We used a public data set from the PhysioNet/Computing in Cardiology Challenge 2020. For the DL method, we trained a CNN classifier that extracted features from each ECG lead; these features were then passed to a feedforward neural network. For the CML method, we used a random forest classifier with manually extracted features. Optimal ECG lead subsets were identified by means of recursive feature elimination for both methods.
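The lead-selection step can be illustrated with a minimal sketch. It assumes a score-driven backward elimination, which is one common variant of recursive elimination and not necessarily the authors' exact procedure. The function names, the tolerance parameter, and the toy scoring weights are all hypothetical; in practice the score would be a cross-validated metric from the trained CNN or random forest.

```python
from __future__ import annotations

from typing import Callable, Iterable

# The 12 standard ECG leads.
LEADS = ["I", "II", "III", "aVR", "aVL", "aVF",
         "V1", "V2", "V3", "V4", "V5", "V6"]


def backward_lead_elimination(
    leads: Iterable[str],
    score: Callable[[frozenset], float],
    tolerance: float = 0.01,
) -> tuple[frozenset, float]:
    """Greedy backward elimination over ECG leads (illustrative variant).

    At each step, drop the lead whose removal leaves the highest score.
    Return the smallest subset on the elimination path whose score is
    within `tolerance` of the best score seen on that path.
    """
    current = frozenset(leads)
    path = [(current, score(current))]
    while len(current) > 1:
        # Score every subset that is one lead smaller; keep the best one.
        candidates = [current - {lead} for lead in sorted(current)]
        current = max(candidates, key=score)
        path.append((current, score(current)))
    best = max(s for _, s in path)
    acceptable = [(subset, s) for subset, s in path if s >= best - tolerance]
    return min(acceptable, key=lambda item: len(item[0]))


def toy_score(subset: frozenset) -> float:
    """Hypothetical stand-in for a cross-validated score.

    The weights are invented for illustration and are not values
    from the study.
    """
    weights = {"I": 0.2, "II": 0.3, "V5": 0.15, "V6": 0.1}
    return sum(weights.get(lead, 0.0) for lead in sorted(subset))


subset, subset_score = backward_lead_elimination(LEADS, toy_score)
print(sorted(subset), round(subset_score, 3))
```

With the toy weights, leads that contribute nothing are dropped first, and the search stops shrinking the subset once removing a further lead would cost more than the tolerance.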
Results
The CML method required 19% more leads (equating to ~2 leads) than the DL method. Four common leads (I, II, V5, V6) appeared in every ECG lead subset identified with the CML method, whereas no leads were consistently present for the DL method. The average macro F1 scores were 0.761 for the DL method and 0.759 for the CML method.
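For readers unfamiliar with the metric, the macro F1 score is the unweighted mean of the per-class F1 scores, so rare classes count as much as common ones. The sketch below assumes a single-label setting for simplicity; the abstract does not state whether the study scored single-label or multi-label predictions. The class names and example labels are hypothetical.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (single-label assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Hypothetical labels for six recordings and three rhythm classes.
y_true = ["AF", "AF", "SR", "SR", "PVC", "PVC"]
y_pred = ["AF", "SR", "SR", "SR", "PVC", "AF"]
example_score = macro_f1(y_true, y_pred)
print(round(example_score, 3))
```

In this example the per-class F1 scores are 0.5 (AF), 0.8 (SR), and 2/3 (PVC), giving a macro F1 of about 0.656.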
Conclusions
Optimal ECG lead subsets provide classification accuracy similar to that of all 12 leads for both DL and CML methods. The DL method achieved slightly higher classification accuracy on larger data sets and required fewer ECG leads than the CML method.