Zhao Ren, N. Cummins, Vedhas Pandit, Jing Han, Kun Qian, Björn Schuller
{"title":"学习基于图像的心音分类表示","authors":"Zhao Ren, N. Cummins, Vedhas Pandit, Jing Han, Kun Qian, Björn Schuller","doi":"10.1145/3194658.3194671","DOIUrl":null,"url":null,"abstract":"Machine learning based heart sound classification represents an efficient technology that can help reduce the burden of manual auscultation through the automatic detection of abnormal heart sounds. In this regard, we investigate the efficacy of using the pre-trained Convolutional Neural Networks (CNNs) from large-scale image data for the classification of Phonocardiogram (PCG) signals by learning deep PCG representations. First, the PCG files are segmented into chunks of equal length. Then, we extract a scalogram image from each chunk using a wavelet transformation. Next, the scalogram images are fed into either a pre-trained CNN, or the same network fine-tuned on heart sound data. Deep representations are then extracted from a fully connected layer of each network and classification is achieved by a static classifier. Alternatively, the scalogram images are fed into an end-to-end CNN formed by adapting a pre-trained network via transfer learning. Key results indicate that our deep PCG representations extracted from a fine-tuned CNN perform the strongest, 56.2% mean accuracy, on our heart sound classification task. When compared to a baseline accuracy of 46.9%, gained using conventional audio processing features and a support vector machine, this is a significant relative improvement of 19.8% (p∠.001 by one-tailed z-test).","PeriodicalId":216658,"journal":{"name":"Proceedings of the 2018 International Conference on Digital Health","volume":"32 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"48","resultStr":"{\"title\":\"Learning Image-based Representations for Heart Sound Classification\",\"authors\":\"Zhao Ren, N. Cummins, Vedhas Pandit, Jing Han, Kun Qian, Björn Schuller\",\"doi\":\"10.1145/3194658.3194671\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Machine learning based heart sound classification represents an efficient technology that can help reduce the burden of manual auscultation through the automatic detection of abnormal heart sounds. In this regard, we investigate the efficacy of using the pre-trained Convolutional Neural Networks (CNNs) from large-scale image data for the classification of Phonocardiogram (PCG) signals by learning deep PCG representations. First, the PCG files are segmented into chunks of equal length. Then, we extract a scalogram image from each chunk using a wavelet transformation. Next, the scalogram images are fed into either a pre-trained CNN, or the same network fine-tuned on heart sound data. Deep representations are then extracted from a fully connected layer of each network and classification is achieved by a static classifier. Alternatively, the scalogram images are fed into an end-to-end CNN formed by adapting a pre-trained network via transfer learning. Key results indicate that our deep PCG representations extracted from a fine-tuned CNN perform the strongest, 56.2% mean accuracy, on our heart sound classification task. 
When compared to a baseline accuracy of 46.9%, gained using conventional audio processing features and a support vector machine, this is a significant relative improvement of 19.8% (p∠.001 by one-tailed z-test).\",\"PeriodicalId\":216658,\"journal\":{\"name\":\"Proceedings of the 2018 International Conference on Digital Health\",\"volume\":\"32 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-04-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"48\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2018 International Conference on Digital Health\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3194658.3194671\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2018 International Conference on Digital Health","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3194658.3194671","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Learning Image-based Representations for Heart Sound Classification
Machine-learning-based heart sound classification is an efficient technology that can help reduce the burden of manual auscultation through the automatic detection of abnormal heart sounds. In this regard, we investigate the efficacy of Convolutional Neural Networks (CNNs) pre-trained on large-scale image data for the classification of Phonocardiogram (PCG) signals by learning deep PCG representations. First, the PCG files are segmented into chunks of equal length. Then, we extract a scalogram image from each chunk using a wavelet transform. Next, the scalogram images are fed into either a pre-trained CNN, or the same network fine-tuned on heart sound data. Deep representations are then extracted from a fully connected layer of each network, and classification is achieved by a static classifier. Alternatively, the scalogram images are fed into an end-to-end CNN formed by adapting a pre-trained network via transfer learning. Key results indicate that our deep PCG representations extracted from a fine-tuned CNN perform strongest on our heart sound classification task, achieving a mean accuracy of 56.2%. Compared to a baseline accuracy of 46.9%, obtained using conventional audio processing features and a support vector machine, this is a significant relative improvement of 19.8% (p < .001 by a one-tailed z-test).
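To make the feature-extraction variant of this pipeline concrete, below is a minimal sketch: a PCG recording is segmented into equal-length chunks, each chunk is turned into a scalogram via a continuous wavelet transform, the scalograms are pushed through an ImageNet-pre-trained CNN, and deep representations are read off a fully connected layer for a downstream static classifier. The abstract leaves the concrete choices open, so the wavelet family (Morlet), the CNN architecture (torchvision's VGG16), the chunk length (four seconds), and the 224x224 image size below are all illustrative assumptions, not the authors' exact configuration.

```python
# A minimal sketch of the abstract's feature-extraction pipeline, under
# assumed choices: Morlet wavelet (PyWavelets), ImageNet-pre-trained VGG16
# (torchvision), 4-second chunks, and 4096-dim activations taken from the
# penultimate fully connected layer. The abstract does not specify these.
import numpy as np
import pywt
import torch
import torchvision.models as models
import torchvision.transforms.functional as TF
from scipy.io import wavfile


def segment(signal, chunk_len):
    """Split a 1-D PCG signal into equal-length chunks, dropping the remainder."""
    n = len(signal) // chunk_len
    return signal[: n * chunk_len].reshape(n, chunk_len)


def scalogram(chunk, num_scales=64):
    """Continuous wavelet transform -> magnitude scalogram (scales x time)."""
    coef, _ = pywt.cwt(chunk, scales=np.arange(1, num_scales + 1), wavelet="morl")
    return np.abs(coef)


def to_image_tensor(scalo):
    """Scale to [0, 1], replicate to 3 channels, resize to the CNN's 224x224 input."""
    scalo = (scalo - scalo.min()) / (scalo.max() - scalo.min() + 1e-8)
    img = torch.from_numpy(scalo).float().unsqueeze(0).repeat(3, 1, 1)
    img = TF.resize(img, [224, 224])
    # Normalise with ImageNet statistics, matching the network's pre-training.
    return TF.normalize(img, [0.485, 0.456, 0.406], [0.229, 0.224, 0.225])


# Pre-trained CNN with the final classification layer removed, so a forward
# pass returns the activations of the penultimate fully connected layer.
cnn = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
cnn.classifier = torch.nn.Sequential(*list(cnn.classifier.children())[:-1])
cnn.eval()


def deep_features(wav_path, chunk_seconds=4):
    """Deep PCG representations: one 4096-dim vector per equal-length chunk."""
    sr, signal = wavfile.read(wav_path)  # wav_path names a hypothetical PCG file
    chunks = segment(signal.astype(np.float32), chunk_seconds * sr)
    batch = torch.stack([to_image_tensor(scalogram(c)) for c in chunks])
    with torch.no_grad():
        return cnn(batch).numpy()
```

A static classifier, e.g. scikit-learn's `LinearSVC`, can then be fitted on these per-chunk representations. The fine-tuned and end-to-end transfer-learning variants described in the abstract would instead update the CNN weights on heart sound data before, or in place of, this final step.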