元音分割对慢性阻塞性肺疾病机器学习分类的影响。

IF 3.9 2区综合性期刊 Q1 MULTIDISCIPLINARY SCIENCES

Scientific Reports Pub Date : 2025-03-22 DOI:10.1038/s41598-025-95320-3

Alper Idrisoglu, Ana Luiza Dallora Moraes, Abbas Cheddad, Peter Anderberg, Andreas Jakobsson, Johan Sanmartin Berglund

{"title":"元音分割对慢性阻塞性肺疾病机器学习分类的影响。","authors":"Alper Idrisoglu, Ana Luiza Dallora Moraes, Abbas Cheddad, Peter Anderberg, Andreas Jakobsson, Johan Sanmartin Berglund","doi":"10.1038/s41598-025-95320-3","DOIUrl":null,"url":null,"abstract":"Vowel-based voice analysis is gaining attention as a potential non-invasive tool for COPD classification, offering insights into phonatory function. The growing need for voice data has necessitated the adoption of various techniques, including segmentation, to augment existing datasets for training comprehensive Machine Learning (ML) modelsThis study aims to investigate the possible effects of segmentation of the utterance of vowel \"a\" on the performance of ML classifiers CatBoost (CB), Random Forest (RF), and Support Vector Machine (SVM). This research involves training individual ML models using three distinct dataset constructions: full-sequence, segment-wise, and group-wise, derived from the utterance of the vowel \"a\" which consists of 1058 recordings belonging to 48 participants. This approach comprehensively analyzes how each data categorization impacts the model's performance and results. A nested cross-validation (nCV) approach was implemented with grid search for hyperparameter optimization. This rigorous methodology was employed to minimize overfitting risks and maximize model performance. Compared to the full-sequence dataset, the findings indicate that the second segment yielded higher results within the four-segment category. Specifically, the CB model achieved superior accuracy, attaining 97.8% and 84.6% on the validation and test sets, respectively. The same category for the CB model also demonstrated the best balance regarding true positive rate (TPR) and true negative rate (TNR), making it the most clinically effective choice. These findings suggest that time-sensitive properties in vowel production are important for COPD classification and that segmentation can aid in capturing these properties. Despite these promising results, the dataset size and demographic homogeneity limit generalizability, highlighting areas for future research.Trial registration The study is registered on clinicaltrials.gov with ID: NCT06160674.","PeriodicalId":21811,"journal":{"name":"Scientific Reports","volume":"15 1","pages":"9930"},"PeriodicalIF":3.9000,"publicationDate":"2025-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11929820/pdf/","citationCount":"0","resultStr":"{\"title\":\"Vowel segmentation impact on machine learning classification for chronic obstructive pulmonary disease.\",\"authors\":\"Alper Idrisoglu, Ana Luiza Dallora Moraes, Abbas Cheddad, Peter Anderberg, Andreas Jakobsson, Johan Sanmartin Berglund\",\"doi\":\"10.1038/s41598-025-95320-3\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Vowel-based voice analysis is gaining attention as a potential non-invasive tool for COPD classification, offering insights into phonatory function. The growing need for voice data has necessitated the adoption of various techniques, including segmentation, to augment existing datasets for training comprehensive Machine Learning (ML) modelsThis study aims to investigate the possible effects of segmentation of the utterance of vowel \\\"a\\\" on the performance of ML classifiers CatBoost (CB), Random Forest (RF), and Support Vector Machine (SVM). This research involves training individual ML models using three distinct dataset constructions: full-sequence, segment-wise, and group-wise, derived from the utterance of the vowel \\\"a\\\" which consists of 1058 recordings belonging to 48 participants. This approach comprehensively analyzes how each data categorization impacts the model's performance and results. A nested cross-validation (nCV) approach was implemented with grid search for hyperparameter optimization. This rigorous methodology was employed to minimize overfitting risks and maximize model performance. Compared to the full-sequence dataset, the findings indicate that the second segment yielded higher results within the four-segment category. Specifically, the CB model achieved superior accuracy, attaining 97.8% and 84.6% on the validation and test sets, respectively. The same category for the CB model also demonstrated the best balance regarding true positive rate (TPR) and true negative rate (TNR), making it the most clinically effective choice. These findings suggest that time-sensitive properties in vowel production are important for COPD classification and that segmentation can aid in capturing these properties. Despite these promising results, the dataset size and demographic homogeneity limit generalizability, highlighting areas for future research.Trial registration The study is registered on clinicaltrials.gov with ID: NCT06160674.\",\"PeriodicalId\":21811,\"journal\":{\"name\":\"Scientific Reports\",\"volume\":\"15 1\",\"pages\":\"9930\"},\"PeriodicalIF\":3.9000,\"publicationDate\":\"2025-03-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11929820/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Scientific Reports\",\"FirstCategoryId\":\"103\",\"ListUrlMain\":\"https://doi.org/10.1038/s41598-025-95320-3\",\"RegionNum\":2,\"RegionCategory\":\"综合性期刊\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"MULTIDISCIPLINARY SCIENCES\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Scientific Reports","FirstCategoryId":"103","ListUrlMain":"https://doi.org/10.1038/s41598-025-95320-3","RegionNum":2,"RegionCategory":"综合性期刊","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"MULTIDISCIPLINARY SCIENCES","Score":null,"Total":0}

引用次数: 0

摘要

基于元音的语音分析作为COPD分类的潜在非侵入性工具正受到关注，它提供了对发音功能的见解。对语音数据日益增长的需求使得需要采用各种技术，包括分割，来增强现有的数据集，以训练全面的机器学习（ML）模型。本研究旨在研究元音“a”的发音分割对ML分类器CatBoost （CB）、随机森林（RF）和支持向量机（SVM）性能的可能影响。本研究涉及使用三种不同的数据集结构来训练单个ML模型：全序列、分段式和分组式，这些数据集来源于元音“a”的发音，由属于48名参与者的1058条录音组成。该方法全面分析了每种数据分类如何影响模型的性能和结果。采用网格搜索的嵌套交叉验证方法进行超参数优化。这种严格的方法被用来最小化过拟合风险和最大化模型性能。与全序列数据集相比，研究结果表明，在四段分类中，第二段产生了更高的结果。具体而言，CB模型在验证集和测试集上的准确率分别达到97.8%和84.6%。同一类别的CB模型在真阳性率（TPR）和真阴性率（TNR）方面也表现出最佳的平衡，使其成为临床最有效的选择。这些研究结果表明，元音产生的时间敏感特性对COPD的分类很重要，而分段可以帮助捕捉这些特性。尽管有这些有希望的结果，但数据集的大小和人口统计的同质性限制了通用性，突出了未来研究的领域。该研究已在clinicaltrials.gov注册，ID: NCT06160674。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

Vowel segmentation impact on machine learning classification for chronic obstructive pulmonary disease.

查看原文本刊更多论文

Vowel segmentation impact on machine learning classification for chronic obstructive pulmonary disease.

Vowel-based voice analysis is gaining attention as a potential non-invasive tool for COPD classification, offering insights into phonatory function. The growing need for voice data has necessitated the adoption of various techniques, including segmentation, to augment existing datasets for training comprehensive Machine Learning (ML) modelsThis study aims to investigate the possible effects of segmentation of the utterance of vowel "a" on the performance of ML classifiers CatBoost (CB), Random Forest (RF), and Support Vector Machine (SVM). This research involves training individual ML models using three distinct dataset constructions: full-sequence, segment-wise, and group-wise, derived from the utterance of the vowel "a" which consists of 1058 recordings belonging to 48 participants. This approach comprehensively analyzes how each data categorization impacts the model's performance and results. A nested cross-validation (nCV) approach was implemented with grid search for hyperparameter optimization. This rigorous methodology was employed to minimize overfitting risks and maximize model performance. Compared to the full-sequence dataset, the findings indicate that the second segment yielded higher results within the four-segment category. Specifically, the CB model achieved superior accuracy, attaining 97.8% and 84.6% on the validation and test sets, respectively. The same category for the CB model also demonstrated the best balance regarding true positive rate (TPR) and true negative rate (TNR), making it the most clinically effective choice. These findings suggest that time-sensitive properties in vowel production are important for COPD classification and that segmentation can aid in capturing these properties. Despite these promising results, the dataset size and demographic homogeneity limit generalizability, highlighting areas for future research.Trial registration The study is registered on clinicaltrials.gov with ID: NCT06160674.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Scientific Reports Natural Science Disciplines-

CiteScore

7.50

自引率

4.30%

发文量

19567

审稿时长

3.9 months

期刊介绍： We publish original research from all areas of the natural sciences, psychology, medicine and engineering. You can learn more about what we publish by browsing our specific scientific subject areas below or explore Scientific Reports by browsing all articles and collections. Scientific Reports has a 2-year impact factor: 4.380 (2021), and is the 6th most-cited journal in the world, with more than 540,000 citations in 2020 (Clarivate Analytics, 2021). •Engineering Engineering covers all aspects of engineering, technology, and applied science. It plays a crucial role in the development of technologies to address some of the world''s biggest challenges, helping to save lives and improve the way we live. •Physical sciences Physical sciences are those academic disciplines that aim to uncover the underlying laws of nature — often written in the language of mathematics. It is a collective term for areas of study including astronomy, chemistry, materials science and physics. •Earth and environmental sciences Earth and environmental sciences cover all aspects of Earth and planetary science and broadly encompass solid Earth processes, surface and atmospheric dynamics, Earth system history, climate and climate change, marine and freshwater systems, and ecology. It also considers the interactions between humans and these systems. •Biological sciences Biological sciences encompass all the divisions of natural sciences examining various aspects of vital processes. The concept includes anatomy, physiology, cell biology, biochemistry and biophysics, and covers all organisms from microorganisms, animals to plants. •Health sciences The health sciences study health, disease and healthcare. This field of study aims to develop knowledge, interventions and technology for use in healthcare to improve the treatment of patients.