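[Construction of recognition models for subthreshold depression based on multiple machine learning algorithms and vocal emotional characteristics].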

JCR: Q3, Medicine
Meimei Chen, Yang Wang, Huangwei Lei, Fei Zhang, Ruina Huang, Zhaoyang Yang
DOI: 10.12122/j.issn.1673-4254.2025.04.05 (https://doi.org/10.12122/j.issn.1673-4254.2025.04.05)
Journal: 南方医科大学学报杂志 (Journal of Southern Medical University), vol. 45, no. 4, pp. 711-717
Publication date: 2025-04-20
Publication type: Journal Article
Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12037279/pdf/
Indexed via: Semanticscholar
Citations: 0

Abstract


Objectives: To construct vocal recognition classification models using 6 machine learning algorithms and vocal emotional characteristics of individuals with subthreshold depression to facilitate early identification of subthreshold depression.

Methods: We collected voice data from both normal individuals and participants with subthreshold depression by asking them to read specifically chosen words and texts. From each voice sample, 384 vocal emotional feature variables were extracted, including energy features, Mel-frequency cepstral coefficients (MFCCs), zero-crossing rate, voicing probability, fundamental frequency, and difference (delta) features. Recursive Feature Elimination (RFE) was used to select voice feature variables. Classification models were then built with six machine learning algorithms, namely Adaptive Boosting (AdaBoost), Random Forest (RF), Linear Discriminant Analysis (LDA), Logistic Regression (LR), Lasso Regression (LRLasso), and Support Vector Machine (SVM), and the performance of these models was evaluated. To assess the generalization capability of the models, we used real-world speech data to evaluate the best-performing classification models.
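
The workflow described in the Methods (feature selection with RFE, then six classifiers compared on a held-out test set) can be sketched roughly as follows with scikit-learn. This is only an illustrative sketch, not the authors' implementation: the data are synthetic stand-ins for the 384-dimensional feature vectors, and the RFE base estimator, the number of retained features (30), the train/test split, and all hyperparameters are assumptions. Reading "LRLasso" as L1-penalized logistic regression is also an assumption.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for 384-dimensional vocal feature vectors (assumption:
# the real study used features extracted from recorded speech).
X, y = make_classification(
    n_samples=100, n_features=384, n_informative=20, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Standardize, then run Recursive Feature Elimination. The base estimator,
# the target of 30 features, and the 10% elimination step are assumptions.
selector = make_pipeline(
    StandardScaler(),
    RFE(LogisticRegression(max_iter=1000), n_features_to_select=30, step=0.1),
)
X_train_sel = selector.fit_transform(X_train, y_train)
X_test_sel = selector.transform(X_test)

# The six algorithm families named in the abstract; hyperparameters are
# illustrative defaults, not values reported by the authors.
models = {
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "LDA": LinearDiscriminantAnalysis(),
    "LR": LogisticRegression(max_iter=1000),
    # Assumption: "LRLasso" taken to mean L1-penalized logistic regression.
    "LRLasso": LogisticRegression(penalty="l1", solver="liblinear"),
    "SVM": SVC(kernel="linear"),
}

# Fit each model on the selected training features and record test accuracy.
accuracies = {}
for name, model in models.items():
    model.fit(X_train_sel, y_train)
    accuracies[name] = model.score(X_test_sel, y_test)

for name, acc in accuracies.items():
    print(f"{name}: test accuracy = {acc:.3f}")
```

Fitting the scaler and the RFE selector on the training split only, and then applying them unchanged to the test split, keeps test data from influencing feature selection. The same fitted `selector` and models could then be applied to a separate real-world dataset to check generalization, as the abstract describes.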

Results: On the word-reading speech test set, the AdaBoost, RF, and LDA models achieved prediction accuracies of 100%, 100%, and 93.3%, respectively. On the text-reading speech test set, their accuracies were 90%, 80%, and 90%, respectively, while the accuracies of the other 3 models were all below 80%. On real-world word-reading and text-reading speech data, the AdaBoost and RF models still achieved high predictive accuracies (91.7% and 80.6% for AdaBoost, and 86.1% and 77.8% for RF, respectively).

Conclusions: Analyzing vocal emotional characteristics allows effective identification of individuals with subthreshold depression. The AdaBoost and RF models show excellent performance in classifying individuals with subthreshold depression and may therefore offer valuable assistance in clinical and research settings.

Source journal: 南方医科大学学报杂志 (Medicine: Medicine (all))
CiteScore: 1.50
Self-citation rate: 0.00%
Articles published: 208