AI-driven speech biomarkers for disease diagnosis and monitoring: a systematic review and meta-analysis.

Impact factor 7.6; Zone 3 (Medicine); JCR Q1; MEDICINE, GENERAL & INTERNAL
Yi Yang, Xiaoyan Zhao, Peng Zhao, Dire Ying, Junyu Wang, Yihe Jiang, Qiaoqin Wan
Journal: BMJ Evidence-Based Medicine
Publication date: 2025-10-08
Publication type: Journal Article
DOI: 10.1136/bmjebm-2025-113759 (https://doi.org/10.1136/bmjebm-2025-113759)

Abstract

Objective: This study aims to comprehensively review the literature on the use of speech biomarkers in disease diagnosis and monitoring, focusing on recording protocols, speech tasks, speech features and processing algorithms.

Study design: Systematic review and meta-analysis.

Data sources: We conducted a search of six databases: PubMed, Embase, Scopus, Web of Science, PsycINFO and IEEE Xplore, covering studies published from database inception to May 2024.

Main outcome measures: The quality of the included studies was assessed using the Quality Assessment of Diagnostic Accuracy Studies tool (QUADAS-2) and the Quality Assessment of Prognostic Accuracy Studies (QUAPAS). Pooled sensitivity and specificity were calculated using a random-effects model. Subgroup analyses examined potential sources of heterogeneity, such as disease type, language, speech tasks, features and algorithms.
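
As an illustration of random-effects pooling, the following is a minimal Python sketch, not the authors' analysis. The abstract does not state which random-effects model was fitted, and diagnostic accuracy reviews often use a bivariate model that pools sensitivity and specificity jointly. The sketch instead assumes the simpler univariate DerSimonian-Laird estimator on the logit scale, applied to sensitivity and specificity separately. The function name pool_logit_random_effects and the 2x2 counts are hypothetical and chosen only for demonstration.

```python
import math


def _inv_logit(x):
    """Map a logit back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))


def pool_logit_random_effects(events, totals):
    """Pool proportions with DerSimonian-Laird random effects on the logit scale.

    Returns (estimate, ci_low, ci_high, tau2), where tau2 is the
    estimated between-study variance of the logits.
    """
    logits, variances = [], []
    for e, n in zip(events, totals):
        # 0.5 continuity correction keeps the logit finite when e == 0 or e == n
        a = e + 0.5
        b = n - e + 0.5
        logits.append(math.log(a / b))
        variances.append(1.0 / a + 1.0 / b)

    # Fixed-effect (inverse-variance) step, needed for Cochran's Q
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, logits)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, logits))

    # DerSimonian-Laird moment estimate of tau^2, truncated at zero
    k = len(logits)
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance
    w_re = [1.0 / (v + tau2) for v in variances]
    mu = sum(wi * yi for wi, yi in zip(w_re, logits)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return _inv_logit(mu), _inv_logit(mu - 1.96 * se), _inv_logit(mu + 1.96 * se), tau2


# Hypothetical per-study counts: (TP, FN, TN, FP). Not data from the review.
studies = [(40, 10, 35, 15), (18, 6, 20, 4), (55, 9, 60, 12), (25, 8, 22, 9)]

sens = pool_logit_random_effects(
    [tp for tp, fn, tn, fp in studies], [tp + fn for tp, fn, tn, fp in studies]
)
spec = pool_logit_random_effects(
    [tn for tp, fn, tn, fp in studies], [tn + fp for tp, fn, tn, fp in studies]
)
print("pooled sensitivity (estimate, low, high, tau2):", sens)
print("pooled specificity (estimate, low, high, tau2):", spec)
```

Pooling on the logit scale keeps the back-transformed confidence limits inside (0, 1). A bivariate model would additionally account for the correlation between sensitivity and specificity across studies, which this sketch ignores.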

Results: A total of 96 studies were included, with 83 adopting a cross-sectional design and 50 having sample sizes of fewer than 100 participants. Assessment with QUADAS-2 and QUAPAS revealed that most included studies exhibited a high risk of bias in patient selection and index test domains, while concerns regarding applicability were generally low across studies. These studies covered 20 different diseases, with cognitive disorders, depression and Parkinson's disease being the most frequently studied. The pooled sensitivity and specificity for diagnostic models were 0.80 (95% CI 0.74 to 0.86) and 0.77 (95% CI 0.69 to 0.84) for psychiatric disorders (11 studies, n=2577); 0.85 (95% CI 0.83 to 0.88) and 0.83 (95% CI 0.79 to 0.86) for cognitive disorders (27 studies, n=2068); and 0.81 (95% CI 0.76 to 0.85) and 0.83 (95% CI 0.78 to 0.88) for movement disorders (20 studies, n=852). Further subgroup analyses identified recording device, language, speech task, speech features and algorithm selection as significant contributors to heterogeneity.

Conclusions: This review and meta-analysis of 96 studies highlights the influence of devices, environments, languages, tasks, features and algorithms on speech model performance across diseases. While speech biomarkers show promise for screening and monitoring, particularly via smartphones, the high risk of bias in many studies, especially in patient selection and index test interpretation, limits the strength of current evidence. Future large-scale, prospective studies are needed to validate generalisability and support clinical implementation.

PROSPERO registration number: CRD42024551962.

Source journal: BMJ Evidence-Based Medicine (MEDICINE, GENERAL & INTERNAL)
CiteScore: 8.90
Self-citation rate: 3.40%
Articles published: 48
About the journal: BMJ Evidence-Based Medicine (BMJ EBM) publishes original evidence-based research, insights and opinions on what matters for health care. We focus on the tools, methods, and concepts that are basic and central to practising evidence-based medicine and deliver relevant, trustworthy and impactful evidence. BMJ EBM is a Plan S compliant Transformative Journal and adheres to the highest possible industry standards for editorial policies and publication ethics.