Plug-and-Play Microphones for Recording Speech and Voice with Smart Devices.

Impact factor 1.1 · Zone 4 (Medicine) · JCR Q3 · Audiology & Speech-Language Pathology
Folia Phoniatrica et Logopaedica Pub Date : 2024-01-01 Epub Date: 2023-11-16 DOI:10.1159/000535152
Gustavo Noffs, Matthew Cobler-Lichter, Thushara Perera, Scott C Kolbe, Helmut Butzkueven, Frederique M C Boonstra, Anneke van der Walt, Adam P Vogel
Pages: 372-385
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11309067/pdf/
Citations: 0

Abstract

Introduction: Smart devices are widely available and capable of quickly recording and uploading speech segments for health-related analysis. The switch from laboratory recordings with professional-grade microphone setups to remote, smart device-based recordings offers immense potential for the scalability of voice assessment. Yet a growing body of literature points to wide heterogeneity among acoustic metrics in their robustness to variation in recording devices. The addition of consumer-grade plug-and-play microphones has been proposed as a possible solution. The aim of our study was to assess whether the addition of consumer-grade plug-and-play microphones increases the acoustic measurement agreement between ultra-portable devices and a reference microphone.

Methods: Speech was simultaneously recorded by a reference high-quality microphone commonly used in research and by two configurations with plug-and-play microphones. Twelve speech-acoustic features were calculated using recordings from each microphone to determine the agreement intervals in measurements between microphones. Agreement intervals were then compared to expected deviations in speech in various neurological conditions. Each microphone's response to speech and to silence was characterized through acoustic analysis to explore possible reasons for differences in acoustic measurements between microphones. The statistical differentiation of two groups, neurotypical speakers and people with multiple sclerosis, using metrics from each tested microphone was compared to that of the reference microphone.
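
The abstract does not state how the agreement intervals were computed. As an illustration only, the sketch below assumes a Bland-Altman style 95% limits-of-agreement calculation on paired measurements of one feature (for example f0 from the reference microphone and from a plug-and-play microphone). The function names, the `z=1.96` multiplier, and the comparison rule against a disease-expected deviation are all assumptions, not the authors' method.

```python
import statistics


def agreement_interval(reference, test, z=1.96):
    """Return (bias, lower, upper) for paired measurements.

    Assumption: Bland-Altman style limits of agreement, i.e. the mean of the
    paired differences plus/minus z standard deviations of those differences.
    """
    if len(reference) != len(test) or len(reference) < 2:
        raise ValueError("need two equal-length series with at least 2 pairs")
    # Difference of each test-microphone value from its reference value.
    diffs = [t - r for r, t in zip(reference, test)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation
    return bias, bias - z * sd, bias + z * sd


def within_expected_deviation(lower, upper, expected_deviation):
    """Hypothetical check: is the widest limit smaller than the deviation
    expected from disease, so the microphone effect would not mask it?"""
    return max(abs(lower), abs(upper)) < expected_deviation
```

Under these assumptions, a feature would count as robust to the microphone change when `within_expected_deviation` returns `True` for the deviation reported in the relevant clinical literature.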

Results: The two consumer-grade plug-and-play microphones favored high frequencies (mean center of gravity difference ≥ +175.3 Hz) and recorded more noise (mean difference in signal-to-noise ratio ≤ -4.2 dB) when compared to the reference microphone. Between consumer-grade microphones, differences in relative noise were closely related to distance between the microphone and the speaker's mouth. Agreement intervals between the reference and consumer-grade microphones remained under disease-expected deviations only for fundamental frequency (f0, agreement interval ≤0.06 Hz), f0 instability (f0 CoV, agreement interval ≤0.05%), and tracking of second formant movement (agreement interval ≤1.4 Hz/ms). Agreement between microphones was poor for other metrics, particularly for fine timing metrics (mean pause length and pause length variability for various tasks). The statistical difference between the two groups of speakers was smaller with the plug-and-play microphones than with the reference microphone.
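
To make the two microphone-characterization measures concrete, here is a minimal sketch of a spectral center of gravity and a signal-to-noise ratio. It assumes the center of gravity is the power-weighted mean frequency of the spectrum and that SNR is the ratio of mean power in a speech segment to mean power in a silence segment, expressed in dB. The abstract does not specify either definition, so both formulas and all function names here are assumptions; the naive DFT is only suitable for short illustrative signals.

```python
import cmath
import math


def mean_power(samples):
    """Mean squared amplitude of a segment."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(speech, silence):
    """Assumed SNR definition: 10*log10(speech power / silence power)."""
    return 10.0 * math.log10(mean_power(speech) / mean_power(silence))


def center_of_gravity(samples, sample_rate):
    """Power-weighted mean frequency (Hz) over bins 0..Nyquist.

    Uses a direct O(n^2) DFT to stay dependency-free.
    """
    n = len(samples)
    weighted = 0.0
    total = 0.0
    for k in range(n // 2 + 1):
        coeff = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        power = abs(coeff) ** 2
        weighted += (k * sample_rate / n) * power  # bin frequency times power
        total += power
    return weighted / total
```

With these definitions, a microphone that emphasizes high frequencies yields a higher center of gravity for the same utterance, and a noisier microphone yields a lower SNR, which matches the direction of the differences reported above.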

Conclusion: Measurement of f0 and F2 slope was robust to variation in recording equipment, while other acoustic metrics were not. Thus, the tested plug-and-play microphones should not be used interchangeably with professional-grade microphones for speech analysis. Plug-and-play microphones may assist in equipment standardization within speech studies, including remote or self-recording, possibly with a small loss in accuracy and statistical power, as observed in the current study.

Source journal: Folia Phoniatrica et Logopaedica
Subject categories: Audiology & Speech-Language Pathology; Otorhinolaryngology
CiteScore: 2.30
Self-citation rate: 10.00%
Articles published: 28
Review time: >12 weeks
Journal description: Published since 1947, "Folia Phoniatrica et Logopaedica" provides a forum for international research on the anatomy, physiology, and pathology of structures of the speech, language, and hearing mechanisms. Original papers published in this journal report new findings on basic function, assessment, management, and test development in communication sciences and disorders, as well as experiments designed to test specific theories of speech, language, and hearing function. Review papers of high quality are also welcomed.