Cross-Validation of the Acoustic Roughness Index in German
Itsuki Kitayama, Kiyohito Hosokawa, Bernhard Lehnert, Kenji Aruga, Hidenori Inohara, Ben Barsties V Latoszek
Journal of Voice (impact factor 2.4; JCR Q1, Audiology & Speech-Language Pathology). Published 2025-10-06. DOI: 10.1016/j.jvoice.2025.09.030 (https://doi.org/10.1016/j.jvoice.2025.09.030)
Abstract
Objective: The aim of this study was to validate the Acoustic Roughness Index (ARI) for German-speaking participants by examining its correlation with perceived vocal roughness and its diagnostic accuracy in distinguishing rough from non-rough voices.
Methods: Voice samples from 218 adult speakers (175 with dysphonia and 43 vocally healthy controls) were recorded using a sustained vowel /a:/ and a standardized 27-syllable passage of continuous speech (approximately 3 seconds), concatenated into a single sample per participant. Three experienced raters judged the roughness severity of each sample using the R-parameter from the Grade, Roughness, Breathiness, Asthenia, Strain scale (ranging from normal to severe). Intra- and inter-rater reliability were assessed with Cohen's kappa and Fleiss' kappa, respectively. Acoustic analysis was performed using the ARI algorithm implemented in the software Praat. Concurrent validity was evaluated by Spearman rank correlation (r_s) between ARI scores and perceptual roughness. Diagnostic validity was assessed via receiver operating characteristic (ROC) analysis, which determined the optimal ARI threshold for identifying rough voices.
Results: Intra-rater reliability for roughness was moderate (mean Cohen's κ = 0.45) and inter-rater agreement was fair (Fleiss' κ = 0.35), indicating the inherent variability of perceptual roughness judgments. ARI scores demonstrated a sufficiently high correlation with perceived roughness (r_s = 0.726, P < 0.001, 95% confidence interval [CI] = 0.654-0.785). The area under the ROC curve was 0.824, reflecting good diagnostic accuracy. The Youden-optimal ARI threshold was 2.00, yielding 71.8% sensitivity and 79.3% specificity.
Conclusion: ARI appears to offer a potentially useful acoustic measure for assessing vocal roughness, though its robustness may be limited. Further research is necessary to improve the accuracy and reliability of the voice quality evaluation of roughness.
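The Youden-optimal threshold reported in the Results can be illustrated with a minimal Python sketch. It is not the authors' analysis code. The function name `youden_optimal_threshold`, the toy scores and labels, and the rule "classify as rough when the score is at or above the threshold" are all assumptions made for illustration; only the sensitivity and specificity in the last lines come from the abstract.

```python
def youden_optimal_threshold(scores, labels):
    """Return (threshold, sensitivity, specificity, J) maximizing Youden's J.

    Assumption: a sample is classified as rough (label 1) when its score
    is greater than or equal to the candidate threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best = None
    for t in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        true_neg = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        sensitivity = true_pos / positives
        specificity = true_neg / negatives
        j = sensitivity + specificity - 1  # Youden's J statistic
        if best is None or j > best[3]:
            best = (t, sensitivity, specificity, j)
    return best


# Made-up toy data, NOT from the study: 0 = non-rough, 1 = rough.
toy_scores = [0.5, 1.0, 1.5, 2.5, 1.8, 2.2, 3.0, 3.5]
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_result = youden_optimal_threshold(toy_scores, toy_labels)

# Youden's J implied by the sensitivity and specificity reported in the abstract.
reported_j = 0.718 + 0.793 - 1
print(toy_result, round(reported_j, 3))
```

Applied to the reported values, J = 0.718 + 0.793 - 1 = 0.511 at the ARI threshold of 2.00.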
About the journal:
The Journal of Voice is widely regarded as the world's premier journal for voice medicine and research. This peer-reviewed publication is listed in Index Medicus and is indexed by the Institute for Scientific Information. The journal contains articles written by experts throughout the world on all topics in voice sciences, voice medicine and surgery, and speech-language pathologists' management of voice-related problems. The journal includes clinical articles, clinical research, and laboratory research. Members of the Foundation receive the journal as a benefit of membership.