Journal of Voice: Latest Articles

Evaluation of Visual Feedback for fo and SPL in Subglottal Pressure Measurements-A Methodological Study.
IF 2.4, Zone 4 (Medicine)
Journal of Voice Pub Date : 2025-09-04 DOI: 10.1016/j.jvoice.2025.08.005
Anna Lundblad, Maria Södersten, Svante Granqvist
Publication type: Journal Article
Abstract
Objective: Subglottal pressure is a clinically relevant parameter for assessment of voice disorders and correlates to fo and sound pressure level (SPL). The aim of the current study was to evaluate the use of a visual target for feedback of fo and SPL in subglottal pressure measurements in habitual voice and at phonation threshold level with a syllable string and a phrase for the purpose of improving the reliability of subglottal pressure measurements.
Methods: Data from 12 vocally healthy women (29-61 years) was analyzed. Subglottal pressure was measured and compared in three conditions A: in habitual voice versus at phonation threshold, B: production of a syllable string versus a phrase, C: with visual feedback of fo and SPL versus no visual feedback. Two raters analyzed the pressure data for calculations of intra- and interrater reliability and found in general high agreement (Intra-rater agreement between 96% and 98% for habitual voice and 91% and 88% for phonation threshold. Inter-rater agreement was 95% for habitual voice and 80% for phonation threshold.).
Results: The procedure generated a large amount of valid pressure data for the habitual voice recordings but not for those at phonation threshold. No major differences were found between the syllable string and the phrase. The main advantage of visual feedback was to control for SPL in recordings of habitual voice, but less advantage was observed at the phonation threshold. Surprisingly, 10 of 12 participants phonated slightly closer to the target fo without visual feedback.
Conclusions: The procedure appears to improve control of SPL at habitual voice. The control of fo was improved for participants with large fo deviations but slightly deteriorated for participants with small deviations. Thus, the procedure using visual feedback has the potential to improve the reliability of subglottal pressure measurements.
Citations: 0
Is the Botulinum Toxin Injection Into the Cricopharyngeal Sphincter Precipitate Laryngopharyngeal Reflux Symptoms in Patients With Retrograde Cricopharyngeal Dysfunction?
IF 2.4, Zone 4 (Medicine)
Journal of Voice Pub Date : 2025-09-04 DOI: 10.1016/j.jvoice.2025.08.021
Jérôme R Lechien, Marie Mailly, Stéphane Hans
Publication type: Journal Article
Abstract
Objective: To investigate the potential relationship between retrograde cricopharyngeal dysfunction (R-CPD) and laryngopharyngeal reflux disease (LPRD) at baseline and whether cricopharyngeal sphincter paralysis botulinum toxin injection (BTI) is associated with an increase of LPRD symptoms in treated R-CPD patients.
Methods: Patients with clinical diagnosis of R-CPD were prospectively recruited from two European hospitals. Controls included individuals unable to burp without troublesome symptoms (CT1) and healthy subjects able to burp (CT2). All participants completed the Burp Score and Reflux Symptom Score-12 (RSS-12) at baseline. R-CPD patients underwent office-based electromyography-guided BTI followed by a 3- to 6-month follow-up evaluation.
Results: Forty-two R-CPD patients and 133 gender- and age-matched controls (30 CT1, 103 CT2) completed baseline evaluations. Burp scores were significantly higher in the R-CPD and CT1 groups compared to CT2, with CT1 subjects presenting mild symptom scores significantly exceeding CT2 levels. No significant differences in RSS-12 total scores were observed between R-CPD and CT2 subjects. Among 38 R-CPD patients completing postBTI evaluation (22 responders), RSS-12 total scores remained stable. Dysphonia and dysphagia scores significantly increased post treatment, potentially representing BTI-related adverse events.
Conclusion: This preliminary clinical study supports that R-CPD and LPRD are distinct clinical disorders, with BTI treatment improving R-CPD symptoms without significantly increasing LPRD symptoms.
Citations: 0
Mirtazapine 30 mg as a Potential New Therapy for the Treatment of Laryngeal Sensory Neuropathy.
IF 2.4, Zone 4 (Medicine)
Journal of Voice Pub Date : 2025-09-04 DOI: 10.1016/j.jvoice.2025.08.020
Khaled Mohamed Abdelzaher, Marowa Abd El Wahab, Mostafa Nasr Zayed
Publication type: Journal Article
Abstract
Objective: Laryngeal sensory neuropathy (LSN) is an irritating laryngeal disorder that may cause intractable cough, globus sensation, and frequent throat clearing. The diagnosis is typically done by exclusion, even if the postulated etiologies are viral, allergic, or idiopathic. The study aims to introduce mirtazapine 30 mg as a potential new therapy.
Study design: Pilot study.
Subjects and methods: Eighty LSN patients who did not respond to gabapentin and/or amitriptyline were gathered from Minia University Hospital's otorhinolaryngology department and divided into two groups: The Case Group (n = 40) received mirtazapine. In contrast, the Placebo Group (n = 40) administered a placebo capsule once daily for one month. Patients were asked to rank their symptoms on a scale of 0 to 5 in pretreatment and post treatment questionnaires. Voice Handicap Index (VHI) and acoustic analysis were also used as parameters for evaluation. Evidence of treatment intolerance and adverse effects was also documented.
Results: After 1 month of mirtazapine therapy, the mean post treatment chief complaint severity rating was 1.2, while the mean pretreatment one was 3.9. Treatment significantly improved VHI and acoustic scores (P < 0.001) in the case group and differed significantly from the placebo group. Two patients did not tolerate the medications due to dizziness, and four were missed during follow-up.
Conclusion: Mirtazapine 30 seems to be a practical therapy choice for LSN. More research is required to compare the results of this treatment protocol with those of other medications used to treat LSN, including systemic drugs or local interventions such as superior laryngeal nerve block injections, to provide additional context and validation for our findings.
Citations: 0
Feasibility and Acceptability of Complete Vocal Technique-Voice Therapy as a Treatment for Primary Muscle Tension Dysphonia: A Feasibility Trial.
IF 2.4, Zone 4 (Medicine)
Journal of Voice Pub Date : 2025-09-04 DOI: 10.1016/j.jvoice.2025.07.044
Julian McGlashan, Mathias Aaen, Anna White, Brian Saccente-Kennedy, Mark Tempesta, Cathrine Sadolin
Publication type: Journal Article
Abstract
Aims and objectives: Primary muscle tension dysphonia (pMTD) is a common cause of voice disorders and is treated by speech and language pathologists (SLPs). Some singing teachers specializing in the habilitation of the performance voice also have rehabilitation skills helping singers recover from illness. The aim of this pilot study was to assess the feasibility and acceptability of using a structured and well-characterized habilitation and rehabilitation pedagogic technique for singers, The Complete Vocal Technique (CVT), in the treatment of patients with speaking voice problems due to pMTD. The three study objectives were to: 1) assess the feasibility of recruiting and retaining participants in a CVT-VT program; 2) assess the feasibility of using CVT voice therapy (CVT-VT) to improve the voice and voice function; and 3) assess the acceptability of this approach to patients, the CVT practitioner (CVT-P), and the supervising SLP.
Study design: Preregistered, uncontrolled, prospective feasibility study.
Methods: Patients with pMTD meeting the inclusion criteria in the 6-month trial period were offered up to six telehealth sessions of CVT-VT delivered by the CVT-P. Patients underwent a multidimensional assessment [Voice Handicap Index (VHI), attainment of goals for treatment, Vocal Tract Discomfort Scale (VTDS), acoustic/electroglottographic (EGG) measures of sustained vowels, Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V) audio-perceptual evaluation, and maximum phonation time (MPT)] pretherapy and post therapy. Feasibility was assessed by meeting a priori recruitment targets and acceptability assessment framework.
Results: Eleven participants completed the study protocol demonstrating recruitment feasibility. All multidimensional measures, except MPT, showed improvement, demonstrating feasibility of CVT-VT to improve the voice, voice function, vocal tract discomfort, and achievement of goals. All patients and the CVT-P rated the acceptability of therapy as either very satisfactory or satisfactory.
Conclusions: CVT-VT is a feasible and acceptable form of treatment and warrants further evaluation as an additional tool for voice therapy in patients with pMTD.
Trial registration: Clinicaltrials.gov website (NCT05365126 Unique Protocol ID: 19ET004). Registered 06 May 2022, https://beta.clinicaltrials.gov/study/NCT05365126?patient=Muscle%20Tension%20Dysphonia&locStr=Nottingham,%20UK&lat=52.9540223&lng=-1.1549892&distance=50.
Citations: 0
Acoustic Metrics of the Strain Dimension of Voice Quality: A Scoping Review.
IF 2.4, Zone 4 (Medicine)
Journal of Voice Pub Date : 2025-09-03 DOI: 10.1016/j.jvoice.2025.08.018
Chia-Hsin Wu, Lady Catherine Cantor-Cutiva, Eric J Hunter
Publication type: Journal Article
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12416763/pdf/
Abstract
Background: Strained voice quality-commonly referred to as vocal strain-is a hallmark of functional voice disorders such as muscle tension dysphonia and is often associated with vocal fatigue and laryngeal hyperfunction. Although listeners describe it as excessive vocal effort, strained voice quality frequently overlaps perceptually with breathiness and roughness, complicating reliable assessment. Despite its clinical relevance, no standardized acoustic definition of strained voice quality has been established.
Purpose: This review aims to identify and summarize the voice acoustic parameters reported in the literature to quantify the strain dimension of voice quality.
Methods: A scoping review with systematic elements was conducted using four databases (ScienceDirect, PubMed, Virtual Health Library, and Web of Science) covering 1996 to 2024. Of 311 identified records, 13 met the inclusion criteria. Extracted data included definitions of vocal strain, perceptual assessment tools, acoustic metrics, and methodological details.
Results: Strain was consistently treated as a perceptual attribute of voice quality, often described as the impression of excessive effort. Common acoustic metrics included cepstral peak prominence (CPP), spectral slope, low-to-high (L/H) spectral ratio, and relative fundamental frequency (RFF). While several measures showed moderate-to-strong correlations with perceptual ratings of strain, methodological variability across studies limited direct comparisons and interpretability.
Conclusion and recommendations: Strained voice quality is perceptually complex and acoustically multifaceted. While no single metric reliably captures its full scope, acoustic measures-particularly spectral and cepstral features-can complement perceptual assessments. A multimodal approach that integrates listener-based impressions, acoustic analysis, and, where possible, physiological data is recommended to improve diagnostic consistency and guide future research into strain-sensitive voice assessment tools.
Citations: 0
Psychological Correlates of Auditory-Motor Integration in Primary Muscle Tension Dysphonia: A Preliminary Study.
IF 2.4, Zone 4 (Medicine)
Journal of Voice Pub Date : 2025-09-03 DOI: 10.1016/j.jvoice.2025.07.030
Sirvan Savareh Sonj, Farhad Torabinezhad, Arezoo Saffarian, Jamileh Abolghasemi, Roozbeh Behroozmand
Publication type: Journal Article
Abstract
Objective: Primary Muscle Tension Dysphonia (pMTD) is a functional voice disorder characterized by excessive laryngeal muscle tension and vocal hyperfunction, often linked to psychological factors and impaired vocal motor control. This preliminary study investigates the relationship between psychological constructs and auditory-motor integration in pMTD, focusing on vocal compensation responses to altered auditory feedback (AAF).
Methods: Twenty-one individuals with pMTD (mean age: 35.4) participated in a reflexive AAF paradigm, producing sustained vowels while receiving brief (±100 cents) pitch-shift perturbations. Vocal compensation magnitude was measured and correlated with scores from the Beck Anxiety Inventory (BAI), Beck Depression Inventory-II (BDI-II), Behavioral Inhibition System/Behavioral Activation System (BIS/BAS) Dominance Ratio, and Voice Handicap Index (VHI). Pearson and Spearman correlations, along with multiple regression analyses, were applied.
Results: Significant correlations were found between higher BDI-II scores and reduced compensation for upward (r = 0.314, P < 0.05) and downward (r = -0.447, P < 0.05) pitch shifts. Elevated BIS/BAS dominance ratio was associated with weaker compensation for upward (r = 0.498, P < 0.05) and downward (r = -0.442, P < 0.05) shifts. BDI-II positively correlated with BAI (r = 0.512, P < 0.05) and BIS/BAS Dominance Ratio (r = 0.390, P < 0.05). No significant correlations were observed between BAI or VHI and vocal compensation. However, multiple regression analyses did not identify significant predictors of compensation magnitude, though trends suggested possible roles for depression and BIS dominance.
Conclusion: These findings underscore the critical role of psychological factors, particularly depression and BIS dominance, in modulating auditory-motor integration in pMTD, contributing to its pathophysiology. The observed impairments in vocal compensation highlight the need for further research to elucidate these psychomotor mechanisms and their impact on vocal motor control, paving the way for targeted interventions.
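The ±100 cent pitch shifts mentioned in the abstract above can be related to frequency through the standard definition of the cent (1200 cents per octave), so a shift of c cents multiplies frequency by 2^(c/1200). The following is a minimal sketch of that conversion only; the function names and the 200 Hz baseline are hypothetical illustrations and are not taken from the study.

```python
def cents_to_ratio(cents: float) -> float:
    """Convert a pitch interval in cents to a frequency ratio (1200 cents = 1 octave)."""
    return 2.0 ** (cents / 1200.0)


def shift_frequency(f_hz: float, cents: float) -> float:
    """Return the frequency obtained by shifting f_hz by the given number of cents."""
    return f_hz * cents_to_ratio(cents)


if __name__ == "__main__":
    baseline_hz = 200.0  # hypothetical fundamental frequency, for illustration only
    for cents in (100.0, -100.0):
        shifted = shift_frequency(baseline_hz, cents)
        print(f"{cents:+.0f} cents -> {shifted:.2f} Hz")
```

For the hypothetical 200 Hz baseline, a 100 cent shift (one equal-tempered semitone) moves the feedback by roughly 12 Hz upward or 11 Hz downward.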
Citations: 0
A Rare Case of Spontaneous Regression of Laryngeal Amyloidosis.
IF 2.4, Zone 4 (Medicine)
Journal of Voice Pub Date : 2025-09-02 DOI: 10.1016/j.jvoice.2025.08.023
Talitha Kumaresan Lewis, Denis Lafreniere, Poornima Hegde
Publication type: Journal Article
Abstract
Objective: To review a rare case of spontaneous regression of laryngeal amyloidosis and current management of the disease.
Introduction: While laryngeal amyloidosis is rare (<1% of benign laryngeal lesions), it is the most common site of localized head and neck amyloidosis. The most common presenting symptoms include dysphonia and dyspnea. The mainstay of treatment is surgical excision. We report a rare case of spontaneously regressing laryngeal amyloidosis, which has not been previously reported in the literature to the best of the authors' knowledge.
Methods: Single case study.
Results: We report the case of an 82-year-old male recently diagnosed with rheumatoid arthritis on methotrexate who presented with 6 months of dysphonia and significant worsening over the past month. Flexible laryngoscopy revealed an irregular mass of the right true vocal cord with hypomobility. Given concern for possible malignancy, a CT scan was obtained, which revealed a multilobulated soft tissue mass of the right true vocal cord with 15 mm caudal extension into the subglottic airway. Pathology was consistent with amyloidosis. Hematology workup revealed no systemic involvement. Subsequent laryngoscopy two months later demonstrated spontaneous regression of 85% of the lesion and return of normal vocal cord mobility. Given significant improvement, surgical intervention was deferred for surveillance monitoring. Eight months after the initial presentation, the patient's dysphonia and lesion continue to regress.
Conclusion: While laryngeal amyloidosis is rare (<1% of benign laryngeal lesions), it is the most common site of localized head and neck amyloidosis. We report a rare case of spontaneously regressing laryngeal amyloidosis, which has not been previously reported in the literature to the best of the authors' knowledge. This is a rare phenomenon as laryngeal amyloidosis typically requires surgical excision. As this case highlights, localized laryngeal amyloidosis has favorable outcomes, but diligent surveillance is required given the risk of recurrence.
Citations: 0
Accurate Analysis of the Pitch Pulse-Based Magnitude/Phase Structure of Natural Vowels and Assessment of Three Lightweight Time/Frequency Voicing Restoration Methods.
IF 2.4, Zone 4 (Medicine)
Journal of Voice Pub Date : 2025-09-02 DOI: 10.1016/j.jvoice.2025.08.011
Aníbal J S Ferreira, Luis M T Jesus, Laurentino M M Leal, Jorge E F Spratley
Publication type: Journal Article
Abstract
This paper addresses two challenges that are intertwined and are key in informing signal processing methods restoring natural (voiced) speech from whispered speech. The first challenge involves characterizing and modeling the evolution of the harmonic phase/magnitude structure of a sequence of individual pitch periods in a voiced region of natural speech comprising sustained or co-articulated vowels. A novel algorithm segmenting individual pitch pulses is proposed, which is then used to obtain illustrative results highlighting important differences between sustained and co-articulated vowels, and suggesting practical synthetic voicing approaches. The second challenge involves model-based synthetic voicing restoration in real-time and on-the-fly. Three implementation alternatives are described that differ in their signal reconstruction approaches: frequency-domain, combined frequency- and time-domain, and physiologically inspired filtering of glottal excitation pulses individually generated. The three alternatives are compared objectively using illustrative examples, and subjectively using the results of listening tests involving synthetic voicing of sustained and co-articulated vowels in word context.
Citations: 0
GBNF-VAE: A Pathological Voice Enhancement Model Based on Gold Section for Bottleneck Feature With Variational Autoencoder
IF 2.4, Zone 4 (Medicine)
Journal of Voice Pub Date : 2025-09-01 DOI: 10.1016/j.jvoice.2023.03.012
Ganjun Liu, Tao Zhang, Biyun Ding, Ying Lv, Xiaohui Hou, Haoyang Guo, Yaqin Wu, Dehui Fu
Publication type: Journal Article
Volume 39, Issue 5, Pages 1171-1182
Abstract
Objective: Speech enhancement has become a promising technique to accommodate demands of the improvement in quality of a degraded speech signal. The main works now focus on separating normal speech from noise, but have neglected the low quality of impaired speech influenced by anomalous glottis flow. In order to effectively enhance the pathological speech, it is essential to design a separation mechanism for extracting high-dimensional timbre features and speech features separately to suppress low-dimensional noises.
Methods: In this paper, we propose an enhancement model GBNF-VAE to extract timbre efficiently by reducing anomalous airflow noise interference, and by combining the semantic features with timbre features to synthesize the enhanced speech. In particular, the bottleneck feature can characterize the timbre by the controlled number of nodes through the Golden Section method, which effectively improves computational efficiency. In addition, variational autoencoder is adopted to extract semantic features which are combined with the previous timbre features to synthesize the enhanced speech.
Results: Finally, spectrum observation, objective indicators and subjective evaluation all show the outstanding performance of GBNF-VAE in pathological speech quality enhancement.
Citations: 0
Acoustical Theory of Vowel Modification Strategies in Belting
IF 2.4, Zone 4 (Medicine)
Journal of Voice Pub Date : 2025-09-01 DOI: 10.1016/j.jvoice.2023.01.004
Christian T. Herbst, Brad H. Story, David Meyer
Publication type: Journal Article
Volume 39, Issue 5, Pages 1192-1204
Abstract
Various authors have argued that belting is to be produced by “speech-like” sounds, with the first and second supraglottic vocal tract resonances ($f_{R1}$ and $f_{R2}$) at frequencies of the vowels determined by the lyrics to be sung. Acoustically, the hallmark of belting has been identified as a dominant second harmonic, possibly enhanced by first resonance tuning ($f_{R1} \approx 2 f_o$). It is not clear how both these concepts – (a) phonating with “speech-like,” unmodified vowels; and (b) producing a belting sound with a dominant second harmonic, typically enhanced by $f_{R1}$ – can be upheld when singing across a singer’s entire musical pitch range. For instance, anecdotal reports from pedagogues suggest that vowels with a low $f_{R1}$, such as [i] or [u], might have to be modified considerably (by raising $f_{R1}$) in order to phonate at higher pitches. These issues were systematically addressed in silico with respect to treble singing, using a linear source-filter voice production model. The dominant harmonic of the radiated spectrum was assessed in 12987 simulations, covering a parameter space of 37 fundamental frequencies ($f_o$) across the musical pitch range from C3 to C6; 27 voice source spectral slope settings from $-4$ to $-30$ dB/octave; computed for 13 different IPA vowels. The results suggest that, for most unmodified vowels, the stereotypical belting sound characteristics with a dominant second harmonic can only be produced over a pitch range of about a musical fifth, centered at $f_o \approx 0.5 f_{R1}$. In the [ɔ] and [ɑ] vowels, that range is extended to an octave, supported by a low second resonance. Data aggregation – considering the relative prevalence of vowels in American English – suggests that, historically, belting with $f_{R1} \approx 2 f_o$ was derived from speech, and that songs with an extended musical pitch range likely demand considerable vowel modification. We thus argue that – on acoustical grounds – the pedagogical commandment for belting with unmodified, “speech-like” vowels can not always be fulfilled.
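The simulation count quoted in the abstract above (12987) is the product of the three parameter counts, 37 × 27 × 13, and can be checked by enumerating the grid. The sketch below is illustrative only and rests on several assumptions not confirmed by the abstract: semitone steps from C3 to C6 and 1 dB/octave steps for the source slope (both consistent with the stated counts of 37 and 27), equal temperament with A4 = 440 Hz, MIDI note numbering with C4 = 60, and placeholder vowel labels. It does not implement the authors' source-filter model.

```python
from itertools import product


def midi_to_hz(midi_note: int, a4_hz: float = 440.0) -> float:
    """Equal-tempered frequency of a MIDI note number (A4 = note 69)."""
    return a4_hz * 2.0 ** ((midi_note - 69) / 12.0)


# Assumption: C3 and C6 as MIDI notes 48 and 84 (C4 = 60 convention), semitone steps.
C3, C6 = 48, 84
pitches_hz = [midi_to_hz(n) for n in range(C3, C6 + 1)]  # 37 values

# Assumption: source spectral slope from -4 to -30 dB/octave in 1 dB steps.
slopes_db_per_octave = list(range(-4, -31, -1))  # 27 values

# Hypothetical placeholders standing in for the 13 IPA vowels.
vowels = [f"vowel_{i}" for i in range(13)]

# Full parameter grid: one tuple per simulated condition.
grid = list(product(pitches_hz, slopes_db_per_octave, vowels))


def fo_for_first_resonance_tuning(f_r1_hz: float) -> float:
    """Fundamental frequency at which the second harmonic sits on f_R1 (f_o = 0.5 * f_R1)."""
    return 0.5 * f_r1_hz


if __name__ == "__main__":
    print(len(pitches_hz), len(slopes_db_per_octave), len(vowels), len(grid))
    print(f"C3 = {pitches_hz[0]:.2f} Hz, C6 = {pitches_hz[-1]:.2f} Hz")
```

Under these assumptions the grid has exactly 12987 entries, matching the abstract, and the pitch axis spans roughly 131 Hz to 1047 Hz.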
Citations: 0