Journal of Voice | Pub Date: 2025-09-04 | DOI: 10.1016/j.jvoice.2025.08.005
Anna Lundblad, Maria Södersten, Svante Granqvist
{"title":"Evaluation of Visual Feedback for f<sub>o</sub> and SPL in Subglottal Pressure Measurements-A Methodological Study.","authors":"Anna Lundblad, Maria Södersten, Svante Granqvist","doi":"10.1016/j.jvoice.2025.08.005","DOIUrl":"https://doi.org/10.1016/j.jvoice.2025.08.005","url":null,"abstract":"<p><strong>Objective: </strong>Subglottal pressure is a clinically relevant parameter for assessment of voice disorders and correlates to f<sub>o</sub> and sound pressure level (SPL). The aim of the current study was to evaluate the use of a visual target for feedback of f<sub>o</sub> and SPL in subglottal pressure measurements in habitual voice and at phonation threshold level with a syllable string and a phrase for the purpose of improving the reliability of subglottal pressure measurements.</p><p><strong>Methods: </strong>Data from 12 vocally healthy women (29-61 years) was analyzed. Subglottal pressure was measured and compared in three conditions A: in habitual voice versus at phonation threshold, B: production of a syllable string versus a phrase, C: with visual feedback of f<sub>o</sub> and SPL versus no visual feedback. Two raters analyzed the pressure data for calculations of intra- and interrater reliability and found in general high agreement (Intra-rater agreement between 96% and 98% for habitual voice and 91% and 88% for phonation threshold. Inter-rater agreement was 95% for habitual voice and 80% for phonation threshold.).</p><p><strong>Results: </strong>The procedure generated a large amount of valid pressure data for the habitual voice recordings but not for those at phonation threshold. No major differences were found between the syllable string and the phrase. The main advantage of visual feedback was to control for SPL in recordings of habitual voice, but less advantage was observed at the phonation threshold. 
Surprisingly, 10 of 12 participants phonated slightly closer to the target f<sub>o</sub> without visual feedback.</p><p><strong>Conclusions: </strong>The procedure appears to improve control of SPL at habitual voice. The control of f<sub>o</sub> was improved for participants with large f<sub>o</sub> deviations but slightly deteriorated for participants with small deviations. Thus, the procedure using visual feedback has the potential to improve the reliability of subglottal pressure measurements.</p>","PeriodicalId":49954,"journal":{"name":"Journal of Voice","volume":" ","pages":""},"PeriodicalIF":2.4,"publicationDate":"2025-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145006749","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Journal of Voice | Pub Date: 2025-09-04 | DOI: 10.1016/j.jvoice.2025.08.021
Jérôme R Lechien, Marie Mailly, Stéphane Hans
{"title":"Is the Botulinum Toxin Injection Into the Cricopharyngeal Sphincter Precipitate Laryngopharyngeal Reflux Symptoms in Patients With Retrograde Cricopharyngeal Dysfunction?","authors":"Jérôme R Lechien, Marie Mailly, Stéphane Hans","doi":"10.1016/j.jvoice.2025.08.021","DOIUrl":"https://doi.org/10.1016/j.jvoice.2025.08.021","url":null,"abstract":"<p><strong>Objective: </strong>To investigate the potential relationship between retrograde cricopharyngeal dysfunction (R-CPD) and laryngopharyngeal reflux disease (LPRD) at baseline and whether cricopharyngeal sphincter paralysis botulinum toxin injection (BTI) is associated with an increase of LPRD symptoms in treated R-CPD patients.</p><p><strong>Methods: </strong>Patients with clinical diagnosis of R-CPD were prospectively recruited from two European hospitals. Controls included individuals unable to burp without troublesome symptoms (CT1) and healthy subjects able to burp (CT2). All participants completed the Burp Score and Reflux Symptom Score-12 (RSS-12) at baseline. R-CPD patients underwent office-based electromyography-guided BTI followed by a 3- to 6-month follow-up evaluation.</p><p><strong>Results: </strong>Forty-two R-CPD patients and 133 gender- and age-matched controls (30 CT1, 103 CT2) completed baseline evaluations. Burp scores were significantly higher in the R-CPD and CT1 groups compared to CT2, with CT1 subjects presenting mild symptom scores significantly exceeding CT2 levels. No significant differences in RSS-12 total scores were observed between R-CPD and CT2 subjects. Among 38 R-CPD patients completing postBTI evaluation (22 responders), RSS-12 total scores remained stable. 
Dysphonia and dysphagia scores significantly increased post treatment, potentially representing BTI-related adverse events.</p><p><strong>Conclusion: </strong>This preliminary clinical study supports that R-CPD and LPRD are distinct clinical disorders, with BTI treatment improving R-CPD symptoms without significantly increasing LPRD symptoms.</p>","PeriodicalId":49954,"journal":{"name":"Journal of Voice","volume":" ","pages":""},"PeriodicalIF":2.4,"publicationDate":"2025-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145006732","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Journal of Voice | Pub Date: 2025-09-04 | DOI: 10.1016/j.jvoice.2025.08.020
Khaled Mohamed Abdelzaher, Marowa Abd El Wahab, Mostafa Nasr Zayed
{"title":"Mirtazapine 30 mg as a Potential New Therapy for the Treatment of Laryngeal Sensory Neuropathy.","authors":"Khaled Mohamed Abdelzaher, Marowa Abd El Wahab, Mostafa Nasr Zayed","doi":"10.1016/j.jvoice.2025.08.020","DOIUrl":"https://doi.org/10.1016/j.jvoice.2025.08.020","url":null,"abstract":"<p><strong>Objective: </strong>Laryngeal sensory neuropathy (LSN) is an irritating laryngeal disorder that may cause intractable cough, globus sensation, and frequent throat clearing. The diagnosis is typically done by exclusion, even if the postulated etiologies are viral, allergic, or idiopathic. The study aims to introduce mirtazapine 30 mg as a potential new therapy.</p><p><strong>Study design: </strong>Pilot study.</p><p><strong>Subjects and methods: </strong>Eighty LSN patients who did not respond to gabapentin and/or amitriptyline were gathered from Minia University Hospital's otorhinolaryngology department and divided into two groups: The Case Group (n = 40) received mirtazapine. In contrast, the Placebo Group (n = 40) administered a placebo capsule once daily for one month. Patients were asked to rank their symptoms on a scale of 0 to 5 in pretreatment and post treatment questionnaires. Voice Handicap Index (VHI) and acoustic analysis were also used as parameters for evaluation. Evidence of treatment intolerance and adverse effects was also documented.</p><p><strong>Results: </strong>After 1 month of mirtazapine therapy, the mean post treatment chief complaint severity rating was 1.2, while the mean pretreatment one was 3.9. Treatment significantly improved VHI and acoustic scores (P < 0.001) in the case group and differed significantly from the placebo group. Two patients did not tolerate the medications due to dizziness, and four were missed during follow-up.</p><p><strong>Conclusion: </strong>Mirtazapine 30 seems to be a practical therapy choice for LSN. 
More research is required to compare the results of this treatment protocol with those of other medications used to treat LSN, including systemic drugs or local interventions such as superior laryngeal nerve block injections, to provide additional context and validation for our findings.</p>","PeriodicalId":49954,"journal":{"name":"Journal of Voice","volume":" ","pages":""},"PeriodicalIF":2.4,"publicationDate":"2025-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145006788","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Journal of Voice | Pub Date: 2025-09-04 | DOI: 10.1016/j.jvoice.2025.07.044
Julian McGlashan, Mathias Aaen, Anna White, Brian Saccente-Kennedy, Mark Tempesta, Cathrine Sadolin
{"title":"Feasibility and Acceptability of Complete Vocal Technique-Voice Therapy as a Treatment for Primary Muscle Tension Dysphonia: A Feasibility Trial.","authors":"Julian McGlashan, Mathias Aaen, Anna White, Brian Saccente-Kennedy, Mark Tempesta, Cathrine Sadolin","doi":"10.1016/j.jvoice.2025.07.044","DOIUrl":"https://doi.org/10.1016/j.jvoice.2025.07.044","url":null,"abstract":"<p><strong>Aims and objectives: </strong>Primary muscle tension dysphonia (pMTD) is a common cause of voice disorders and is treated by speech and language pathologists (SLPs). Some singing teachers specializing in the habilitation of the performance voice also have rehabilitation skills helping singers recover from illness. The aim of this pilot study was to assess the feasibility and acceptability of using a structured and well-characterized habilitation and rehabilitation pedagogic technique for singers, The Complete Vocal Technique (CVT), in the treatment of patients with speaking voice problems due to pMTD. The three study objectives were to: 1) assess the feasibility of recruiting and retaining participants in a CVT-VT program; 2) assess the feasibility of using CVT voice therapy (CVT-VT) to improve the voice and voice function; and 3) assess the acceptability of this approach to patients, the CVT practitioner (CVT-P), and the supervising SLP.</p><p><strong>Study design: </strong>Preregistered, uncontrolled, prospective feasibility study.</p><p><strong>Methods: </strong>Patients with pMTD meeting the inclusion criteria in the 6-month trial period were offered up to six telehealth sessions of CVT-VT delivered by the CVT-P. Patients underwent a multidimensional assessment [Voice Handicap Index (VHI), attainment of goals for treatment, Vocal Tract Discomfort Scale (VTDS), acoustic/electroglottographic (EGG) measures of sustained vowels, Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V) audio-perceptual evaluation, and maximum phonation time (MPT)] pretherapy and post therapy. 
Feasibility was assessed by meeting a priori recruitment targets and acceptability assessment framework.</p><p><strong>Results: </strong>Eleven participants completed the study protocol demonstrating recruitment feasibility. All multidimensional measures, except MPT, showed improvement, demonstrating feasibility of CVT-VT to improve the voice, voice function, vocal tract discomfort, and achievement of goals. All patients and the CVT-P rated the acceptability of therapy as either very satisfactory or satisfactory.</p><p><strong>Conclusions: </strong>CVT-VT is a feasible and acceptable form of treatment and warrants further evaluation as an additional tool for voice therapy in patients with pMTD.</p><p><strong>Trial registration: </strong>Clinicaltrials.gov website (NCT05365126 Unique Protocol ID: 19ET004). Registered 06 May 2022, https://beta.clinicaltrials.gov/study/NCT05365126?patient=Muscle%20Tension%20Dysphonia&locStr=Nottingham,%20UK&lat=52.9540223&lng=-1.1549892&distance=50.</p>","PeriodicalId":49954,"journal":{"name":"Journal of Voice","volume":" ","pages":""},"PeriodicalIF":2.4,"publicationDate":"2025-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145006778","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Journal of Voice | Pub Date: 2025-09-03 | DOI: 10.1016/j.jvoice.2025.08.018
Chia-Hsin Wu, Lady Catherine Cantor-Cutiva, Eric J Hunter
{"title":"Acoustic Metrics of the Strain Dimension of Voice Quality: A Scoping Review.","authors":"Chia-Hsin Wu, Lady Catherine Cantor-Cutiva, Eric J Hunter","doi":"10.1016/j.jvoice.2025.08.018","DOIUrl":"10.1016/j.jvoice.2025.08.018","url":null,"abstract":"<p><strong>Background: </strong>Strained voice quality-commonly referred to as vocal strain-is a hallmark of functional voice disorders such as muscle tension dysphonia and is often associated with vocal fatigue and laryngeal hyperfunction. Although listeners describe it as excessive vocal effort, strained voice quality frequently overlaps perceptually with breathiness and roughness, complicating reliable assessment. Despite its clinical relevance, no standardized acoustic definition of strained voice quality has been established.</p><p><strong>Purpose: </strong>This review aims to identify and summarize the voice acoustic parameters reported in the literature to quantify the strain dimension of voice quality.</p><p><strong>Methods: </strong>A scoping review with systematic elements was conducted using four databases (ScienceDirect, PubMed, Virtual Health Library, and Web of Science) covering 1996 to 2024. Of 311 identified records, 13 met the inclusion criteria. Extracted data included definitions of vocal strain, perceptual assessment tools, acoustic metrics, and methodological details.</p><p><strong>Results: </strong>Strain was consistently treated as a perceptual attribute of voice quality, often described as the impression of excessive effort. Common acoustic metrics included cepstral peak prominence (CPP), spectral slope, low-to-high (L/H) spectral ratio, and relative fundamental frequency (RFF). 
While several measures showed moderate-to-strong correlations with perceptual ratings of strain, methodological variability across studies limited direct comparisons and interpretability.</p><p><strong>Conclusion and recommendations: </strong>Strained voice quality is perceptually complex and acoustically multifaceted. While no single metric reliably captures its full scope, acoustic measures-particularly spectral and cepstral features-can complement perceptual assessments. A multimodal approach that integrates listener-based impressions, acoustic analysis, and, where possible, physiological data is recommended to improve diagnostic consistency and guide future research into strain-sensitive voice assessment tools.</p>","PeriodicalId":49954,"journal":{"name":"Journal of Voice","volume":" ","pages":""},"PeriodicalIF":2.4,"publicationDate":"2025-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12416763/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145001887","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Psychological Correlates of Auditory-Motor Integration in Primary Muscle Tension Dysphonia: A Preliminary Study.","authors":"Sirvan Savareh Sonj, Farhad Torabinezhad, Arezoo Saffarian, Jamileh Abolghasemi, Roozbeh Behroozmand","doi":"10.1016/j.jvoice.2025.07.030","DOIUrl":"https://doi.org/10.1016/j.jvoice.2025.07.030","url":null,"abstract":"<p><strong>Objective: </strong>Primary Muscle Tension Dysphonia (pMTD) is a functional voice disorder characterized by excessive laryngeal muscle tension and vocal hyperfunction, often linked to psychological factors and impaired vocal motor control. This preliminary study investigates the relationship between psychological constructs and auditory-motor integration in pMTD, focusing on vocal compensation responses to altered auditory feedback (AAF).</p><p><strong>Methods: </strong>Twenty-one individuals with pMTD (mean age: 35.4) participated in a reflexive AAF paradigm, producing sustained vowels while receiving brief (±100 cents) pitch-shift perturbations. Vocal compensation magnitude was measured and correlated with scores from the Beck Anxiety Inventory (BAI), Beck Depression Inventory-II (BDI-II), Behavioral Inhibition System/Behavioral Activation System (BIS/BAS) Dominance Ratio, and Voice Handicap Index (VHI). Pearson and Spearman correlations, along with multiple regression analyses, were applied.</p><p><strong>Results: </strong>Significant correlations were found between higher BDI-II scores and reduced compensation for upward (r = 0.314, P < 0.05) and downward (r = -0.447, P < 0.05) pitch shifts. Elevated BIS/BAS dominance ratio was associated with weaker compensation for upward (r = 0.498, P < 0.05) and downward (r = -0.442, P < 0.05) shifts. BDI-II positively correlated with BAI (r = 0.512, P < 0.05) and BIS/BAS Dominance Ratio (r = 0.390, P < 0.05). No significant correlations were observed between BAI or VHI and vocal compensation. 
However, multiple regression analyses did not identify significant predictors of compensation magnitude, though trends suggested possible roles for depression and BIS dominance.</p><p><strong>Conclusion: </strong>These findings underscore the critical role of psychological factors, particularly depression and BIS dominance, in modulating auditory-motor integration in pMTD, contributing to its pathophysiology. The observed impairments in vocal compensation highlight the need for further research to elucidate these psychomotor mechanisms and their impact on vocal motor control, paving the way for targeted interventions.</p>","PeriodicalId":49954,"journal":{"name":"Journal of Voice","volume":" ","pages":""},"PeriodicalIF":2.4,"publicationDate":"2025-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145001854","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
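The ±100 cents pitch-shift perturbation mentioned in the abstract above can be related to frequency with the standard cents formula (1200 cents per octave). The sketch below is only an illustration of that unit conversion: the 200 Hz baseline is a hypothetical value, and the helper names are not taken from the study.

```python
import math


def cents_to_ratio(cents: float) -> float:
    """Frequency ratio corresponding to a pitch interval in cents."""
    return 2.0 ** (cents / 1200.0)


def hz_to_cents(f_hz: float, f_ref_hz: float) -> float:
    """Pitch distance of f_hz from f_ref_hz, in cents."""
    return 1200.0 * math.log2(f_hz / f_ref_hz)


# Hypothetical baseline fo; the abstract does not report one.
baseline_hz = 200.0

# Feedback heard by the speaker under a +100 or -100 cent shift.
shifted_up_hz = baseline_hz * cents_to_ratio(100.0)
shifted_down_hz = baseline_hz * cents_to_ratio(-100.0)

print(f"+100 cents: {shifted_up_hz:.2f} Hz")
print(f"-100 cents: {shifted_down_hz:.2f} Hz")
print(f"round trip: {hz_to_cents(shifted_up_hz, baseline_hz):.1f} cents")
```

A vocal response measured in Hz can be passed through `hz_to_cents` in the same way, so that it is expressed in the same unit as the perturbation.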
{"title":"A Rare Case of Spontaneous Regression of Laryngeal Amyloidosis.","authors":"Talitha Kumaresan Lewis, Denis Lafreniere, Poornima Hegde","doi":"10.1016/j.jvoice.2025.08.023","DOIUrl":"https://doi.org/10.1016/j.jvoice.2025.08.023","url":null,"abstract":"<p><strong>Objective: </strong>To review a rare case of spontaneous regression of laryngeal amyloidosis and current management of the disease.</p><p><strong>Introduction: </strong>While laryngeal amyloidosis is rare (<1% of benign laryngeal lesions), it is the most common site of localized head and neck amyloidosis. The most common presenting symptoms include dysphonia and dyspnea. The mainstay of treatment is surgical excision. We report a rare case of spontaneously regressing laryngeal amyloidosis, which has not been previously reported in the literature to the best of the authors' knowledge.</p><p><strong>Methods: </strong>Single case study.</p><p><strong>Results: </strong>We report the case of an 82-year-old male recently diagnosed with rheumatoid arthritis on methotrexate who presented with 6 months of dysphonia and significant worsening over the past month. Flexible laryngoscopy revealed an irregular mass of the right true vocal cord with hypomobility. Given concern for possible malignancy, a CT scan was obtained, which revealed a multilobulated soft tissue mass of the right true vocal cord with 15 mm caudal extension into the subglottic airway. Pathology was consistent with amyloidosis. Hematology workup revealed no systemic involvement. Subsequent laryngoscopy two months later demonstrated spontaneous regression of 85% of the lesion and return of normal vocal cord mobility. Given significant improvement, surgical intervention was deferred for surveillance monitoring. 
Eight months after the initial presentation, the patient's dysphonia and lesion continue to regress.</p><p><strong>Conclusion: </strong>While laryngeal amyloidosis is rare (<1% of benign laryngeal lesions), it is the most common site of localized head and neck amyloidosis. We report a rare case of spontaneously regressing laryngeal amyloidosis, which has not been previously reported in the literature to the best of the authors' knowledge. This is a rare phenomenon as laryngeal amyloidosis typically requires surgical excision. As this case highlights, localized laryngeal amyloidosis has favorable outcomes, but diligent surveillance is required given the risk of recurrence.</p>","PeriodicalId":49954,"journal":{"name":"Journal of Voice","volume":" ","pages":""},"PeriodicalIF":2.4,"publicationDate":"2025-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144994138","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Journal of Voice | Pub Date: 2025-09-02 | DOI: 10.1016/j.jvoice.2025.08.011
Aníbal J S Ferreira, Luis M T Jesus, Laurentino M M Leal, Jorge E F Spratley
{"title":"Accurate Analysis of the Pitch Pulse-Based Magnitude/Phase Structure of Natural Vowels and Assessment of Three Lightweight Time/Frequency Voicing Restoration Methods.","authors":"Aníbal J S Ferreira, Luis M T Jesus, Laurentino M M Leal, Jorge E F Spratley","doi":"10.1016/j.jvoice.2025.08.011","DOIUrl":"https://doi.org/10.1016/j.jvoice.2025.08.011","url":null,"abstract":"<p><p>This paper addresses two challenges that are intertwined and are key in informing signal processing methods restoring natural (voiced) speech from whispered speech. The first challenge involves characterizing and modeling the evolution of the harmonic phase/magnitude structure of a sequence of individual pitch periods in a voiced region of natural speech comprising sustained or co-articulated vowels. A novel algorithm segmenting individual pitch pulses is proposed, which is then used to obtain illustrative results highlighting important differences between sustained and co-articulated vowels, and suggesting practical synthetic voicing approaches. The second challenge involves model-based synthetic voicing restoration in real-time and on-the-fly. Three implementation alternatives are described that differ in their signal reconstruction approaches: frequency-domain, combined frequency- and time-domain, and physiologically inspired filtering of glottal excitation pulses individually generated. 
The three alternatives are compared objectively using illustrative examples, and subjectively using the results of listening tests involving synthetic voicing of sustained and co-articulated vowels in word context.</p>","PeriodicalId":49954,"journal":{"name":"Journal of Voice","volume":" ","pages":""},"PeriodicalIF":2.4,"publicationDate":"2025-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144994107","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Journal of Voice | Pub Date: 2025-09-01 | DOI: 10.1016/j.jvoice.2023.03.012
Ganjun Liu, Tao Zhang, Biyun Ding, Ying Lv, Xiaohui Hou, Haoyang Guo, Yaqin Wu, Dehui Fu
{"title":"GBNF-VAE: A Pathological Voice Enhancement Model Based on Gold Section for Bottleneck Feature With Variational Autoencoder","authors":"Ganjun Liu , Tao Zhang , Biyun Ding , Ying Lv , Xiaohui Hou , Haoyang Guo , Yaqin Wu , Dehui Fu","doi":"10.1016/j.jvoice.2023.03.012","DOIUrl":"10.1016/j.jvoice.2023.03.012","url":null,"abstract":"<div><h3>Objective</h3><div>Speech enhancement has become a promising technique to accommodate demands of the improvement in quality of a degraded speech signal. The main works now focus on separating normal speech from noise, but have neglected the low quality of impaired speech influenced by anomalous glottis flow. In order to effectively enhance the pathological speech, it is essential to design a separation mechanism for extracting high-dimensional timbre features and speech features separately to suppress low-dimensional noises.</div></div><div><h3>Methods</h3><div>In this paper, we propose an enhancement model GBNF-VAE to extract timbre efficiently by reducing anomalous airflow noise interference, and by combining the semantic features with timbre features to synthesize the enhanced speech. In particular, the bottleneck feature can characterize the timbre by the controlled number of nodes through the Golden Section method, which effectively improves computational efficiency. 
In addition, variational autoencoder is adopted to extract semantic features which are combined with the previous timbre features to synthesize the enhanced speech.</div></div><div><h3>Results</h3><div>Finally, spectrum observation, objective indicators and subjective evaluation all show the outstanding performance of GBNF-VAE in pathological speech quality enhancement.</div></div>","PeriodicalId":49954,"journal":{"name":"Journal of Voice","volume":"39 5","pages":"Pages 1171-1182"},"PeriodicalIF":2.4,"publicationDate":"2025-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9446654","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
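The abstract above does not say how the Golden Section method is applied to choose the number of bottleneck nodes. As a rough illustration of the generic technique only, the sketch below runs a textbook golden-section search over a unimodal function and rounds the result to an integer node count. The loss function, the search bounds, and all names are hypothetical stand-ins, not the authors' procedure.

```python
import math

INV_PHI = (math.sqrt(5.0) - 1.0) / 2.0  # 1 / golden ratio, about 0.618


def golden_section_minimize(f, lo, hi, tol=1e-6):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    a, b = float(lo), float(hi)
    c = b - INV_PHI * (b - a)
    d = a + INV_PHI * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            # Minimum lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - INV_PHI * (b - a)
            fc = f(c)
        else:
            # Minimum lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + INV_PHI * (b - a)
            fd = f(d)
    return (a + b) / 2.0


def toy_loss(n_nodes: float) -> float:
    """Hypothetical validation loss with its minimum at 40 nodes."""
    return (n_nodes - 40.0) ** 2


best = golden_section_minimize(toy_loss, 8, 256)
best_nodes = round(best)
print(best_nodes)
```

Each iteration reuses one of the two interior evaluations, so only one new evaluation of the (possibly expensive) objective is needed per step.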
Journal of Voice | Pub Date: 2025-09-01 | DOI: 10.1016/j.jvoice.2023.01.004
Christian T. Herbst, Brad H. Story, David Meyer
{"title":"Acoustical Theory of Vowel Modification Strategies in Belting","authors":"Christian T. Herbst , Brad H. Story , David Meyer","doi":"10.1016/j.jvoice.2023.01.004","DOIUrl":"10.1016/j.jvoice.2023.01.004","url":null,"abstract":"<div><div>Various authors have argued that belting is to be produced by “speech-like” sounds, with the first and second supraglottic vocal tract resonances (<span><math><msub><mi>f</mi><mrow><mi>R</mi><mn>1</mn></mrow></msub></math></span> and <span><math><msub><mi>f</mi><mrow><mi>R</mi><mn>2</mn></mrow></msub></math></span>) at frequencies of the vowels determined by the lyrics to be sung. Acoustically, the hallmark of belting has been identified as a dominant second harmonic, possibly enhanced by first resonance tuning (<span><math><mrow><msub><mi>f</mi><mrow><mi>R</mi><mn>1</mn></mrow></msub><mo>≈</mo><mn>2</mn><msub><mi>f</mi><mi>o</mi></msub></mrow></math></span>). It is not clear how both these concepts – (a) phonating with “speech-like,” unmodified vowels; and (b) producing a belting sound with a dominant second harmonic, typically enhanced by <span><math><msub><mi>f</mi><mrow><mi>R</mi><mn>1</mn></mrow></msub></math></span> – can be upheld when singing across a singer’s entire musical pitch range. For instance, anecdotal reports from pedagogues suggest that vowels with a low <span><math><msub><mi>f</mi><mrow><mi>R</mi><mn>1</mn></mrow></msub></math></span>, such as [i] or [u], might have to be modified considerably (by raising <span><math><msub><mi>f</mi><mrow><mi>R</mi><mn>1</mn></mrow></msub></math></span>) in order to phonate at higher pitches. These issues were systematically addressed <em>in silico</em> with respect to treble singing, using a linear source-filter voice production model. 
The dominant harmonic of the radiated spectrum was assessed in 12987 simulations, covering a parameter space of 37 fundamental frequencies (<span><math><msub><mi>f</mi><mi>o</mi></msub></math></span><span>) across the musical pitch range from C3 to C6; 27 voice source spectral slope settings from </span><span><math><mo>−</mo></math></span>4 to <span><math><mo>−</mo></math></span><span>30 dB/octave; computed for 13 different IPA vowels. The results suggest that, for most unmodified vowels, the stereotypical belting sound characteristics with a dominant second harmonic can only be produced over a pitch range of about a musical fifth, centered at </span><span><math><mrow><msub><mi>f</mi><mi>o</mi></msub><mo>≈</mo><mn>0.5</mn><msub><mi>f</mi><mrow><mi>R</mi><mn>1</mn></mrow></msub></mrow></math></span>. In the [ɔ] and [ɑ] vowels, that range is extended to an octave, supported by a low second resonance. Data aggregation – considering the relative prevalence of vowels in American English – suggests that, historically, belting with <span><math><mrow><msub><mi>f</mi><mrow><mi>R</mi><mn>1</mn></mrow></msub><mo>≈</mo><mn>2</mn><msub><mi>f</mi><mi>o</mi></msub></mrow></math></span> was derived from speech, and that songs with an extended musical pitch range likely demand considerable vowel modification. 
We thus argue that – on acoustical grounds – the pedagogical commandment for belting with unmodified, “speech-like” vowels can not always be fulfilled.</div></div>","PeriodicalId":49954,"journal":{"name":"Journal of Voice","volume":"39 5","pages":"Pages 1192-1204"},"PeriodicalIF":2.4,"publicationDate":"2025-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9414485","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
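The parameter space quoted in the abstract above (37 fundamental frequencies, 27 spectral slope settings, 13 vowels, 12987 simulations) can be checked with a few lines of arithmetic. The sketch below assumes semitone steps from C3 to C6, 1 dB/octave steps from -4 to -30 dB/octave, equal temperament with A4 = 440 Hz, and C4 = MIDI note 60. None of these step sizes or tuning choices is stated in the abstract, although they do reproduce the reported counts.

```python
A4_HZ = 440.0             # assumed tuning reference
C3_MIDI, C6_MIDI = 48, 84  # assumes C4 = MIDI note 60


def midi_to_hz(note: int) -> float:
    """Equal-tempered frequency of a MIDI note number."""
    return A4_HZ * 2.0 ** ((note - 69) / 12.0)


# 37 fundamental frequencies: every semitone from C3 to C6 inclusive.
fo_values = [midi_to_hz(n) for n in range(C3_MIDI, C6_MIDI + 1)]

# 27 source spectral slopes: -4 to -30 dB/octave in 1 dB/octave steps.
slopes_db_per_octave = list(range(-4, -31, -1))

N_VOWELS = 13  # number of IPA vowels reported in the abstract

total_simulations = len(fo_values) * len(slopes_db_per_octave) * N_VOWELS

# First-resonance tuning f_R1 ~ 2 * fo: the f_R1 each pitch would call for.
fr1_targets = [2.0 * fo for fo in fo_values]

print(len(fo_values), len(slopes_db_per_octave), total_simulations)
print(f"fo range: {fo_values[0]:.1f} Hz to {fo_values[-1]:.1f} Hz")
print(f"f_R1 target at C6: {fr1_targets[-1]:.0f} Hz")
```

The `fr1_targets` list illustrates why low-f_R1 vowels come under pressure at high pitches: under these assumptions the resonance needed for f_R1 ≈ 2 f_o doubles with every octave of f_o.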