Saskia Ibelings, Thomas Brand, Esther Ruigendijk, Inga Holube
Trends in Hearing, volume 28, article 23312165241261490. Published 2024-01-01. DOI: https://doi.org/10.1177/23312165241261490. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11273571/pdf/
Development of a Phrase-Based Speech-Recognition Test Using Synthetic Speech.
Speech-recognition tests are widely used in both clinical and research audiology. This study aimed to develop a novel speech-recognition test that combines concepts from different speech-recognition tests to reduce training effects and allow for a large set of speech material. Each trial of the new test presents four different words in a meaningful construct with a fixed structure, a so-called phrase. Various free databases were used to select the words and to determine their frequency. Highly frequent nouns were grouped into thematic categories and combined with related adjectives and infinitives. After discarding inappropriate and unnatural combinations and eliminating duplications of (sub-)phrases, a total of 772 phrases remained. Subsequently, the phrases were synthesized using a text-to-speech system. The synthesis significantly reduces the effort compared to recordings with a real speaker. After excluding outliers, speech-recognition scores measured with 31 normal-hearing participants at fixed signal-to-noise ratios (SNRs) revealed speech-recognition thresholds (SRTs) that varied by up to 4 dB across phrases. The median SRT was -9.1 dB SNR and thus comparable to existing sentence tests. The psychometric function's slope of 15 percentage points per dB is also comparable and enables efficient use in audiology. In summary, the principle of creating speech material in a modular system has many potential applications.
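To make the two reported parameters concrete, the following is a minimal sketch of how an SRT and a slope define a psychometric function. The abstract does not state which functional form the authors fitted; the logistic form below, parameterized so that its slope at the SRT equals the given value, is an assumption chosen for illustration. The function names `recognition_probability` and `snr_for_score` are hypothetical, and the defaults simply reuse the median SRT (-9.1 dB SNR) and slope (15 percentage points per dB, i.e. 0.15 per dB) quoted above.

```python
import math


def recognition_probability(snr_db, srt_db=-9.1, slope_per_db=0.15):
    """Assumed logistic psychometric function.

    Returns the expected proportion of correctly recognized words (0..1)
    at a given SNR. The factor 4 makes the derivative at the SRT equal to
    slope_per_db, because a logistic curve has slope k/4 at its midpoint.
    """
    return 1.0 / (1.0 + math.exp(-4.0 * slope_per_db * (snr_db - srt_db)))


def snr_for_score(score, srt_db=-9.1, slope_per_db=0.15):
    """Inverse of the function above: the SNR at which `score` is reached."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must lie strictly between 0 and 1")
    return srt_db + math.log(score / (1.0 - score)) / (4.0 * slope_per_db)


if __name__ == "__main__":
    # Tabulate the assumed curve around the median SRT.
    for snr in (-13.0, -11.0, -9.1, -7.0, -5.0):
        print(f"{snr:6.1f} dB SNR -> {recognition_probability(snr):.3f}")
```

By construction the curve passes through 50% at the SRT, so a steeper slope means that a small change in SNR produces a large change in score, which is why a slope of this size allows thresholds to be estimated with relatively few trials.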
Trends in Hearing | AUDIOLOGY & SPEECH-LANGUAGE PATHOLOGY | OTORHINOLARYNGOLOGY
CiteScore: 4.50
Self-citation rate: 11.10%
Articles published: 44
Review time: 12 weeks
About the journal:
Trends in Hearing is an open access journal completely dedicated to publishing original research and reviews focusing on human hearing, hearing loss, hearing aids, auditory implants, and aural rehabilitation. Under its former name, Trends in Amplification, the journal established itself as a forum for concise explorations of all areas of translational hearing research by leaders in the field. Trends in Hearing has now expanded its focus to include original research articles, with the goal of becoming the premier venue for research related to human hearing and hearing loss.