John-Paul Hosom, Lawrence Shriberg, Jordan R Green
Journal of Medical Speech-Language Pathology, 12(4), 167-171. Published 2004-12-01. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1622919/pdf/nihms2560.pdf
Diagnostic Assessment of Childhood Apraxia of Speech Using Automatic Speech Recognition (ASR) Methods.
We report findings from two feasibility studies using automatic speech recognition (ASR) methods in childhood speech sound disorders. The studies implemented and evaluated the automation of two recently proposed diagnostic markers for suspected Apraxia of Speech (AOS): the Lexical Stress Ratio (LSR) and the Coefficient of Variation Ratio (CVR). The LSR is a weighted composite of amplitude area, frequency area, and duration in the stressed vowel relative to the unstressed vowel, obtained from a speaker's productions of eight trochaic word forms. Composite weightings for the three stress parameters were determined from a principal components analysis. The CVR expresses the average normalized variability of the durations of pause and speech events obtained from a conversational speech sample. We describe the automation procedures used to obtain LSR and CVR scores for four children with suspected AOS and report comparative findings. The LSR values obtained with ASR were within 1.2% to 6.7% of the LSR values obtained manually using Computerized Speech Lab (CSL). The CVR values obtained with ASR were within 0.7% to 2.7% of the CVR values obtained manually using Matlab. These results indicate the potential of ASR-based techniques to process these and other diagnostic markers of childhood speech sound disorders.
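The arithmetic behind the two markers and the reported agreement figures can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the LSR is taken as a weighted sum of stressed-to-unstressed ratios, the CVR as the coefficient of variation of pause durations divided by that of speech durations, the population standard deviation is used, and agreement is measured as a percentage of the manual value. The function names, dictionary keys, and weights are all hypothetical; the abstract does not give the principal components weightings.

```python
from statistics import mean, pstdev

def coefficient_of_variation(durations):
    """Standard deviation divided by the mean (a unitless measure of spread)."""
    return pstdev(durations) / mean(durations)

def cvr(pause_durations, speech_durations):
    """Assumed form: CV of pause durations over CV of speech durations."""
    return (coefficient_of_variation(pause_durations)
            / coefficient_of_variation(speech_durations))

def lsr(stressed, unstressed, weights):
    """Assumed form: weighted sum of stressed/unstressed ratios for one word.

    `stressed` and `unstressed` map each stress parameter to its measured
    value; `weights` maps the same keys to placeholder weightings.
    """
    return sum(w * stressed[key] / unstressed[key] for key, w in weights.items())

def percent_difference(automatic, manual):
    """Absolute difference of the automatic value, as a percent of the manual value."""
    return abs(automatic - manual) / abs(manual) * 100.0

# Hypothetical placeholder weights, NOT the values from the study.
example_weights = {"amplitude_area": 0.5, "frequency_area": 0.3, "duration": 0.2}
```

In this sketch a per-speaker LSR would be obtained by averaging `lsr(...)` over the eight trochaic words, and `percent_difference` would be applied to each child's automatic and manual scores.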