Kaila L Stipancic, Tyson S Barrett, Kris Tjaden, Stephanie A Borrie
{"title":"使用 Autoscore 对语音可懂度测试进行自动评分。","authors":"Kaila L Stipancic, Tyson S Barrett, Kris Tjaden, Stephanie A Borrie","doi":"10.1044/2024_AJSLP-24-00276","DOIUrl":null,"url":null,"abstract":"<p><strong>Purpose: </strong>The purpose of the current study was to develop and test extensions to Autoscore, an automated approach for scoring listener transcriptions against target stimuli, for scoring the Speech Intelligibility Test (SIT), a widely used test for quantifying intelligibility in individuals with dysarthria.</p><p><strong>Method: </strong>Three main extensions to Autoscore were created including a compound rule, a contractions rule, and a numbers rule. We used two sets of previously collected listener SIT transcripts (<i>N</i> = 4,642) from databases of dysarthric speakers to evaluate the accuracy of the Autoscore SIT extensions. A human scorer and SIT-extended Autoscore were used to score sentence transcripts in both data sets. Scoring performance was determined by (a) comparing Autoscore and human scores using intraclass correlations (ICCs) at individual sentence and speaker levels and (b) comparing SIT-extended Autoscore performance to the original Autoscore with ICCs.</p><p><strong>Results: </strong>At both the individual sentence and speaker levels, Autoscore and the human scorer were nearly identical for both Data Set 1 (ICC = .9922 and ICC = .9767, respectively) and Data Set 2 (ICC = .9934 and ICC = .9946, respectively). Where disagreements between Autoscore and a human scorer occurred, the differences were often small (i.e., within 1 or 2 points). Across the two data sets (<i>N</i> = 4,642 sentences), SIT-extended Autoscore rendered 510 disagreements with the human scorer (vs. 571 disagreements for the original Autoscore).</p><p><strong>Discussion: </strong>Overall, SIT-extended Autoscore performed as well as human scorers and substantially improved scoring accuracy relative to the original version of Autoscore. 
Coupled with the substantial time and effort saving provided by Autoscore, its utility has been strengthened by the extensions developed and tested here.</p>","PeriodicalId":49240,"journal":{"name":"American Journal of Speech-Language Pathology","volume":" ","pages":"1-12"},"PeriodicalIF":2.3000,"publicationDate":"2024-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Automated Scoring of the Speech Intelligibility Test Using Autoscore.\",\"authors\":\"Kaila L Stipancic, Tyson S Barrett, Kris Tjaden, Stephanie A Borrie\",\"doi\":\"10.1044/2024_AJSLP-24-00276\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Purpose: </strong>The purpose of the current study was to develop and test extensions to Autoscore, an automated approach for scoring listener transcriptions against target stimuli, for scoring the Speech Intelligibility Test (SIT), a widely used test for quantifying intelligibility in individuals with dysarthria.</p><p><strong>Method: </strong>Three main extensions to Autoscore were created including a compound rule, a contractions rule, and a numbers rule. We used two sets of previously collected listener SIT transcripts (<i>N</i> = 4,642) from databases of dysarthric speakers to evaluate the accuracy of the Autoscore SIT extensions. A human scorer and SIT-extended Autoscore were used to score sentence transcripts in both data sets. Scoring performance was determined by (a) comparing Autoscore and human scores using intraclass correlations (ICCs) at individual sentence and speaker levels and (b) comparing SIT-extended Autoscore performance to the original Autoscore with ICCs.</p><p><strong>Results: </strong>At both the individual sentence and speaker levels, Autoscore and the human scorer were nearly identical for both Data Set 1 (ICC = .9922 and ICC = .9767, respectively) and Data Set 2 (ICC = .9934 and ICC = .9946, respectively). 
Where disagreements between Autoscore and a human scorer occurred, the differences were often small (i.e., within 1 or 2 points). Across the two data sets (<i>N</i> = 4,642 sentences), SIT-extended Autoscore rendered 510 disagreements with the human scorer (vs. 571 disagreements for the original Autoscore).</p><p><strong>Discussion: </strong>Overall, SIT-extended Autoscore performed as well as human scorers and substantially improved scoring accuracy relative to the original version of Autoscore. Coupled with the substantial time and effort saving provided by Autoscore, its utility has been strengthened by the extensions developed and tested here.</p>\",\"PeriodicalId\":49240,\"journal\":{\"name\":\"American Journal of Speech-Language Pathology\",\"volume\":\" \",\"pages\":\"1-12\"},\"PeriodicalIF\":2.3000,\"publicationDate\":\"2024-12-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"American Journal of Speech-Language Pathology\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://doi.org/10.1044/2024_AJSLP-24-00276\",\"RegionNum\":3,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"AUDIOLOGY & SPEECH-LANGUAGE PATHOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"American Journal of Speech-Language Pathology","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1044/2024_AJSLP-24-00276","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"AUDIOLOGY & SPEECH-LANGUAGE PATHOLOGY","Score":null,"Total":0}
Automated Scoring of the Speech Intelligibility Test Using Autoscore.
Purpose: The purpose of the current study was to develop and test extensions to Autoscore, an automated approach for scoring listener transcriptions against target stimuli, for scoring the Speech Intelligibility Test (SIT), a widely used test for quantifying intelligibility in individuals with dysarthria.
Method: Three main extensions to Autoscore were created including a compound rule, a contractions rule, and a numbers rule. We used two sets of previously collected listener SIT transcripts (N = 4,642) from databases of dysarthric speakers to evaluate the accuracy of the Autoscore SIT extensions. A human scorer and SIT-extended Autoscore were used to score sentence transcripts in both data sets. Scoring performance was determined by (a) comparing Autoscore and human scores using intraclass correlations (ICCs) at individual sentence and speaker levels and (b) comparing SIT-extended Autoscore performance to the original Autoscore with ICCs.
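The rules described above can be illustrated with a minimal word-matching sketch. This is a hypothetical Python illustration of the contractions and numbers rules (the compound rule, which credits split forms such as "base ball" for "baseball," is omitted for brevity); Autoscore itself is a separate tool, and the dictionaries and function names here are assumptions for demonstration only.

```python
import re
from collections import Counter

# Hypothetical lookup tables for illustration; the real tool's rule sets
# are more extensive.
CONTRACTIONS = {"don't": "do not", "can't": "can not", "it's": "it is"}
NUMBER_WORDS = {"2": "two", "10": "ten"}

def normalize(sentence):
    """Tokenize and apply contraction/number expansions to a sentence."""
    words = re.findall(r"[a-z']+|\d+", sentence.lower())
    out = []
    for w in words:
        if w in CONTRACTIONS:      # contractions rule: expand to full form
            out.extend(CONTRACTIONS[w].split())
        elif w in NUMBER_WORDS:    # numbers rule: digits match number words
            out.append(NUMBER_WORDS[w])
        else:
            out.append(w)
    return out

def score(target, transcript):
    """Count target words credited in the listener transcript (order-free)."""
    t, h = Counter(normalize(target)), Counter(normalize(transcript))
    return sum(min(n, h[w]) for w, n in t.items())
```

For example, a listener who transcribes "2" as "two," or "don't" as "do not," receives full credit under this scheme rather than being penalized for an orthographic difference.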
Results: At both the individual sentence and speaker levels, Autoscore and the human scorer were nearly identical for both Data Set 1 (ICC = .9922 and ICC = .9767, respectively) and Data Set 2 (ICC = .9934 and ICC = .9946, respectively). Where disagreements between Autoscore and a human scorer occurred, the differences were often small (i.e., within 1 or 2 points). Across the two data sets (N = 4,642 sentences), SIT-extended Autoscore rendered 510 disagreements with the human scorer (vs. 571 disagreements for the original Autoscore).
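Agreement between two scorers can be quantified with an intraclass correlation as described above. The sketch below computes ICC(2,1) (two-way random effects, absolute agreement, single rater) in plain Python; the abstract does not specify which ICC model the authors used, so the model choice and the example data are assumptions for illustration.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of n subjects, each a list of k raters' scores
    (e.g., [human_score, autoscore_score] per sentence).
    """
    n, k = len(ratings), len(ratings[0])
    flat = [x for row in ratings for x in row]
    grand = sum(flat) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(ratings[i][j] for i in range(n)) / n for j in range(k)]

    # Partition the total sum of squares into subject, rater, and error terms.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for x in flat)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )
```

Perfect scorer agreement yields an ICC of 1.0; values such as the .99 figures reported above indicate near-identical scoring across sentences.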
Discussion: Overall, SIT-extended Autoscore performed as well as human scorers and substantially improved scoring accuracy relative to the original version of Autoscore. Coupled with the substantial time and effort savings provided by Autoscore, its utility has been strengthened by the extensions developed and tested here.
Journal Description:
Mission: AJSLP publishes peer-reviewed research and other scholarly articles on all aspects of clinical practice in speech-language pathology. The journal is an international outlet for clinical research pertaining to screening, detection, diagnosis, management, and outcomes of communication and swallowing disorders across the lifespan as well as the etiologies and characteristics of these disorders. Because of its clinical orientation, the journal disseminates research findings applicable to diverse aspects of clinical practice in speech-language pathology. AJSLP seeks to advance evidence-based practice by disseminating the results of new studies as well as providing a forum for critical reviews and meta-analyses of previously published work.
Scope: The broad field of speech-language pathology, including aphasia; apraxia of speech and childhood apraxia of speech; aural rehabilitation; augmentative and alternative communication; cognitive impairment; craniofacial disorders; dysarthria; fluency disorders; language disorders in children; speech sound disorders; swallowing, dysphagia, and feeding disorders; and voice disorders.