Kaila L Stipancic, Tyson S Barrett, Kris Tjaden, Stephanie A Borrie
{"title":"使用 Autoscore 对语音可懂度测试进行自动评分。","authors":"Kaila L Stipancic, Tyson S Barrett, Kris Tjaden, Stephanie A Borrie","doi":"10.1044/2024_AJSLP-24-00276","DOIUrl":null,"url":null,"abstract":"<p><strong>Purpose: </strong>The purpose of the current study was to develop and test extensions to Autoscore, an automated approach for scoring listener transcriptions against target stimuli, for scoring the Speech Intelligibility Test (SIT), a widely used test for quantifying intelligibility in individuals with dysarthria.</p><p><strong>Method: </strong>Three main extensions to Autoscore were created including a compound rule, a contractions rule, and a numbers rule. We used two sets of previously collected listener SIT transcripts (<i>N</i> = 4,642) from databases of dysarthric speakers to evaluate the accuracy of the Autoscore SIT extensions. A human scorer and SIT-extended Autoscore were used to score sentence transcripts in both data sets. Scoring performance was determined by (a) comparing Autoscore and human scores using intraclass correlations (ICCs) at individual sentence and speaker levels and (b) comparing SIT-extended Autoscore performance to the original Autoscore with ICCs.</p><p><strong>Results: </strong>At both the individual sentence and speaker levels, Autoscore and the human scorer were nearly identical for both Data Set 1 (ICC = .9922 and ICC = .9767, respectively) and Data Set 2 (ICC = .9934 and ICC = .9946, respectively). Where disagreements between Autoscore and a human scorer occurred, the differences were often small (i.e., within 1 or 2 points). Across the two data sets (<i>N</i> = 4,642 sentences), SIT-extended Autoscore rendered 510 disagreements with the human scorer (vs. 571 disagreements for the original Autoscore).</p><p><strong>Discussion: </strong>Overall, SIT-extended Autoscore performed as well as human scorers and substantially improved scoring accuracy relative to the original version of Autoscore. 
Coupled with the substantial time and effort saving provided by Autoscore, its utility has been strengthened by the extensions developed and tested here.</p>","PeriodicalId":49240,"journal":{"name":"American Journal of Speech-Language Pathology","volume":" ","pages":"1-12"},"PeriodicalIF":2.3000,"publicationDate":"2024-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Automated Scoring of the Speech Intelligibility Test Using Autoscore.\",\"authors\":\"Kaila L Stipancic, Tyson S Barrett, Kris Tjaden, Stephanie A Borrie\",\"doi\":\"10.1044/2024_AJSLP-24-00276\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Purpose: </strong>The purpose of the current study was to develop and test extensions to Autoscore, an automated approach for scoring listener transcriptions against target stimuli, for scoring the Speech Intelligibility Test (SIT), a widely used test for quantifying intelligibility in individuals with dysarthria.</p><p><strong>Method: </strong>Three main extensions to Autoscore were created including a compound rule, a contractions rule, and a numbers rule. We used two sets of previously collected listener SIT transcripts (<i>N</i> = 4,642) from databases of dysarthric speakers to evaluate the accuracy of the Autoscore SIT extensions. A human scorer and SIT-extended Autoscore were used to score sentence transcripts in both data sets. Scoring performance was determined by (a) comparing Autoscore and human scores using intraclass correlations (ICCs) at individual sentence and speaker levels and (b) comparing SIT-extended Autoscore performance to the original Autoscore with ICCs.</p><p><strong>Results: </strong>At both the individual sentence and speaker levels, Autoscore and the human scorer were nearly identical for both Data Set 1 (ICC = .9922 and ICC = .9767, respectively) and Data Set 2 (ICC = .9934 and ICC = .9946, respectively). 
Where disagreements between Autoscore and a human scorer occurred, the differences were often small (i.e., within 1 or 2 points). Across the two data sets (<i>N</i> = 4,642 sentences), SIT-extended Autoscore rendered 510 disagreements with the human scorer (vs. 571 disagreements for the original Autoscore).</p><p><strong>Discussion: </strong>Overall, SIT-extended Autoscore performed as well as human scorers and substantially improved scoring accuracy relative to the original version of Autoscore. Coupled with the substantial time and effort saving provided by Autoscore, its utility has been strengthened by the extensions developed and tested here.</p>\",\"PeriodicalId\":49240,\"journal\":{\"name\":\"American Journal of Speech-Language Pathology\",\"volume\":\" \",\"pages\":\"1-12\"},\"PeriodicalIF\":2.3000,\"publicationDate\":\"2024-12-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"American Journal of Speech-Language Pathology\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://doi.org/10.1044/2024_AJSLP-24-00276\",\"RegionNum\":3,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"AUDIOLOGY & SPEECH-LANGUAGE PATHOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"American Journal of Speech-Language Pathology","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1044/2024_AJSLP-24-00276","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"AUDIOLOGY & SPEECH-LANGUAGE PATHOLOGY","Score":null,"Total":0}
Automated Scoring of the Speech Intelligibility Test Using Autoscore.
Purpose: The purpose of the current study was to develop and test extensions to Autoscore, an automated approach for scoring listener transcriptions against target stimuli, for scoring the Speech Intelligibility Test (SIT), a widely used test for quantifying intelligibility in individuals with dysarthria.
Method: Three main extensions to Autoscore were created including a compound rule, a contractions rule, and a numbers rule. We used two sets of previously collected listener SIT transcripts (N = 4,642) from databases of dysarthric speakers to evaluate the accuracy of the Autoscore SIT extensions. A human scorer and SIT-extended Autoscore were used to score sentence transcripts in both data sets. Scoring performance was determined by (a) comparing Autoscore and human scores using intraclass correlations (ICCs) at individual sentence and speaker levels and (b) comparing SIT-extended Autoscore performance to the original Autoscore with ICCs.
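The rules described above can be illustrated with a minimal word-matching sketch. This is a hypothetical Python illustration of the contractions and numbers rules (the compound rule, which credits split forms such as "base ball" for "baseball," is omitted for brevity); Autoscore itself is a separate tool, and the dictionaries and function names here are assumptions for demonstration only.

```python
import re
from collections import Counter

# Hypothetical lookup tables for illustration; the real tool's rule sets
# are more extensive.
CONTRACTIONS = {"don't": "do not", "can't": "can not", "it's": "it is"}
NUMBER_WORDS = {"2": "two", "10": "ten"}

def normalize(sentence):
    """Tokenize and apply contraction/number expansions to a sentence."""
    words = re.findall(r"[a-z']+|\d+", sentence.lower())
    out = []
    for w in words:
        if w in CONTRACTIONS:      # contractions rule: expand to full form
            out.extend(CONTRACTIONS[w].split())
        elif w in NUMBER_WORDS:    # numbers rule: digits match number words
            out.append(NUMBER_WORDS[w])
        else:
            out.append(w)
    return out

def score(target, transcript):
    """Count target words credited in the listener transcript (order-free)."""
    t, h = Counter(normalize(target)), Counter(normalize(transcript))
    return sum(min(n, h[w]) for w, n in t.items())
```

For example, a listener who transcribes "2" as "two," or "don't" as "do not," receives full credit under this scheme rather than being penalized for an orthographic difference.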
Results: At both the individual sentence and speaker levels, Autoscore and the human scorer were nearly identical for both Data Set 1 (ICC = .9922 and ICC = .9767, respectively) and Data Set 2 (ICC = .9934 and ICC = .9946, respectively). Where disagreements between Autoscore and a human scorer occurred, the differences were often small (i.e., within 1 or 2 points). Across the two data sets (N = 4,642 sentences), SIT-extended Autoscore rendered 510 disagreements with the human scorer (vs. 571 disagreements for the original Autoscore).
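Agreement between two scorers can be quantified with an intraclass correlation as described above. The sketch below computes ICC(2,1) (two-way random effects, absolute agreement, single rater) in plain Python; the abstract does not specify which ICC model the authors used, so the model choice and the example data are assumptions for illustration.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of n subjects, each a list of k raters' scores
    (e.g., [human_score, autoscore_score] per sentence).
    """
    n, k = len(ratings), len(ratings[0])
    flat = [x for row in ratings for x in row]
    grand = sum(flat) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(ratings[i][j] for i in range(n)) / n for j in range(k)]

    # Partition the total sum of squares into subject, rater, and error terms.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for x in flat)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )
```

Perfect scorer agreement yields an ICC of 1.0; values such as the .99 figures reported above indicate near-identical scoring across sentences.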
Discussion: Overall, SIT-extended Autoscore performed as well as human scorers and substantially improved scoring accuracy relative to the original version of Autoscore. Coupled with the substantial time and effort savings provided by Autoscore, its utility has been strengthened by the extensions developed and tested here.
Journal Description:
Mission: AJSLP publishes peer-reviewed research and other scholarly articles on all aspects of clinical practice in speech-language pathology. The journal is an international outlet for clinical research pertaining to screening, detection, diagnosis, management, and outcomes of communication and swallowing disorders across the lifespan as well as the etiologies and characteristics of these disorders. Because of its clinical orientation, the journal disseminates research findings applicable to diverse aspects of clinical practice in speech-language pathology. AJSLP seeks to advance evidence-based practice by disseminating the results of new studies as well as providing a forum for critical reviews and meta-analyses of previously published work.
Scope: The broad field of speech-language pathology, including aphasia; apraxia of speech and childhood apraxia of speech; aural rehabilitation; augmentative and alternative communication; cognitive impairment; craniofacial disorders; dysarthria; fluency disorders; language disorders in children; speech sound disorders; swallowing, dysphagia, and feeding disorders; and voice disorders.