Gauging the validity of machine learning-based temporal feature annotation to measure fluency in speech automatically

Research Methods in Applied Linguistics. Pub Date: 2024-12-31. DOI: 10.1016/j.rmal.2024.100177

Ryuki Matsuura, Shungo Suzuki, Kotaro Takizawa, Mao Saeki, Yoichi Matsuyama

Citations: 0

Abstract

Gauging the validity of machine learning-based temporal feature annotation to measure fluency in speech automatically

Machine learning (ML) techniques allow for automatically annotating various temporal speech features, particularly by the cascade connection of ML-based modules. Although such systems are expected to enhance scalability of second language (L2) speech research, their annotation accuracy is potentially moderated by speaking tasks and proficiency levels due to the mismatch between training and real-world data. Accordingly, we developed and validated an ML-based temporal feature annotation system on L2 English datasets split by speaking tasks (monologic vs. dialogic tasks) and proficiency levels, operationalized as overall fluency levels (low, mid vs. high). We compared the annotations by experts and the system in terms of the agreement between manual and automatic annotations, correlations between manual and automatic measures, and the predictive power for listener-based fluency judgments. Results showed a substantial degree of agreement in the annotations for monologic tasks and a general tendency of strong correlations between manual and automatic measures regardless of tasks and overall fluency levels. Furthermore, automatic measures yielded substantial predictive power of fluency scores in monologic tasks. These findings suggest the substantial applicability of ML-based annotation systems to monologic tasks possibly without biases by holistic levels of fluency.
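
The abstract compares manual and automatic annotations by agreement and by correlation, but it does not name the statistics used. As a rough illustration only, the sketch below assumes Cohen's kappa for agreement over aligned labels and Pearson's r for the correlation between manual and automatic measures. The function names, label set, and numbers are hypothetical and are not taken from the paper.

```python
from math import sqrt


def cohen_kappa(manual, auto):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(manual) != len(auto) or not manual:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(manual)
    # Proportion of positions where both annotators chose the same label.
    observed = sum(m == a for m, a in zip(manual, auto)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    labels = set(manual) | set(auto)
    expected = sum((manual.count(l) / n) * (auto.count(l) / n) for l in labels)
    if expected == 1.0:
        return 1.0  # both sequences use a single identical label
    return (observed - expected) / (1.0 - expected)


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical frame-level labels: one from an expert, one from a system.
    manual_labels = ["speech", "speech", "pause", "pause", "speech", "filler"]
    auto_labels = ["speech", "pause", "pause", "pause", "speech", "filler"]
    print("kappa:", round(cohen_kappa(manual_labels, auto_labels), 3))

    # Hypothetical per-speaker measure (e.g. pauses per minute) from each source.
    manual_measure = [12.0, 18.5, 9.0, 22.0]
    auto_measure = [11.0, 19.0, 10.5, 21.0]
    print("r:", round(pearson_r(manual_measure, auto_measure), 3))
```

In practice a study like this would likely use an established statistics package; the plain-Python version is only meant to make the two comparisons concrete.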

Source journal

Research Methods in Applied Linguistics

CiteScore

4.10

Self-citation rate

0.00%

Number of publications