Artificial disfluency detection, uh no, disfluency generation for the masses

IF 3.4 · Region 3 (Computer Science) · JCR Q2: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Computer Speech and Language · Pub Date: 2024-08-17 · DOI: 10.1016/j.csl.2024.101711

Tatiana Passali, Thanassis Mavropoulos, Grigorios Tsoumakas, Georgios Meditskos, Stefanos Vrochidis

Volume 89, Article 101711 · Journal Article · Open access: no · Full text: https://www.sciencedirect.com/science/article/pii/S0885230824000949

Citations: 0

Abstract



Existing approaches for disfluency detection typically require the existence of large annotated datasets. However, current datasets for this task are limited, suffer from class imbalance, and lack some types of disfluencies that are encountered in real-world scenarios. At the same time, augmentation techniques for disfluency detection are not able to model complex types of disfluencies. This limits such approaches to only performing pre-training since the generated data are not indicative of disfluencies that occur in real scenarios and, as a result, cannot be directly used for training disfluency detection models, as we experimentally demonstrate. This imposes significant constraints on the usefulness of such approaches in practice since real disfluencies still have to be collected in order to train the models. In this work, we propose Large-scale ARtificial Disfluency Generation (LARD), a method for automatically generating artificial disfluencies, and more specifically repairs, from fluent text. Unlike existing augmentation techniques, LARD can simulate all the different and complex types of disfluencies. In addition, it incorporates contextual embeddings into the disfluency generation to produce realistic, context-aware artificial disfluencies. LARD can be used effectively for training disfluency detection models, bypassing the requirement of annotated disfluent data. Our empirical evaluation shows that LARD outperforms existing rule-based augmentation methods and increases the accuracy of existing disfluency detectors. In addition, experiments demonstrate that the proposed method can be effectively used in a low-resource setup.
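The idea of generating a repair from fluent text can be illustrated with a toy example. The sketch below is not LARD's implementation: it assumes a simple replacement-style repair in which a "wrong" phrase and an interregnum are inserted before a span of the fluent sentence, and the wrong phrase is supplied by hand rather than chosen with contextual embeddings as the abstract describes. The names `DisfluentSample`, `make_replacement_repair`, and `recover_fluent`, and the 0/1 token labelling, are hypothetical choices made for illustration. The example reproduces the disfluency in the paper's own title.

```python
from dataclasses import dataclass


@dataclass
class DisfluentSample:
    tokens: list[str]
    labels: list[int]  # assumption: 1 = disfluent token to remove, 0 = fluent token


def make_replacement_repair(fluent_tokens, start, end, wrong_tokens, interregnum):
    """Build a disfluent sentence from a fluent one (hypothetical helper).

    The span fluent_tokens[start:end] plays the role of the repair. The
    hand-picked wrong_tokens (the part the speaker takes back) and the
    interregnum (e.g. "uh no") are inserted immediately before it.
    """
    if not 0 <= start < end <= len(fluent_tokens):
        raise ValueError("repair span is out of range")

    prefix = list(fluent_tokens[:start])
    repair = list(fluent_tokens[start:end])
    suffix = list(fluent_tokens[end:])
    disfluent_part = list(wrong_tokens) + list(interregnum)

    tokens = prefix + disfluent_part + repair + suffix
    labels = (
        [0] * len(prefix)
        + [1] * len(disfluent_part)
        + [0] * (len(repair) + len(suffix))
    )
    return DisfluentSample(tokens, labels)


def recover_fluent(sample):
    """Drop every token labelled as disfluent, returning the fluent sentence."""
    return [tok for tok, lab in zip(sample.tokens, sample.labels) if lab == 0]


fluent = "artificial disfluency generation for the masses".split()
sample = make_replacement_repair(
    fluent,
    start=1,
    end=3,
    wrong_tokens=["disfluency", "detection"],
    interregnum=["uh", "no"],
)
print(" ".join(sample.tokens))
print(sample.labels)
```

Because the labels are produced alongside the tokens, each generated sentence comes with its own annotation, which is what lets synthetic data of this kind stand in for manually annotated disfluent data when training a token-level detector.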


Source journal

Computer Speech and Language (Engineering and Technology: Computer Science, Artificial Intelligence)

CiteScore

11.30

Self-citation rate

4.70%

Publication count

Review time

22.9 weeks

About the journal: Computer Speech & Language publishes reports of original research related to the recognition, understanding, production, coding and mining of speech and language. The speech and language sciences have a long history, but it is only relatively recently that large-scale implementation of and experimentation with complex models of speech and language processing has become feasible. Such research is often carried out somewhat separately by practitioners of artificial intelligence, computer science, electronic engineering, information retrieval, linguistics, phonetics, or psychology.