Authors: Tatiana Passali, Thanassis Mavropoulos, Grigorios Tsoumakas, Georgios Meditskos, Stefanos Vrochidis
Journal: Computer Speech and Language
Published: 2024-08-17
DOI: 10.1016/j.csl.2024.101711
URL: https://www.sciencedirect.com/science/article/pii/S0885230824000949
Artificial disfluency detection, uh no, disfluency generation for the masses
Existing approaches for disfluency detection typically require the existence of large annotated datasets. However, current datasets for this task are limited, suffer from class imbalance, and lack some types of disfluencies that are encountered in real-world scenarios. At the same time, augmentation techniques for disfluency detection are not able to model complex types of disfluencies. This limits such approaches to only performing pre-training since the generated data are not indicative of disfluencies that occur in real scenarios and, as a result, cannot be directly used for training disfluency detection models, as we experimentally demonstrate. This imposes significant constraints on the usefulness of such approaches in practice since real disfluencies still have to be collected in order to train the models. In this work, we propose Large-scale ARtificial Disfluency Generation (LARD), a method for automatically generating artificial disfluencies, and more specifically repairs, from fluent text. Unlike existing augmentation techniques, LARD can simulate all the different and complex types of disfluencies. In addition, it incorporates contextual embeddings into the disfluency generation to produce realistic, context-aware artificial disfluencies. LARD can be used effectively for training disfluency detection models, bypassing the requirement of annotated disfluent data. Our empirical evaluation shows that LARD outperforms existing rule-based augmentation methods and increases the accuracy of existing disfluency detectors. In addition, experiments demonstrate that the proposed method can be effectively used in a low-resource setup.
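To make the idea of generating repairs from fluent text concrete, here is a minimal Python sketch of two rule-based disfluency types, a repetition and a replacement, with token-level labels. It is an illustration only and not the LARD implementation: the function names, the `F`/`D` labeling scheme, and the default `uh no` cue are assumptions, and the wrong word is supplied by hand rather than selected with contextual embeddings as the abstract describes.

```python
from typing import List, Sequence, Tuple

# Hypothetical token labels: "F" = fluent, "D" = disfluent (to be removed).
FLUENT, DISFLUENT = "F", "D"


def add_repetition(
    tokens: List[str], start: int, length: int = 1
) -> Tuple[List[str], List[str]]:
    """Repeat tokens[start:start+length] once; the first copy is labeled disfluent."""
    if start < 0 or length < 1 or start + length > len(tokens):
        raise ValueError("span is outside the token list")
    span = tokens[start:start + length]
    out = tokens[:start] + span + tokens[start:]
    labels = (
        [FLUENT] * start
        + [DISFLUENT] * length
        + [FLUENT] * (len(tokens) - start)
    )
    return out, labels


def add_replacement(
    tokens: List[str],
    index: int,
    wrong_word: str,
    cue: Sequence[str] = ("uh", "no"),
) -> Tuple[List[str], List[str]]:
    """Insert a wrong word plus a repair cue before tokens[index].

    The wrong word and the cue are labeled disfluent; the original
    token at `index` then acts as the repair and stays fluent.
    """
    if not 0 <= index < len(tokens):
        raise ValueError("index is outside the token list")
    out = tokens[:index] + [wrong_word] + list(cue) + tokens[index:]
    labels = (
        [FLUENT] * index
        + [DISFLUENT] * (1 + len(cue))
        + [FLUENT] * (len(tokens) - index)
    )
    return out, labels


if __name__ == "__main__":
    fluent = "artificial disfluency generation for the masses".split()
    noisy, tags = add_replacement(fluent, 2, "detection")
    print(" ".join(noisy))
    print(" ".join(tags))
```

Because the labels come for free with each generated sentence, pairs like `(noisy, tags)` could in principle serve as training data for a token-level disfluency detector, which is the use case the abstract targets.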
About the journal:
Computer Speech & Language publishes reports of original research related to the recognition, understanding, production, coding and mining of speech and language.
The speech and language sciences have a long history, but it is only relatively recently that large-scale implementation of and experimentation with complex models of speech and language processing has become feasible. Such research is often carried out somewhat separately by practitioners of artificial intelligence, computer science, electronic engineering, information retrieval, linguistics, phonetics, or psychology.