Computational Linguistics: Latest Articles

Neural Data-to-Text Generation Based on Small Datasets: Comparing the Added Value of Two Semi-Supervised Learning Approaches on Top of a Large Language Model

Zone 2 (Computer Science)

Computational Linguistics, Pub Date: 2023-08-10, DOI: 10.1162/coli_a_00484

Chris van der Lee, Thiago Castro Ferreira, Chris Emmery, Travis J. Wiltshire, Emiel Krahmer

Abstract: This study discusses the effect of semi-supervised learning in combination with pretrained language models for data-to-text generation. It is not known whether semi-supervised learning is still helpful when a large-scale language model is also supplemented. This study aims to answer this question by comparing a data-to-text system only supplemented with a language model, to two data-to-text systems that are additionally enriched by a data augmentation or a pseudo-labeling semi-supervised learning approach. Results show that semi-supervised learning results in higher scores on diversity metrics. In terms of output quality, extending the training set of a data-to-text system with a language model using the pseudo-labeling approach did increase text quality scores, but the data augmentation approach yielded similar scores to the system without training set extension. These results indicate that semi-supervised learning approaches can bolster output quality and diversity, even when a language model is also present.

Citations: 0

Measuring Attribution in Natural Language Generation Models

IF 9.3, Zone 2 (Computer Science)

Computational Linguistics, Pub Date: 2023-07-07, DOI: 10.1162/coli_a_00490

Hannah Rashkin, Vitaly Nikolaev, Matthew Lamm, Lora Aroyo, Michael Collins, Dipanjan Das, Slav Petrov, Gaurav Singh Tomar, Iulia Turc, David Reitter

Citations: 0
