AACL Bioflux | Pub Date: 2022-11-22 | DOI: 10.48550/arXiv.2211.12118
Title: HaRiM+: Evaluating Summary Quality with Hallucination Risk
Authors: Seonil Son, Junsoo Park, J. Hwang, Junghwa Lee, Hyungjong Noh, Yeonsoo Lee
Pages: 895-924
Abstract: One of the challenges of developing a summarization model arises from the difficulty of measuring the factual inconsistency of the generated text. In this study, we reinterpret the decoder overconfidence-regularizing objective suggested by Miao et al. (2021) as a hallucination risk measurement to better estimate the quality of generated summaries. We propose a reference-free metric, HaRiM+, which only requires an off-the-shelf summarization model to compute the hallucination risk from token likelihoods. Deploying it requires no additional training of models or ad-hoc modules, which usually need alignment to human judgments. For summary-quality estimation, HaRiM+ records state-of-the-art correlation with human judgment on three summary-quality annotation sets: FRANK, QAGS, and SummEval. We hope that our work, which demonstrates the merit of reusing summarization models, facilitates progress in both automated evaluation and generation of summaries.
AACL Bioflux | Pub Date: 2022-11-22 | DOI: 10.48550/arXiv.2211.12157
Title: PESE: Event Structure Extraction using Pointer Network based Encoder-Decoder Architecture
Authors: Alapan Kuila, Sudeshna Sarkar
Pages: 1091-1100
Abstract: The task of event extraction (EE) aims to find the events and event-related argument information in text and represent them in a structured format. Most previous work tries to solve the problem by separately identifying multiple substructures and aggregating them to get the complete event structure. The problem with these methods is that they fail to identify all the interdependencies among the event participants (event triggers, arguments, and roles). In this paper, we represent each event record in a unique tuple format that contains the trigger phrase, trigger type, argument phrase, and corresponding role information. Our proposed pointer-network-based encoder-decoder model generates an event tuple at each time step by exploiting the interactions among event participants, presenting a truly end-to-end solution to the EE task. We evaluate our model on the ACE2005 dataset, and experimental results demonstrate its effectiveness through competitive performance compared to state-of-the-art methods.
{"title":"Bipartite-play Dialogue Collection for Practical Automatic Evaluation of Dialogue Systems","authors":"Shiki Sato, Yosuke Kishinami, Hiroaki Sugiyama, Reina Akama, Ryoko Tokuhisa, Jun Suzuki","doi":"10.48550/arXiv.2211.10596","DOIUrl":"https://doi.org/10.48550/arXiv.2211.10596","url":null,"abstract":"Automation of dialogue system evaluation is a driving force for the efficient development of dialogue systems. This paper introduces the bipartite-play method, a dialogue collection method for automating dialogue system evaluation. It addresses the limitations of existing dialogue collection methods: (i) inability to compare with systems that are not publicly available, and (ii) vulnerability to cheating by intentionally selecting systems to be compared. Experimental results show that the automatic evaluation using the bipartite-play method mitigates these two drawbacks and correlates as strongly with human subjectivity as existing methods.","PeriodicalId":39298,"journal":{"name":"AACL Bioflux","volume":"189 1","pages":"8-16"},"PeriodicalIF":0.0,"publicationDate":"2022-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76006181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
AACL Bioflux | Pub Date: 2022-11-09 | DOI: 10.48550/arXiv.2211.05025
Title: Local Structure Matters Most in Most Languages
Authors: Louis Clouâtre, Prasanna Parthasarathi, A. Zouaq, Sarath Chandar
Pages: 285-294
Abstract: Many recent perturbation studies have found unintuitive results on what does and does not matter when performing Natural Language Understanding (NLU) tasks in English. Coding properties, such as the order of words, can often be removed through shuffling without impacting downstream performance. Such insights may be used to direct future research into English NLP models. As many improvements in multilingual settings consist of wholesale adaptation of English approaches, it is important to verify whether those studies replicate in multilingual settings. In this work, we replicate a study on the importance of local structure, and the relative unimportance of global structure, in a multilingual setting. We find that the phenomenon observed in English broadly translates to over 120 languages, with a few caveats.
AACL Bioflux | Pub Date: 2022-11-08 | DOI: 10.48550/arXiv.2211.03988
Title: Unsupervised Domain Adaptation for Sparse Retrieval by Filling Vocabulary and Word Frequency Gaps
Authors: Hiroki Iida, Naoaki Okazaki
Pages: 752-765
Abstract: IR models using a pretrained language model significantly outperform lexical approaches like BM25. In particular, SPLADE, which encodes texts into sparse vectors, is an effective model for practical use because it shows robustness to out-of-domain datasets. However, SPLADE still struggles with exact matching of low-frequency words in training data. In addition, domain shifts in vocabulary and word frequencies deteriorate the IR performance of SPLADE. Because supervision data are scarce in the target domain, the domain shifts must be addressed without supervision. This paper proposes an unsupervised domain adaptation method that fills vocabulary and word-frequency gaps. First, we expand the vocabulary and run continual pretraining with a masked language model on a corpus from the target domain. Then, we multiply SPLADE-encoded sparse vectors by inverse document frequency weights to raise the importance of documents containing low-frequency words. We conducted experiments on datasets with a large vocabulary gap from the source domain and show that our method outperforms the previous state-of-the-art domain adaptation method. In addition, combined with BM25, our method achieves state-of-the-art results.
AACL Bioflux | Pub Date: 2022-10-27 | DOI: 10.48550/arXiv.2210.15219
Title: Parsing linearizations appreciate PoS tags - but some are fussy about errors
Authors: Alberto Muñoz-Ortiz, Mark Anderson, David Vilares, Carlos Gómez-Rodríguez
Pages: 117-127
Abstract: PoS tags, once taken for granted as a useful resource for syntactic parsing, have become more situational with the popularization of deep learning. Recent work on the impact of PoS tags on graph- and transition-based parsers suggests that they are only useful when tagging accuracy is prohibitively high, or in low-resource scenarios. However, such an analysis is lacking for the emerging sequence labeling parsing paradigm, where it is especially relevant as some models explicitly use PoS tags for encoding and decoding. We undertake a study and uncover some trends. Among them, PoS tags are generally more useful for sequence labeling parsers than for other paradigms, but the impact of their accuracy is highly encoding-dependent, with the PoS-based head-selection encoding being best only when both tagging accuracy and resource availability are high.
AACL Bioflux | Pub Date: 2022-10-27 | DOI: 10.48550/arXiv.2210.15183
Title: Outlier-Aware Training for Improving Group Accuracy Disparities
Authors: Li-Kuang Chen, Canasai Kruengkrai, J. Yamagishi
Pages: 54-60
Abstract: Methods addressing spurious correlations such as Just Train Twice (JTT; Liu et al., 2021) involve reweighting a subset of the training set to maximize the worst-group accuracy. However, the reweighted set of examples may potentially contain unlearnable examples that hamper the model's learning. We propose mitigating this by detecting outliers in the training set and removing them before reweighting. Our experiments show that our method achieves competitive or better accuracy compared with JTT and can detect and remove annotation errors in the subset being reweighted by JTT.
AACL Bioflux | Pub Date: 2022-10-21 | DOI: 10.48550/arXiv.2210.12022
Title: Performance-Efficiency Trade-Offs in Adapting Language Models to Text Classification Tasks
Authors: Laura Aina, Nikos Voskarides
Pages: 244-253
Abstract: Pre-trained language models (LMs) obtain state-of-the-art performance when adapted to text classification tasks. However, when using such models in real-world applications, efficiency considerations are paramount. In this paper, we study how different training procedures that adapt LMs to text classification perform as we vary model and training-set size. More specifically, we compare standard fine-tuning, prompting, and knowledge distillation (KD) where the teacher is trained with either fine-tuning or prompting. Our findings suggest that even though fine-tuning and prompting work well for training large LMs on large training sets, there are more efficient alternatives that can reduce compute or data cost. Interestingly, we find that prompting combined with KD can reduce both compute and data cost at the same time.
AACL Bioflux | Pub Date: 2022-10-21 | DOI: 10.48550/arXiv.2210.12223
Title: Low-Resource Multilingual and Zero-Shot Multispeaker TTS
Authors: Florian Lux, Julia Koch, Ngoc Thang Vu
Pages: 741-751
Abstract: While neural methods for text-to-speech (TTS) have shown great advances in modeling multiple speakers, even in zero-shot settings, the amount of data those approaches need is generally not feasible for the vast majority of the world's over 6,000 spoken languages. In this work, we bring together the tasks of zero-shot voice cloning and multilingual low-resource TTS. Using the language-agnostic meta learning (LAML) procedure and modifications to a TTS encoder, we show that a system can learn to speak a new language from just 5 minutes of training data while retaining the ability to infer the voice of even unseen speakers in the newly learned language. We show the success of our proposed approach in terms of intelligibility, naturalness, and similarity to the target speaker using objective metrics as well as human studies, and we release our code and trained models as open source.
AACL Bioflux | Pub Date: 2022-10-21 | DOI: 10.48550/arXiv.2210.11787
Title: Modeling Document-level Temporal Structures for Building Temporal Dependency Graphs
Authors: Prafulla Kumar Choubey, Ruihong Huang
Pages: 357-365
Abstract: We propose to leverage news discourse profiling to model document-level temporal structures for building temporal dependency graphs. Our key observation is that the functional roles of sentences used for profiling news discourse signify different time frames relevant to a news story and can, therefore, help to recover the global temporal structure of a document. Our analyses and experiments with the widely used knowledge distillation technique show that discourse profiling effectively identifies distant inter-sentence event and (or) time expression pairs that are temporally related and otherwise difficult to locate.