AACL Bioflux: Latest Articles

HaRiM^+: Evaluating Summary Quality with Hallucination Risk
AACL Bioflux Pub Date : 2022-11-22 DOI: 10.48550/arXiv.2211.12118
Seonil Son, Junsoo Park, J. Hwang, Junghwa Lee, Hyungjong Noh, Yeonsoo Lee
Abstract: One of the challenges of developing a summarization model arises from the difficulty of measuring the factual inconsistency of the generated text. In this study, we reinterpret the decoder overconfidence-regularizing objective suggested by Miao et al. (2021) as a hallucination risk measurement to better estimate the quality of generated summaries. We propose a reference-free metric, HaRiM+, which only requires an off-the-shelf summarization model to compute the hallucination risk based on token likelihoods. Deploying it requires no additional training of models or ad-hoc modules, which usually need alignment to human judgments. For summary-quality estimation, HaRiM+ records state-of-the-art correlation to human judgment on three summary-quality annotation sets: FRANK, QAGS, and SummEval. We hope that our work, which merits the use of summarization models, facilitates the progress of both automated evaluation and generation of summaries.
Pages: 895-924 · Citations: 1
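The token-likelihood idea behind the metric can be loosely sketched as follows. This is an illustrative simplification of a likelihood-based risk score, not the HaRiM+ formula from the paper, and the probability values are hypothetical stand-ins for a summarization model's decoder outputs:

```python
import math

def hallucination_risk(token_probs):
    """Toy risk score: tokens the summarization model found unlikely
    contribute more risk. An illustrative simplification, not the
    HaRiM+ formulation from the paper."""
    # Negative log-likelihood per token, scaled by (1 - p) so that
    # confidently generated tokens contribute almost no risk.
    risks = [(1.0 - p) * -math.log(p) for p in token_probs]
    return sum(risks) / len(risks)

# A summary whose tokens the model found likely scores lower risk than
# one containing low-likelihood (potentially hallucinated) tokens.
faithful = hallucination_risk([0.9, 0.8, 0.95, 0.85])
hallucinated = hallucination_risk([0.9, 0.1, 0.95, 0.2])
assert faithful < hallucinated
```

Because the score is computed from token probabilities an off-the-shelf model already produces, no extra training is needed, which is the property the abstract emphasizes.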
PESE: Event Structure Extraction using Pointer Network based Encoder-Decoder Architecture
AACL Bioflux Pub Date : 2022-11-22 DOI: 10.48550/arXiv.2211.12157
Alapan Kuila, Sudeshan Sarkar
Abstract: The task of event extraction (EE) aims to find the events and event-related argument information in text and represent them in a structured format. Most previous works try to solve the problem by separately identifying multiple substructures and aggregating them to get the complete event structure. The problem with these methods is that they fail to identify all the interdependencies among the event participants (event triggers, arguments, and roles). In this paper, we represent each event record in a unique tuple format that contains the trigger phrase, trigger type, argument phrase, and corresponding role information. Our proposed pointer network-based encoder-decoder model generates an event tuple in each time step by exploiting the interactions among event participants, presenting a truly end-to-end solution to the EE task. We evaluate our model on the ACE2005 dataset, and experimental results demonstrate its effectiveness, achieving competitive performance compared to state-of-the-art methods.
Pages: 1091-1100 · Citations: 0
Bipartite-play Dialogue Collection for Practical Automatic Evaluation of Dialogue Systems
AACL Bioflux Pub Date : 2022-11-19 DOI: 10.48550/arXiv.2211.10596
Shiki Sato, Yosuke Kishinami, Hiroaki Sugiyama, Reina Akama, Ryoko Tokuhisa, Jun Suzuki
Abstract: Automation of dialogue system evaluation is a driving force for the efficient development of dialogue systems. This paper introduces the bipartite-play method, a dialogue collection method for automating dialogue system evaluation. It addresses the limitations of existing dialogue collection methods: (i) inability to compare with systems that are not publicly available, and (ii) vulnerability to cheating by intentionally selecting the systems to be compared. Experimental results show that automatic evaluation using the bipartite-play method mitigates these two drawbacks and correlates as strongly with human subjectivity as existing methods.
Pages: 8-16 · Citations: 2
Local Structure Matters Most in Most Languages
AACL Bioflux Pub Date : 2022-11-09 DOI: 10.48550/arXiv.2211.05025
Louis Clouâtre, Prasanna Parthasarathi, A. Zouaq, Sarath Chandar
Abstract: Many recent perturbation studies have found unintuitive results on what does and does not matter when performing Natural Language Understanding (NLU) tasks in English. Coding properties, such as the order of words, can often be removed through shuffling without impacting downstream performance. Such insight may be used to direct future research into English NLP models. As many improvements in multilingual settings consist of wholesale adaptation of English approaches, it is important to verify whether those studies replicate in multilingual settings. In this work, we replicate a study on the importance of local structure, and the relative unimportance of global structure, in a multilingual setting. We find that the phenomenon observed in English broadly translates to over 120 languages, with a few caveats.
Pages: 285-294 · Citations: 1
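The local-versus-global distinction that such perturbation studies probe can be sketched with two toy shuffles: one destroys local word order while keeping the global arrangement of windows, the other destroys global order while keeping each window intact. The window size and example sentence are arbitrary, and this is only an illustration of the kind of perturbation involved, not the study's exact procedure:

```python
import random

def shuffle_local(tokens, window=3, seed=0):
    """Shuffle tokens inside each fixed-size window: local order is
    destroyed, the global arrangement of windows is kept."""
    rng = random.Random(seed)
    out = []
    for i in range(0, len(tokens), window):
        chunk = tokens[i:i + window]
        rng.shuffle(chunk)
        out += chunk
    return out

def shuffle_global(tokens, window=3, seed=0):
    """Shuffle whole windows: global order is destroyed, each window's
    internal (local) order is kept."""
    rng = random.Random(seed)
    windows = [tokens[i:i + window] for i in range(0, len(tokens), window)]
    rng.shuffle(windows)
    return [t for w in windows for t in w]

sent = "the quick brown fox jumps over the lazy dog".split()
# Both perturbations preserve the bag of words; they differ only in
# which level of structure (local vs. global) they leave intact.
assert sorted(shuffle_local(sent)) == sorted(sent)
assert sorted(shuffle_global(sent)) == sorted(sent)
```

Comparing model performance on inputs perturbed at each level is what lets such studies separate the contribution of local from global structure.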
Unsupervised Domain Adaptation for Sparse Retrieval by Filling Vocabulary and Word Frequency Gaps
AACL Bioflux Pub Date : 2022-11-08 DOI: 10.48550/arXiv.2211.03988
Hiroki Iida, Naoaki Okazaki
Abstract: IR models using a pretrained language model significantly outperform lexical approaches like BM25. In particular, SPLADE, which encodes texts to sparse vectors, is an effective model for practical use because it shows robustness to out-of-domain datasets. However, SPLADE still struggles with exact matching of low-frequency words in training data. In addition, domain shifts in vocabulary and word frequencies deteriorate the IR performance of SPLADE. Because supervision data are scarce in the target domain, addressing the domain shifts without supervision data is necessary. This paper proposes an unsupervised domain adaptation method by filling vocabulary and word-frequency gaps. First, we expand the vocabulary and execute continual pretraining with a masked language model on a corpus of the target domain. Then, we multiply SPLADE-encoded sparse vectors by inverse document frequency weights to consider the importance of documents with low-frequency words. We conducted experiments using our method on datasets with a large vocabulary gap from the source domain. We show that our method outperforms the present state-of-the-art domain adaptation method. In addition, our method achieves state-of-the-art results when combined with BM25.
Pages: 752-765 · Citations: 3
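The IDF reweighting step described in the abstract can be sketched in a few lines. The corpus statistics and term weights below are hypothetical, and this is only an illustration of the weighting idea, not the authors' implementation:

```python
import math

def idf_weights(doc_freq, num_docs):
    """Smoothed inverse document frequency per vocabulary term."""
    return {term: math.log(num_docs / (1 + df)) for term, df in doc_freq.items()}

def reweight_sparse_vector(sparse_vec, idf):
    """Multiply a SPLADE-style sparse vector (term -> weight) by IDF,
    boosting the contribution of low-frequency terms."""
    return {term: w * idf.get(term, 0.0) for term, w in sparse_vec.items()}

# Hypothetical corpus statistics: "rituximab" is rare, "the" is common,
# so after reweighting the rare term dominates the representation.
idf = idf_weights({"the": 99, "rituximab": 2}, num_docs=100)
vec = reweight_sparse_vector({"the": 1.0, "rituximab": 1.0}, idf)
assert vec["rituximab"] > vec["the"]
```

The effect is that documents matched on rare domain-specific terms score higher, which is exactly the low-frequency exact-matching weakness the paper targets.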
Parsing linearizations appreciate PoS tags - but some are fussy about errors
AACL Bioflux Pub Date : 2022-10-27 DOI: 10.48550/arXiv.2210.15219
Alberto Muñoz-Ortiz, Mark Anderson, David Vilares, Carlos Gómez-Rodríguez
Abstract: PoS tags, once taken for granted as a useful resource for syntactic parsing, have become more situational with the popularization of deep learning. Recent work on the impact of PoS tags on graph- and transition-based parsers suggests that they are only useful when tagging accuracy is prohibitively high, or in low-resource scenarios. However, such an analysis is lacking for the emerging sequence labeling parsing paradigm, where it is especially relevant as some models explicitly use PoS tags for encoding and decoding. We undertake a study and uncover some trends. Among them, PoS tags are generally more useful for sequence labeling parsers than for other paradigms, but the impact of their accuracy is highly encoding-dependent, with the PoS-based head-selection encoding being best only when both tagging accuracy and resource availability are high.
Pages: 117-127 · Citations: 1
Outlier-Aware Training for Improving Group Accuracy Disparities
AACL Bioflux Pub Date : 2022-10-27 DOI: 10.48550/arXiv.2210.15183
Li-Kuang Chen, Canasai Kruengkrai, J. Yamagishi
Abstract: Methods addressing spurious correlations such as Just Train Twice (JTT; Liu et al., 2021) involve reweighting a subset of the training set to maximize the worst-group accuracy. However, the reweighted set of examples may potentially contain unlearnable examples that hamper the model's learning. We propose mitigating this by detecting outliers in the training set and removing them before reweighting. Our experiments show that our method achieves competitive or better accuracy compared with JTT and can detect and remove annotation errors in the subset being reweighted in JTT.
Pages: 54-60 · Citations: 0
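The detect-then-remove idea can be sketched as follows. This is a toy illustration of combining outlier removal with JTT-style upweighting, not the authors' procedure; the loss values, cutoff, and upweight factor are all hypothetical:

```python
def reweight_without_outliers(losses, upweight=5.0, outlier_cut=8.0):
    """Toy sketch of outlier-aware reweighting: high-loss examples form
    the error set to be upweighted (as in JTT), but examples whose loss
    exceeds an outlier cutoff are dropped as likely unlearnable or
    mislabeled before reweighting."""
    kept = [l for l in losses if l <= outlier_cut]
    mean = sum(kept) / len(kept)
    weights = []
    for l in losses:
        if l > outlier_cut:
            weights.append(0.0)        # outlier: remove before reweighting
        elif l > mean:
            weights.append(upweight)   # error set: upweight, JTT-style
        else:
            weights.append(1.0)
    return weights

# The extreme loss (50.0) is removed instead of being upweighted.
print(reweight_without_outliers([0.1, 0.2, 0.15, 2.0, 50.0]))
# [1.0, 1.0, 1.0, 5.0, 0.0]
```

Without the outlier cutoff, the 50.0-loss example would land in the error set and be upweighted, which is precisely the failure mode the paper aims to avoid.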
Performance-Efficiency Trade-Offs in Adapting Language Models to Text Classification Tasks
AACL Bioflux Pub Date : 2022-10-21 DOI: 10.48550/arXiv.2210.12022
Laura Aina, Nikos Voskarides
Abstract: Pre-trained language models (LMs) obtain state-of-the-art performance when adapted to text classification tasks. However, when using such models in real-world applications, efficiency considerations are paramount. In this paper, we study how different training procedures that adapt LMs to text classification perform as we vary model and train set size. More specifically, we compare standard fine-tuning, prompting, and knowledge distillation (KD) when the teacher was trained with either fine-tuning or prompting. Our findings suggest that even though fine-tuning and prompting work well to train large LMs on large train sets, there are more efficient alternatives that can reduce compute or data cost. Interestingly, we find that prompting combined with KD can reduce compute and data cost at the same time.
Pages: 244-253 · Citations: 0
Low-Resource Multilingual and Zero-Shot Multispeaker TTS
AACL Bioflux Pub Date : 2022-10-21 DOI: 10.48550/arXiv.2210.12223
Florian Lux, Julia Koch, Ngoc Thang Vu
Abstract: While neural methods for text-to-speech (TTS) have shown great advances in modeling multiple speakers, even in zero-shot settings, the amount of data needed for those approaches is generally not feasible for the vast majority of the world's over 6,000 spoken languages. In this work, we bring together the tasks of zero-shot voice cloning and multilingual low-resource TTS. Using the language agnostic meta learning (LAML) procedure and modifications to a TTS encoder, we show that it is possible for a system to learn to speak a new language using just 5 minutes of training data while retaining the ability to infer the voice of even unseen speakers in the newly learned language. We show the success of our proposed approach in terms of intelligibility, naturalness, and similarity to the target speaker using objective metrics as well as human studies, and provide our code and trained models open source.
Pages: 741-751 · Citations: 6
Modeling Document-level Temporal Structures for Building Temporal Dependency Graphs
AACL Bioflux Pub Date : 2022-10-21 DOI: 10.48550/arXiv.2210.11787
Prafulla Kumar Choubey, Ruihong Huang
Abstract: We propose to leverage news discourse profiling to model document-level temporal structures for building temporal dependency graphs. Our key observation is that the functional roles of sentences used for profiling news discourse signify different time frames relevant to a news story and can, therefore, help to recover the global temporal structure of a document. Our analyses and experiments with the widely used knowledge distillation technique show that discourse profiling effectively identifies distant inter-sentence event and (or) time expression pairs that are temporally related and otherwise difficult to locate.
Pages: 357-365 · Citations: 3