Transactions of the Association for Computational Linguistics最新文献_第9页

Modeling Non-Cooperative Dialogue: Theoretical and Empirical Insights 非合作对话建模：理论与实证研究

IF 10.9 1区计算机科学

Transactions of the Association for Computational Linguistics Pub Date : 2022-07-15 DOI: 10.1162/tacl_a_00507

Anthony Sicilia, Tristan D. Maidment, Pat Healy, Malihe Alikhani

引用次数: 2

Getting BART to Ride the Idiomatic Train: Learning to Represent Idiomatic Expressions 让BART搭上习语列车：学会表达习语

IF 10.9 1区计算机科学

Transactions of the Association for Computational Linguistics Pub Date : 2022-07-08 DOI: 10.1162/tacl_a_00510

Ziheng Zeng, S. Bhat

引用次数: 1

Meta-Learning the Difference: Preparing Large Language Models for Efficient Adaptation 元学习差异:为有效适应准备大型语言模型

IF 10.9 1区计算机科学

Transactions of the Association for Computational Linguistics Pub Date : 2022-07-07 DOI: 10.1162/tacl_a_00517

Zejiang Hou, Julian Salazar, George Polovets

引用次数: 7

The Parallelism Tradeoff: Limitations of Log-Precision Transformers 并行性权衡:对数精度变压器的局限性

IF 10.9 1区计算机科学

Transactions of the Association for Computational Linguistics Pub Date : 2022-07-02 DOI: 10.1162/tacl_a_00562

William Cooper Merrill, Ashish Sabharwal

{"title":"The Parallelism Tradeoff: Limitations of Log-Precision Transformers","authors":"William Cooper Merrill, Ashish Sabharwal","doi":"10.1162/tacl_a_00562","DOIUrl":"https://doi.org/10.1162/tacl_a_00562","url":null,"abstract":"Despite their omnipresence in modern NLP, characterizing the computational power of transformer neural nets remains an interesting open question. We prove that transformers whose arithmetic precision is logarithmic in the number of input tokens (and whose feedforward nets are computable using space linear in their input) can be simulated by constant-depth logspace-uniform threshold circuits. This provides insight on the power of transformers using known results in complexity theory. For example, if L≠P (i.e., not all poly-time problems can be solved using logarithmic space), then transformers cannot even accurately solve linear equalities or check membership in an arbitrary context-free grammar with empty productions. Our result intuitively emerges from the transformer architecture’s high parallelizability. We thus speculatively introduce the idea of a fundamental parallelism tradeoff: any model architecture as parallelizable as the transformer will obey limitations similar to it. Since parallelism is key to training models at massive scale, this suggests a potential inherent weakness of the scaling paradigm.","PeriodicalId":33559,"journal":{"name":"Transactions of the Association for Computational Linguistics","volume":"11 1","pages":"531-545"},"PeriodicalIF":10.9,"publicationDate":"2022-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46501624","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 11

InSCIt: Information-Seeking Conversations with Mixed-Initiative Interactions InSCIt:混合主动互动的信息寻求对话

IF 10.9 1区计算机科学

Transactions of the Association for Computational Linguistics Pub Date : 2022-07-02 DOI: 10.1162/tacl_a_00559

Zeqiu Wu, Ryu Parish, Hao Cheng, Sewon Min, Prithviraj Ammanabrolu, Mari Ostendorf, Hannaneh Hajishirzi

{"title":"InSCIt: Information-Seeking Conversations with Mixed-Initiative Interactions","authors":"Zeqiu Wu, Ryu Parish, Hao Cheng, Sewon Min, Prithviraj Ammanabrolu, Mari Ostendorf, Hannaneh Hajishirzi","doi":"10.1162/tacl_a_00559","DOIUrl":"https://doi.org/10.1162/tacl_a_00559","url":null,"abstract":"In an information-seeking conversation, a user may ask questions that are under-specified or unanswerable. An ideal agent would interact by initiating different response types according to the available knowledge sources. However, most current studies either fail to or artificially incorporate such agent-side initiative. This work presents InSCIt, a dataset for Information-Seeking Conversations with mixed-initiative Interactions. It contains 4.7K user-agent turns from 805 human-human conversations where the agent searches over Wikipedia and either directly answers, asks for clarification, or provides relevant information to address user queries. The data supports two subtasks, evidence passage identification and response generation, as well as a human evaluation protocol to assess model performance. We report results of two systems based on state-of-the-art models of conversational knowledge identification and open-domain question answering. Both systems significantly underperform humans, suggesting ample room for improvement in future studies.1","PeriodicalId":33559,"journal":{"name":"Transactions of the Association for Computational Linguistics","volume":"11 1","pages":"453-468"},"PeriodicalIF":10.9,"publicationDate":"2022-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43591966","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 5

Conditional Generation with a Question-Answering Blueprint 带问答蓝图的条件生成

IF 10.9 1区计算机科学

Transactions of the Association for Computational Linguistics Pub Date : 2022-07-01 DOI: 10.1162/tacl_a_00583

Shashi Narayan, Joshua Maynez, Reinald Kim Amplayo, Kuzman Ganchev, Annie Louis, Fantine Huot, Dipanjan Das, Mirella Lapata

{"title":"Conditional Generation with a Question-Answering Blueprint","authors":"Shashi Narayan, Joshua Maynez, Reinald Kim Amplayo, Kuzman Ganchev, Annie Louis, Fantine Huot, Dipanjan Das, Mirella Lapata","doi":"10.1162/tacl_a_00583","DOIUrl":"https://doi.org/10.1162/tacl_a_00583","url":null,"abstract":"Abstract The ability to convey relevant and faithful information is critical for many tasks in conditional generation and yet remains elusive for neural seq-to-seq models whose outputs often reveal hallucinations and fail to correctly cover important details. In this work, we advocate planning as a useful intermediate representation for rendering conditional generation less opaque and more grounded. We propose a new conceptualization of text plans as a sequence of question-answer (QA) pairs and enhance existing datasets (e.g., for summarization) with a QA blueprint operating as a proxy for content selection (i.e., what to say) and planning (i.e., in what order). We obtain blueprints automatically by exploiting state-of-the-art question generation technology and convert input-output pairs into input-blueprint-output tuples. We develop Transformer-based models, each varying in how they incorporate the blueprint in the generated output (e.g., as a global plan or iteratively). Evaluation across metrics and datasets demonstrates that blueprint models are more factual than alternatives which do not resort to planning and allow tighter control of the generation output.","PeriodicalId":33559,"journal":{"name":"Transactions of the Association for Computational Linguistics","volume":"11 1","pages":"974-996"},"PeriodicalIF":10.9,"publicationDate":"2022-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45704432","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 19

On the Robustness of Dialogue History Representation in Conversational Question Answering: A Comprehensive Study and a New Prompt-based Method 会话历史表征在会话问答中的稳健性研究——一种基于提示的新方法

IF 10.9 1区计算机科学

Transactions of the Association for Computational Linguistics Pub Date : 2022-06-29 DOI: 10.1162/tacl_a_00549

Zorik Gekhman, Nadav Oved, Orgad Keller, Idan Szpektor, Roi Reichart

{"title":"On the Robustness of Dialogue History Representation in Conversational Question Answering: A Comprehensive Study and a New Prompt-based Method","authors":"Zorik Gekhman, Nadav Oved, Orgad Keller, Idan Szpektor, Roi Reichart","doi":"10.1162/tacl_a_00549","DOIUrl":"https://doi.org/10.1162/tacl_a_00549","url":null,"abstract":"Most work on modeling the conversation history in Conversational Question Answering (CQA) reports a single main result on a common CQA benchmark. While existing models show impressive results on CQA leaderboards, it remains unclear whether they are robust to shifts in setting (sometimes to more realistic ones), training data size (e.g., from large to small sets) and domain. In this work, we design and conduct the first large-scale robustness study of history modeling approaches for CQA. We find that high benchmark scores do not necessarily translate to strong robustness, and that various methods can perform extremely differently under different settings. Equipped with the insights from our study, we design a novel prompt-based history modeling approach and demonstrate its strong robustness across various settings. Our approach is inspired by existing methods that highlight historic answers in the passage. However, instead of highlighting by modifying the passage token embeddings, we add textual prompts directly in the passage text. Our approach is simple, easy to plug into practically any model, and highly effective, thus we recommend it as a starting point for future model developers. We also hope that our study and insights will raise awareness to the importance of robustness-focused evaluation, in addition to obtaining high leaderboard scores, leading to better CQA systems.1","PeriodicalId":33559,"journal":{"name":"Transactions of the Association for Computational Linguistics","volume":"11 1","pages":"351-366"},"PeriodicalIF":10.9,"publicationDate":"2022-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45171232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 3

Dependency Parsing with Backtracking using Deep Reinforcement Learning 依赖解析与回溯使用深度强化学习

IF 10.9 1区计算机科学

Transactions of the Association for Computational Linguistics Pub Date : 2022-06-28 DOI: 10.1162/tacl_a_00496

Franck Dary, M. Petit, Alexis Nasr

引用次数: 0

DP-Parse: Finding Word Boundaries from Raw Speech with an Instance Lexicon DP解析：使用实例词典从原始语音中查找单词边界

IF 10.9 1区计算机科学

Transactions of the Association for Computational Linguistics Pub Date : 2022-06-22 DOI: 10.1162/tacl_a_00505

Robin Algayres, Tristan Ricoul, Julien Karadayi, Hugo Laurenccon, Salah Zaiem, Abdel-rahman Mohamed, Benoît Sagot, E. Dupoux

引用次数: 6

Questions Are All You Need to Train a Dense Passage Retriever 训练密集通道寻回犬所需的全部问题

IF 10.9 1区计算机科学

Transactions of the Association for Computational Linguistics Pub Date : 2022-06-21 DOI: 10.1162/tacl_a_00564

Devendra Singh Sachan, M. Lewis, Dani Yogatama, Luke Zettlemoyer, J. Pineau, M. Zaheer

{"title":"Questions Are All You Need to Train a Dense Passage Retriever","authors":"Devendra Singh Sachan, M. Lewis, Dani Yogatama, Luke Zettlemoyer, J. Pineau, M. Zaheer","doi":"10.1162/tacl_a_00564","DOIUrl":"https://doi.org/10.1162/tacl_a_00564","url":null,"abstract":"We introduce ART, a new corpus-level autoencoding approach for training dense retrieval models that does not require any labeled training data. Dense retrieval is a central challenge for open-domain tasks, such as Open QA, where state-of-the-art methods typically require large supervised datasets with custom hard-negative mining and denoising of positive examples. ART, in contrast, only requires access to unpaired inputs and outputs (e.g., questions and potential answer passages). It uses a new passage-retrieval autoencoding scheme, where (1) an input question is used to retrieve a set of evidence passages, and (2) the passages are then used to compute the probability of reconstructing the original question. Training for retrieval based on question reconstruction enables effective unsupervised learning of both passage and question encoders, which can be later incorporated into complete Open QA systems without any further finetuning. Extensive experiments demonstrate that ART obtains state-of-the-art results on multiple QA retrieval benchmarks with only generic initialization from a pre-trained language model, removing the need for labeled data and task-specific losses.1 Our code and model checkpoints are available at: https://github.com/DevSinghSachan/art.","PeriodicalId":33559,"journal":{"name":"Transactions of the Association for Computational Linguistics","volume":"11 1","pages":"600-616"},"PeriodicalIF":10.9,"publicationDate":"2022-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43642220","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 22