North American Chapter of the Association for Computational Linguistics: Latest Papers (Page 2)

Disentangled Action Recognition with Knowledge Bases

North American Chapter of the Association for Computational Linguistics Pub Date : 2022-07-04 DOI: 10.48550/arXiv.2207.01708

Zhekun Luo, Shalini Ghosh, Devin Guillory, Keizo Kato, Trevor Darrell, Huijuan Xu

Abstract: Action in video usually involves the interaction of humans with objects. Action labels are typically composed of various combinations of verbs and nouns, but we may not have training data for all possible combinations. In this paper, we aim to improve the generalization ability of the compositional action recognition model to novel verbs or novel nouns that are unseen during training time, by leveraging the power of knowledge graphs. Previous work utilizes verb-noun compositional action nodes in the knowledge graph, making it inefficient to scale since the number of compositional action nodes grows quadratically with respect to the number of verbs and nouns. To address this issue, we propose our approach: Disentangled Action Recognition with Knowledge-bases (DARK), which leverages the inherent compositionality of actions. DARK trains a factorized model by first extracting disentangled feature representations for verbs and nouns, and then predicting classification weights using relations in external knowledge graphs. The type constraint between verb and noun is extracted from external knowledge bases and finally applied when composing actions. DARK has better scalability in the number of objects and verbs, and achieves state-of-the-art performance on the Charades dataset. We further propose a new benchmark split based on the Epic-kitchen dataset which is an order of magnitude bigger in the numbers of classes and samples, and benchmark various models on this benchmark.
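The scaling argument in the DARK abstract can be made concrete with a small sketch. The node counts follow directly from the abstract (one node per verb-noun pair versus separate verb and noun nodes). The scoring function is only an illustration: multiplying independent verb and noun scores and applying a hard allow-list as the type constraint are assumptions made here, not the paper's formulation, and all function names are hypothetical.

```python
def compositional_node_count(num_verbs, num_nouns):
    """One graph node per (verb, noun) action: grows with the product."""
    return num_verbs * num_nouns


def factorized_node_count(num_verbs, num_nouns):
    """Separate verb nodes and noun nodes: grows with the sum."""
    return num_verbs + num_nouns


def compose_action_scores(verb_scores, noun_scores, allowed_pairs):
    """Combine independent verb and noun scores into action scores.

    Assumption: an action's score is the product of its verb and noun
    scores, and pairs outside `allowed_pairs` (a stand-in for the type
    constraint) are dropped entirely.
    """
    return {
        (verb, noun): v_score * n_score
        for verb, v_score in verb_scores.items()
        for noun, n_score in noun_scores.items()
        if (verb, noun) in allowed_pairs
    }


if __name__ == "__main__":
    print(compositional_node_count(100, 300))  # product of the two counts
    print(factorized_node_count(100, 300))     # sum of the two counts
    scores = compose_action_scores(
        {"open": 0.5, "drink": 0.5},
        {"door": 0.5, "water": 0.25},
        {("open", "door"), ("drink", "water")},
    )
    print(scores)
```

With 100 verbs and 300 nouns, the compositional graph needs 30,000 action nodes while the factorized one needs 400, which is the gap the abstract refers to.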

Citations: 9

Generating Repetitions with Appropriate Repeated Words

North American Chapter of the Association for Computational Linguistics Pub Date : 2022-07-03 DOI: 10.48550/arXiv.2207.00929

Toshiki Kawamoto, Hidetaka Kamigaito, Kotaro Funakoshi, M. Okumura

Citations: 1

Masked Part-Of-Speech Model: Does Modeling Long Context Help Unsupervised POS-tagging?

North American Chapter of the Association for Computational Linguistics Pub Date : 2022-06-30 DOI: 10.48550/arXiv.2206.14969

Xiang Zhou, Shiyue Zhang, Mohit Bansal

Abstract: Previous Part-Of-Speech (POS) induction models usually make certain independence assumptions (e.g., Markov, unidirectional, local dependency) that do not hold in real languages. For example, the subject-verb agreement can be both long-term and bidirectional. To facilitate flexible dependency modeling, we propose a Masked Part-of-Speech Model (MPoSM), inspired by the recent success of Masked Language Models (MLM). MPoSM can model arbitrary tag dependency and perform POS induction through the objective of masked POS reconstruction. We achieve competitive results on both the English Penn WSJ dataset and the universal treebank containing 10 diverse languages. Though modeling the long-term dependency should ideally help this task, our ablation study shows mixed trends in different languages. To better understand this phenomenon, we design a novel synthetic experiment that can specifically diagnose the model's ability to learn tag agreement. Surprisingly, we find that even strong baselines fail to solve this problem consistently in a very simplified setting: the agreement between adjacent words. Nonetheless, MPoSM achieves overall better performance. Lastly, we conduct a detailed error analysis to shed light on other remaining challenges.

Citations: 0

Analyzing Encoded Concepts in Transformer Language Models

North American Chapter of the Association for Computational Linguistics Pub Date : 2022-06-27 DOI: 10.48550/arXiv.2206.13289

Hassan Sajjad, Nadir Durrani, Fahim Dalvi, Firoj Alam, A. Khan, Jia Xu

Citations: 15

Do Trajectories Encode Verb Meaning?

North American Chapter of the Association for Computational Linguistics Pub Date : 2022-06-23 DOI: 10.48550/arXiv.2206.11953

Dylan Ebert, Chen Sun, Ellie Pavlick

Citations: 2

Theory-Grounded Measurement of U.S. Social Stereotypes in English Language Models

North American Chapter of the Association for Computational Linguistics Pub Date : 2022-06-23 DOI: 10.48550/arXiv.2206.11684

Yang Trista Cao, Anna Sotnikova, Hal Daumé III, Rachel Rudinger, L. Zou

Citations: 14

Automatic Correction of Human Translations

North American Chapter of the Association for Computational Linguistics Pub Date : 2022-06-17 DOI: 10.48550/arXiv.2206.08593

Jessy Lin, G. Kovács, Aditya Shastry, Joern Wuebker, John DeNero

Abstract: We introduce translation error correction (TEC), the task of automatically correcting human-generated translations. Imperfections in machine translations (MT) have long motivated systems for improving translations post-hoc with automatic post-editing. In contrast, little attention has been devoted to the problem of automatically correcting human translations, despite the intuition that humans make distinct errors that machines would be well-suited to assist with, from typos to inconsistencies in translation conventions. To investigate this, we build and release the Aced corpus with three TEC datasets (available at: github.com/lilt/tec). We show that human errors in TEC exhibit a more diverse range of errors and far fewer translation fluency errors than the MT errors in automatic post-editing datasets, suggesting the need for dedicated TEC models that are specialized to correct human errors. We show that pre-training instead on synthetic errors based on human errors improves TEC F-score by as much as 5.1 points. We conducted a human-in-the-loop user study with nine professional translation editors and found that the assistance of our TEC system led them to produce significantly higher quality revised translations.

Citations: 3

Multimodal Dialogue State Tracking

North American Chapter of the Association for Computational Linguistics Pub Date : 2022-06-16 DOI: 10.48550/arXiv.2206.07898

Hung Le, Nancy F. Chen, S. Hoi

Abstract: Designed for tracking user goals in dialogues, a dialogue state tracker is an essential component in a dialogue system. However, research on dialogue state tracking has largely been limited to unimodality, in which slots and slot values are limited by knowledge domains (e.g. restaurant domain with slots of restaurant name and price range) and are defined by specific database schema. In this paper, we propose to extend the definition of dialogue state tracking to multimodality. Specifically, we introduce a novel dialogue state tracking task to track the information of visual objects that are mentioned in video-grounded dialogues. Each new dialogue utterance may introduce a new video segment, new visual objects, or new object attributes, and a state tracker is required to update these information slots accordingly. We created a new synthetic benchmark and designed a novel baseline, Video-Dialogue Transformer Network (VDTN), for this task. VDTN combines both object-level features and segment-level features and learns contextual dependencies between videos and dialogues to generate multimodal dialogue states. We optimized VDTN for a state generation task as well as a self-supervised video understanding task which recovers video segment or object representations. Finally, we trained VDTN to use the decoded states in a response prediction task. Together with comprehensive ablation and qualitative analysis, we discovered interesting insights towards building more capable multimodal dialogue systems.
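The state-update behaviour described in the Multimodal Dialogue State Tracking abstract (each utterance may introduce a new video segment, new visual objects, or new object attributes) can be pictured with a small data structure. This is a minimal sketch of the bookkeeping only: the `DialogueState` class, the `update_state` function, and the slot layout are hypothetical, and nothing here reflects how VDTN itself represents or predicts states.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class DialogueState:
    # Current video segment as (start, end); None until one is mentioned.
    segment: Optional[Tuple[int, int]] = None
    # Visual objects mentioned so far, each with its attribute slots.
    objects: Dict[str, Dict[str, str]] = field(default_factory=dict)


def update_state(state, segment=None, object_attributes=None):
    """Return a new state with one turn's information merged in.

    A turn may replace the segment, add new objects, or add attributes
    to known objects. The input state is left unmodified.
    """
    new_segment = segment if segment is not None else state.segment
    new_objects = {obj: dict(attrs) for obj, attrs in state.objects.items()}
    for obj, attrs in (object_attributes or {}).items():
        new_objects.setdefault(obj, {}).update(attrs)
    return DialogueState(new_segment, new_objects)


if __name__ == "__main__":
    turn1 = update_state(
        DialogueState(),
        segment=(0, 36),
        object_attributes={"obj1": {"color": "red"}},
    )
    turn2 = update_state(
        turn1,
        object_attributes={"obj1": {"shape": "cube"}, "obj2": {"color": "blue"}},
    )
    print(turn2)
```

Returning a fresh state per turn keeps earlier turns available for comparison, which is convenient when scoring predicted states turn by turn.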

Citations: 5

NewsEdits: A News Article Revision Dataset and a Novel Document-Level Reasoning Challenge

North American Chapter of the Association for Computational Linguistics Pub Date : 2022-06-14 DOI: 10.48550/arXiv.2206.07106

Alexander Spangher, Xiang Ren, Jonathan May, Nanyun Peng

Abstract: News article revision histories provide clues to narrative and factual evolution in news articles. To facilitate analysis of this evolution, we present the first publicly available dataset of news revision histories, NewsEdits. Our dataset is large-scale and multilingual; it contains 1.2 million articles with 4.6 million versions from over 22 English- and French-language newspaper sources based in three countries, spanning 15 years of coverage (2006-2021). We define article-level edit actions: Addition, Deletion, Edit and Refactor, and develop a high-accuracy extraction algorithm to identify these actions. To underscore the factual nature of many edit actions, we conduct analyses showing that added and deleted sentences are more likely to contain updating events, main content and quotes than unchanged sentences. Finally, to explore whether edit actions are predictable, we introduce three novel tasks aimed at predicting actions performed during version updates. We show that these tasks are possible for expert humans but are challenging for large NLP models. We hope this can spur research in narrative framing and help provide predictive tools for journalists chasing breaking news.
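To illustrate what labelling edit actions between two article versions might look like, here is a rough sketch. The NewsEdits abstract does not describe its extraction algorithm, so this is a stand-in built on Python's `difflib`: sentences missing from the new version are paired with their most similar new sentence and called an Edit if the similarity clears a threshold, otherwise a Deletion, and leftover new sentences are Additions. The function name, the `0.6` threshold, and the omission of Refactor (moved sentences) are all assumptions made to keep the example short.

```python
from difflib import SequenceMatcher


def label_edit_actions(old_sentences, new_sentences, edit_threshold=0.6):
    """Label sentence-level changes between two versions of an article.

    Returns a list of (action, old_sentence, new_sentence) tuples, where
    action is "Edit", "Deletion", or "Addition".
    """
    old_set, new_set = set(old_sentences), set(new_sentences)
    removed = [s for s in old_sentences if s not in new_set]
    unmatched_added = [s for s in new_sentences if s not in old_set]

    def similarity(a, b):
        # Character-level similarity in [0, 1].
        return SequenceMatcher(None, a, b).ratio()

    actions = []
    for old in removed:
        # Pair the removed sentence with its closest unmatched new sentence.
        best = max(unmatched_added, key=lambda new: similarity(old, new), default=None)
        if best is not None and similarity(old, best) >= edit_threshold:
            actions.append(("Edit", old, best))
            unmatched_added.remove(best)
        else:
            actions.append(("Deletion", old, None))
    actions.extend(("Addition", None, new) for new in unmatched_added)
    return actions


if __name__ == "__main__":
    v1 = [
        "The mayor spoke on Monday.",
        "Two people were injured.",
        "Police are investigating.",
    ]
    v2 = [
        "The mayor spoke on Tuesday.",
        "Police are investigating.",
        "A suspect was arrested.",
    ]
    for action in label_edit_actions(v1, v2):
        print(action)
```

A greedy pairing like this is order-dependent and can mislabel heavily rewritten sentences, so it should be read as a way to picture the action types rather than as a faithful reimplementation.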

Citations: 12

CoSe-Co: Text Conditioned Generative CommonSense Contextualizer

North American Chapter of the Association for Computational Linguistics Pub Date : 2022-06-12 DOI: 10.48550/arXiv.2206.05706

Rachit Bansal, Milan Aggarwal, S. Bhatia, Jivat Neet Kaur, Balaji Krishnamurthy

Abstract: Pre-trained Language Models (PTLMs) have been shown to perform well on natural language tasks. Many prior works have leveraged structured commonsense present in the form of entities linked through labeled relations in Knowledge Graphs (KGs) to assist PTLMs. Retrieval approaches use KG as a separate static module which limits coverage since KGs contain finite knowledge. Generative methods train PTLMs on KG triples to improve the scale at which knowledge can be obtained. However, training on symbolic KG entities limits their applicability in tasks involving natural language text where they ignore overall context. To mitigate this, we propose a CommonSense Contextualizer (CoSe-Co) conditioned on sentences as input to make it generically usable in tasks for generating knowledge relevant to the overall context of input text. To train CoSe-Co, we propose a novel dataset comprising sentence and commonsense knowledge pairs. The knowledge inferred by CoSe-Co is diverse and contains novel entities not present in the underlying KG. We augment generated knowledge in Multi-Choice QA and Open-ended CommonSense Reasoning tasks leading to improvements over current best methods on CSQA, ARC, QASC and OBQA datasets. We also demonstrate its applicability in improving performance of a baseline model for the paraphrase generation task.

Citations: 3