Conference on Empirical Methods in Natural Language Processing: Latest Papers (Page 8)

Structural Priming Demonstrates Abstract Grammatical Representations in Multilingual Language Models

Conference on Empirical Methods in Natural Language Processing Pub Date : 2023-11-15 DOI: 10.48550/arXiv.2311.09194

J. Michaelov, Catherine Arnett, Tyler A. Chang, Benjamin K. Bergen

Abstract: Abstract grammatical knowledge - of parts of speech and grammatical patterns - is key to the capacity for linguistic generalization in humans. But how abstract is grammatical knowledge in large language models? In the human literature, compelling evidence for grammatical abstraction comes from structural priming. A sentence that shares the same grammatical structure as a preceding sentence is processed and produced more readily. Because confounds exist when using stimuli in a single language, evidence of abstraction is even more compelling from crosslingual structural priming, where use of a syntactic structure in one language primes an analogous structure in another language. We measure crosslingual structural priming in large language models, comparing model behavior to human experimental results from eight crosslingual experiments covering six languages, and four monolingual structural priming experiments in three non-English languages. We find evidence for abstract monolingual and crosslingual grammatical representations in the models that function similarly to those found in humans. These results demonstrate that grammatical representations in multilingual language models are not only similar across languages, but they can causally influence text produced in different languages.

Pages: 3703-3720

Citations: 0

Token Prediction as Implicit Classification to Identify LLM-Generated Text

Conference on Empirical Methods in Natural Language Processing Pub Date : 2023-11-15 DOI: 10.48550/arXiv.2311.08723

Yutian Chen, Hao Kang, Vivian Zhai, Liangze Li, Rita Singh, Bhiksha Raj

Citations: 0

AART: AI-Assisted Red-Teaming with Diverse Data Generation for New LLM-powered Applications

Conference on Empirical Methods in Natural Language Processing Pub Date : 2023-11-14 DOI: 10.48550/arXiv.2311.08592

Bhaktipriya Radharapu, Kevin Robinson, Lora Aroyo, Preethi Lahoti

Abstract: Adversarial testing of large language models (LLMs) is crucial for their safe and responsible deployment. We introduce a novel approach for automated generation of adversarial evaluation datasets to test the safety of LLM generations on new downstream applications. We call it AI-assisted Red-Teaming (AART) - an automated alternative to current manual red-teaming efforts. AART offers a data generation and augmentation pipeline of reusable and customizable recipes that reduce human effort significantly and enable integration of adversarial testing earlier in new product development. AART generates evaluation datasets with high diversity of content characteristics critical for effective adversarial testing (e.g. sensitive and harmful concepts, specific to a wide range of cultural and geographic regions and application scenarios). The data generation is steered by AI-assisted recipes to define, scope and prioritize diversity within the application context. This feeds into a structured LLM-generation process that scales up evaluation priorities. Compared to some state-of-the-art tools, AART shows promising results in terms of concept coverage and data quality.

Pages: 380-395

Citations: 0

Improving Image Captioning via Predicting Structured Concepts

Conference on Empirical Methods in Natural Language Processing Pub Date : 2023-11-14 DOI: 10.48550/arXiv.2311.08223

Ting Wang, Weidong Chen, Yuanhe Tian, Yan Song, Zhendong Mao

Abstract: Because of the difficulty of bridging the semantic gap between images and texts in the image captioning task, conventional studies in this area have treated semantic concepts as a bridge between the two modalities and improved captioning performance accordingly. Although promising results on concept prediction were obtained, these studies normally ignore the relationships among concepts, which rely not only on objects in the image but also on word dependencies in the text, and which therefore offer considerable potential for improving the generation of good descriptions. In this paper, we propose a structured concept predictor (SCP) to predict concepts and their structures, and then integrate them into captioning, so as to enhance the contribution of visual signals in this task via concepts and further use their relations to distinguish cross-modal semantics for better description generation. In particular, we design weighted graph convolutional networks (W-GCN) to depict concept relations driven by word dependencies, and then learn differentiated contributions from these concepts for the subsequent decoding process. Our approach therefore captures potential relations among concepts and discriminatively learns different concepts, which effectively facilitates image captioning with inherited information across modalities. Extensive experiments and their results demonstrate the effectiveness of our approach as well as each proposed module in this work.

Pages: 360-370

Citations: 0

How Well Do Text Embedding Models Understand Syntax?

Conference on Empirical Methods in Natural Language Processing Pub Date : 2023-11-14 DOI: 10.48550/arXiv.2311.07996

Yan Zhang, Zhaopeng Feng, Zhiyang Teng, Zuozhu Liu, Haizhou Li

Abstract: Text embedding models have significantly contributed to advancements in natural language processing by adeptly capturing semantic properties of textual data. However, the ability of these models to generalize across a wide range of syntactic contexts remains under-explored. In this paper, we first develop an evaluation set, named SR, to scrutinize the capability for syntax understanding of text embedding models from two crucial syntactic aspects: Structural heuristics, and Relational understanding among concepts, as revealed by the performance gaps in previous studies. Our findings reveal that existing text embedding models have not sufficiently addressed these syntactic understanding challenges, and such ineffectiveness becomes even more apparent when evaluated against existing benchmark datasets. Furthermore, we conduct rigorous analysis to unearth factors that lead to such limitations and examine why previous evaluations fail to detect such ineffectiveness. Lastly, we propose strategies to augment the generalization ability of text embedding models in diverse syntactic scenarios. This study serves to highlight the hurdles associated with syntactic generalization and provides pragmatic guidance for boosting model performance across varied syntactic contexts.

Pages: 9717-9728

Citations: 0

Simple and Effective Input Reformulations for Translation

Conference on Empirical Methods in Natural Language Processing Pub Date : 2023-11-12 DOI: 10.18653/v1/2023.emnlp-main.638

Brian Yu, Hansen Lillemark, Kurt Keutzer

Citations: 0

Dialogizer: Context-aware Conversational-QA Dataset Generation from Textual Sources

Conference on Empirical Methods in Natural Language Processing Pub Date : 2023-11-09 DOI: 10.48550/arXiv.2311.07589

Yerin Hwang, Yongil Kim, Hyunkyung Bae, Jeesoo Bang, Hwanhee Lee, Kyomin Jung

Abstract: To address the data scarcity issue in Conversational question answering (ConvQA), a dialog inpainting method, which utilizes documents to generate ConvQA datasets, has been proposed. However, the original dialog inpainting model is trained solely on the dialog reconstruction task, resulting in the generation of questions with low contextual relevance due to insufficient learning of question-answer alignment. To overcome this limitation, we propose a novel framework called Dialogizer, which has the capability to automatically generate ConvQA datasets with high contextual relevance from textual sources. The framework incorporates two training tasks: question-answer matching (QAM) and topic-aware dialog generation (TDG). Moreover, re-ranking is conducted during the inference phase based on the contextual relevance of the generated questions. Using our framework, we produce four ConvQA datasets by utilizing documents from multiple domains as the primary source. Through automatic evaluation using diverse metrics, as well as human evaluation, we validate that our proposed framework exhibits the ability to generate datasets of higher quality compared to the baseline dialog inpainting model.

Pages: 8806-8828

Citations: 0

Visually Grounded Continual Language Learning with Selective Specialization

Conference on Empirical Methods in Natural Language Processing Pub Date : 2023-10-24 DOI: 10.18653/v1/2023.findings-emnlp.469

Kyra Ahrens, Lennart Bengtson, Jae Hee Lee, Stefan Wermter

Abstract: A desirable trait of an artificial agent acting in the visual world is to continually learn a sequence of language-informed tasks while striking a balance between sufficiently specializing in each task and building a generalized knowledge for transfer. Selective specialization, i.e., a careful selection of model components to specialize in each task, is a strategy to provide control over this trade-off. However, the design of selection strategies requires insights on the role of each model component in learning rather specialized or generalizable representations, which poses a gap in current research. Thus, our aim with this work is to provide an extensive analysis of selection strategies for visually grounded continual language learning. Due to the lack of suitable benchmarks for this purpose, we introduce two novel diagnostic datasets that provide enough control and flexibility for a thorough model analysis. We assess various heuristics for module specialization strategies as well as quantifiable measures for two different types of model architectures. Finally, we design conceptually simple approaches based on our analysis that outperform common continual learning baselines. Our results demonstrate the need for further efforts towards better aligning continual learning algorithms with the learning behaviors of individual model parts.

Pages: 7037-7054

Citations: 0

Analysing State-Backed Propaganda Websites: a New Dataset and Linguistic Study

Conference on Empirical Methods in Natural Language Processing Pub Date : 2023-10-21 DOI: 10.18653/v1/2023.emnlp-main.349

Freddy Heppell, Kalina Bontcheva, Carolina Scarton

Citations: 0

DistillCSE: Distilled Contrastive Learning for Sentence Embeddings

Conference on Empirical Methods in Natural Language Processing Pub Date : 2023-10-20 DOI: 10.18653/v1/2023.findings-emnlp.547

Jiahao Xu, Wei Shao, Lihui Chen, Lemao Liu

Abstract: This paper proposes the DistillCSE framework, which performs contrastive learning under the self-training paradigm with knowledge distillation. The potential advantage of DistillCSE is its self-enhancing feature: using a base model to provide additional supervision signals, a stronger model may be learned through knowledge distillation. However, the vanilla DistillCSE through the standard implementation of knowledge distillation only achieves marginal improvements due to severe overfitting. Further quantitative analyses show the reason: under standard knowledge distillation, the teacher model's logits exhibit relatively large variance, owing to the nature of contrastive learning. To mitigate the issue induced by high variance, this paper proposes two simple yet effective solutions for knowledge distillation: a Group-P shuffling strategy as an implicit regularization, and averaging the logits from multiple teacher components. Experiments on standard benchmarks demonstrate that the proposed DistillCSE outperforms many strong baseline methods and yields a new state-of-the-art performance.

Pages: 8153-8165

Citations: 0