Annual Meeting of the Association for Computational Linguistics最新文献

筛选
英文 中文
Class-Adaptive Self-Training for Relation Extraction with Incompletely Annotated Training Data 不完全标注训练数据关系抽取的类自适应自训练
Annual Meeting of the Association for Computational Linguistics Pub Date : 2023-06-16 DOI: 10.48550/arXiv.2306.09697
Qingyu Tan, Lu Xu, Lidong Bing, H. Ng
{"title":"Class-Adaptive Self-Training for Relation Extraction with Incompletely Annotated Training Data","authors":"Qingyu Tan, Lu Xu, Lidong Bing, H. Ng","doi":"10.48550/arXiv.2306.09697","DOIUrl":"https://doi.org/10.48550/arXiv.2306.09697","url":null,"abstract":"Relation extraction (RE) aims to extract relations from sentences and documents. Existing relation extraction models typically rely on supervised machine learning. However, recent studies showed that many RE datasets are incompletely annotated. This is known as the false negative problem in which valid relations are falsely annotated as 'no_relation'. Models trained with such data inevitably make similar mistakes during the inference stage. Self-training has been proven effective in alleviating the false negative problem. However, traditional self-training is vulnerable to confirmation bias and exhibits poor performance in minority classes. To overcome this limitation, we proposed a novel class-adaptive re-sampling self-training framework. Specifically, we re-sampled the pseudo-labels for each class by precision and recall scores. Our re-sampling strategy favored the pseudo-labels of classes with high precision and low recall, which improved the overall recall without significantly compromising precision. We conducted experiments on document-level and biomedical relation extraction datasets, and the results showed that our proposed self-training framework consistently outperforms existing competitive methods on the Re-DocRED and ChemDisgene datasets when the training data are incompletely annotated. Our code is released at https://github.com/DAMO-NLP-SG/CAST.","PeriodicalId":352845,"journal":{"name":"Annual Meeting of the Association for Computational Linguistics","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123285478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
Differentiable Instruction Optimization for Cross-Task Generalization 跨任务泛化的可微指令优化
Annual Meeting of the Association for Computational Linguistics Pub Date : 2023-06-16 DOI: 10.48550/arXiv.2306.10098
Masaru Isonuma, Junichiro Mori, I. Sakata
{"title":"Differentiable Instruction Optimization for Cross-Task Generalization","authors":"Masaru Isonuma, Junichiro Mori, I. Sakata","doi":"10.48550/arXiv.2306.10098","DOIUrl":"https://doi.org/10.48550/arXiv.2306.10098","url":null,"abstract":"Instruction tuning has been attracting much attention to achieve generalization ability across a wide variety of tasks. Although various types of instructions have been manually created for instruction tuning, it is still unclear what kind of instruction is optimal to obtain cross-task generalization ability. This work presents instruction optimization, which optimizes training instructions with respect to generalization ability. Rather than manually tuning instructions, we introduce learnable instructions and optimize them with gradient descent by leveraging bilevel optimization. Experimental results show that the learned instruction enhances the diversity of instructions and improves the generalization ability compared to using only manually created instructions.","PeriodicalId":352845,"journal":{"name":"Annual Meeting of the Association for Computational Linguistics","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125761381","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Listener Model for the PhotoBook Referential Game with CLIPScores as Implicit Reference Chain 以CLIPScores为隐式参考链的PhotoBook参考博弈的听者模型
Annual Meeting of the Association for Computational Linguistics Pub Date : 2023-06-16 DOI: 10.48550/arXiv.2306.09607
Shih-Lun Wu, Yi-Hui Chou, Liang Li
{"title":"Listener Model for the PhotoBook Referential Game with CLIPScores as Implicit Reference Chain","authors":"Shih-Lun Wu, Yi-Hui Chou, Liang Li","doi":"10.48550/arXiv.2306.09607","DOIUrl":"https://doi.org/10.48550/arXiv.2306.09607","url":null,"abstract":"PhotoBook is a collaborative dialogue game where two players receive private, partially-overlapping sets of images and resolve which images they have in common.It presents machines with a great challenge to learn how people build common ground around multimodal context to communicate effectively.Methods developed in the literature, however, cannot be deployed to real gameplaysince they only tackle some subtasks of the game,and they require additional reference chains inputs, whose extraction process is imperfect.Therefore, we propose a reference chain-free listener modelthat directly addresses the game’s predictive task, i.e., deciding whether an image is shared with partner.Our DeBERTa-based listener model reads the full dialogue, and utilizesCLIPScore features to assess utterance-image relevance.We achieve >77% accuracy on unseen sets of images/game themes, outperforming baseline by >17 points.","PeriodicalId":352845,"journal":{"name":"Annual Meeting of the Association for Computational Linguistics","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116315245","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
AUGUST: an Automatic Generation Understudy for Synthesizing Conversational Recommendation Datasets 八月:合成会话推荐数据集的自动生成替代研究
Annual Meeting of the Association for Computational Linguistics Pub Date : 2023-06-16 DOI: 10.48550/arXiv.2306.09631
Yu Lu, Junwei Bao, Zichen Ma, Xiaoguang Han, Youzheng Wu, Shuguang Cui, Xiaodong He
{"title":"AUGUST: an Automatic Generation Understudy for Synthesizing Conversational Recommendation Datasets","authors":"Yu Lu, Junwei Bao, Zichen Ma, Xiaoguang Han, Youzheng Wu, Shuguang Cui, Xiaodong He","doi":"10.48550/arXiv.2306.09631","DOIUrl":"https://doi.org/10.48550/arXiv.2306.09631","url":null,"abstract":"High-quality data is essential for conversational recommendation systems and serves as the cornerstone of the network architecture development and training strategy design. Existing works contribute heavy human efforts to manually labeling or designing and extending recommender dialogue templates. However, they suffer from (i) the limited number of human annotators results in that datasets can hardly capture rich and large-scale cases in the real world, (ii) the limited experience and knowledge of annotators account for the uninformative corpus and inappropriate recommendations. In this paper, we propose a novel automatic dataset synthesis approach that can generate both large-scale and high-quality recommendation dialogues through a data2text generation process, where unstructured recommendation conversations are generated from structured graphs based on user-item information from the real world. In doing so, we comprehensively exploit: (i) rich personalized user profiles from traditional recommendation datasets, (ii) rich external knowledge from knowledge graphs, and (iii) the conversation ability contained in human-to-human conversational recommendation datasets. Extensive experiments validate the benefit brought by the automatically synthesized data under low-resource scenarios and demonstrate the promising potential to facilitate the development of a more effective conversational recommendation system.","PeriodicalId":352845,"journal":{"name":"Annual Meeting of the Association for Computational Linguistics","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133956999","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
Opinion Tree Parsing for Aspect-based Sentiment Analysis 基于方面的情感分析的意见树解析
Annual Meeting of the Association for Computational Linguistics Pub Date : 2023-06-15 DOI: 10.48550/arXiv.2306.08925
Xiaoyi Bao, Xiaotong Jiang, Zhongqing Wang, Yue Zhang, Guodong Zhou
{"title":"Opinion Tree Parsing for Aspect-based Sentiment Analysis","authors":"Xiaoyi Bao, Xiaotong Jiang, Zhongqing Wang, Yue Zhang, Guodong Zhou","doi":"10.48550/arXiv.2306.08925","DOIUrl":"https://doi.org/10.48550/arXiv.2306.08925","url":null,"abstract":"Extracting sentiment elements using pre-trained generative models has recently led to large improvements in aspect-based sentiment analysis benchmarks. However, these models always need large-scale computing resources, and they also ignore explicit modeling of structure between sentiment elements. To address these challenges, we propose an opinion tree parsing model, aiming to parse all the sentiment elements from an opinion tree, which is much faster, and can explicitly reveal a more comprehensive and complete aspect-level sentiment structure. In particular, we first introduce a novel context-free opinion grammar to normalize the opinion tree structure. We then employ a neural chart-based opinion tree parser to fully explore the correlations among sentiment elements and parse them into an opinion tree structure. Extensive experiments show the superiority of our proposed model and the capacity of the opinion tree parser with the proposed context-free opinion grammar. More importantly, the results also prove that our model is much faster than previous models.","PeriodicalId":352845,"journal":{"name":"Annual Meeting of the Association for Computational Linguistics","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126654858","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Towards Benchmarking and Improving the Temporal Reasoning Capability of Large Language Models 大型语言模型时间推理能力的标杆化与改进
Annual Meeting of the Association for Computational Linguistics Pub Date : 2023-06-15 DOI: 10.48550/arXiv.2306.08952
Qingyu Tan, H. Ng, Lidong Bing
{"title":"Towards Benchmarking and Improving the Temporal Reasoning Capability of Large Language Models","authors":"Qingyu Tan, H. Ng, Lidong Bing","doi":"10.48550/arXiv.2306.08952","DOIUrl":"https://doi.org/10.48550/arXiv.2306.08952","url":null,"abstract":"Reasoning about time is of fundamental importance. Many facts are time-dependent. For example, athletes change teams from time to time, and different government officials are elected periodically. Previous time-dependent question answering (QA) datasets tend to be biased in either their coverage of time spans or question types. In this paper, we introduce a comprehensive probing dataset TempReason to evaluate the temporal reasoning capability of large language models. Our dataset includes questions of three temporal reasoning levels. In addition, we also propose a novel learning framework to improve the temporal reasoning capability of large language models, based on temporal span extraction and time-sensitive reinforcement learning. We conducted experiments in closed book QA, open book QA, and reasoning QA settings and demonstrated the effectiveness of our approach.","PeriodicalId":352845,"journal":{"name":"Annual Meeting of the Association for Computational Linguistics","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130793971","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 6
Learning by Analogy: Diverse Questions Generation in Math Word Problem 类比学习:数学应用题中不同问题的生成
Annual Meeting of the Association for Computational Linguistics Pub Date : 2023-06-15 DOI: 10.48550/arXiv.2306.09064
Zihao Zhou, Maizhen Ning, Qiufeng Wang, Jie Yao, Wei Wang, Xiaowei Huang, Kaizhu Huang
{"title":"Learning by Analogy: Diverse Questions Generation in Math Word Problem","authors":"Zihao Zhou, Maizhen Ning, Qiufeng Wang, Jie Yao, Wei Wang, Xiaowei Huang, Kaizhu Huang","doi":"10.48550/arXiv.2306.09064","DOIUrl":"https://doi.org/10.48550/arXiv.2306.09064","url":null,"abstract":"Solving math word problem (MWP) with AI techniques has recently made great progress with the success of deep neural networks (DNN), but it is far from being solved. We argue that the ability of learning by analogy is essential for an MWP solver to better understand same problems which may typically be formulated in diverse ways. However most existing works exploit the shortcut learning to train MWP solvers simply based on samples with a single question. In lack of diverse questions, these methods merely learn shallow heuristics. In this paper, we make a first attempt to solve MWPs by generating diverse yet consistent questions/equations. Given a typical MWP including the scenario description, question, and equation (i.e., answer), we first generate multiple consistent equations via a group of heuristic rules. We then feed them to a question generator together with the scenario to obtain the corresponding diverse questions, forming a new MWP with a variety of questions and equations. Finally we engage a data filter to remove those unreasonable MWPs, keeping the high-quality augmented ones. To evaluate the ability of learning by analogy for an MWP solver, we generate a new MWP dataset (called DiverseMath23K) with diverse questions by extending the current benchmark Math23K. Extensive experimental results demonstrate that our proposed method can generate high-quality diverse questions with corresponding equations, further leading to performance improvement on Diverse-Math23K. The code and dataset is available at: https://github.com/zhouzihao501/DiverseMWP","PeriodicalId":352845,"journal":{"name":"Annual Meeting of the Association for Computational Linguistics","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117053429","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 3
Multi-target Backdoor Attacks for Code Pre-trained Models 代码预训练模型的多目标后门攻击
Annual Meeting of the Association for Computational Linguistics Pub Date : 2023-06-14 DOI: 10.48550/arXiv.2306.08350
Yanzhou Li, Shangqing Liu, Kangjie Chen, Xiaofei Xie, Tianwei Zhang, Yang Liu
{"title":"Multi-target Backdoor Attacks for Code Pre-trained Models","authors":"Yanzhou Li, Shangqing Liu, Kangjie Chen, Xiaofei Xie, Tianwei Zhang, Yang Liu","doi":"10.48550/arXiv.2306.08350","DOIUrl":"https://doi.org/10.48550/arXiv.2306.08350","url":null,"abstract":"Backdoor attacks for neural code models have gained considerable attention due to the advancement of code intelligence. However, most existing works insert triggers into task-specific data for code-related downstream tasks, thereby limiting the scope of attacks. Moreover, the majority of attacks for pre-trained models are designed for understanding tasks. In this paper, we propose task-agnostic backdoor attacks for code pre-trained models. Our backdoored model is pre-trained with two learning strategies (i.e., Poisoned Seq2Seq learning and token representation learning) to support the multi-target attack of downstream code understanding and generation tasks. During the deployment phase, the implanted backdoors in the victim models can be activated by the designed triggers to achieve the targeted attack. We evaluate our approach on two code understanding tasks and three code generation tasks over seven datasets. Extensive experimental results demonstrate that our approach effectively and stealthily attacks code-related downstream tasks.","PeriodicalId":352845,"journal":{"name":"Annual Meeting of the Association for Computational Linguistics","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114260980","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 3
Recipes for Sequential Pre-training of Multilingual Encoder and Seq2Seq Models 多语言编码器和Seq2Seq模型的顺序预训练方法
Annual Meeting of the Association for Computational Linguistics Pub Date : 2023-06-14 DOI: 10.48550/arXiv.2306.08756
Saleh Soltan, Andrew Rosenbaum, Tobias Falke, Qin Lu, Anna Rumshisky, Wael Hamza
{"title":"Recipes for Sequential Pre-training of Multilingual Encoder and Seq2Seq Models","authors":"Saleh Soltan, Andrew Rosenbaum, Tobias Falke, Qin Lu, Anna Rumshisky, Wael Hamza","doi":"10.48550/arXiv.2306.08756","DOIUrl":"https://doi.org/10.48550/arXiv.2306.08756","url":null,"abstract":"Pre-trained encoder-only and sequence-to-sequence (seq2seq) models each have advantages, however training both model types from scratch is computationally expensive. We explore recipes to improve pre-training efficiency by initializing one model from the other. (1) Extracting the encoder from a seq2seq model, we show it under-performs a Masked Language Modeling (MLM) encoder, particularly on sequence labeling tasks. Variations of masking during seq2seq training, reducing the decoder size, and continuing with a small amount of MLM training do not close the gap. (2) Conversely, using an encoder to warm-start seq2seq training, we show that by unfreezing the encoder partway through training, we can match task performance of a from-scratch seq2seq model. Overall, this two-stage approach is an efficient recipe to obtain both a multilingual encoder and a seq2seq model, matching the performance of training each model from scratch while reducing the total compute cost by 27%.","PeriodicalId":352845,"journal":{"name":"Annual Meeting of the Association for Computational Linguistics","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115501779","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
LiveChat: A Large-Scale Personalized Dialogue Dataset Automatically Constructed from Live Streaming LiveChat:一个从直播自动构建的大规模个性化对话数据集
Annual Meeting of the Association for Computational Linguistics Pub Date : 2023-06-14 DOI: 10.48550/arXiv.2306.08401
Jingsheng Gao, Yixin Lian, Ziyi Zhou, Yuzhuo Fu, Baoyuan Wang
{"title":"LiveChat: A Large-Scale Personalized Dialogue Dataset Automatically Constructed from Live Streaming","authors":"Jingsheng Gao, Yixin Lian, Ziyi Zhou, Yuzhuo Fu, Baoyuan Wang","doi":"10.48550/arXiv.2306.08401","DOIUrl":"https://doi.org/10.48550/arXiv.2306.08401","url":null,"abstract":"Open-domain dialogue systems have made promising progress in recent years. While the state-of-the-art dialogue agents are built upon large-scale social media data and large pre-trained models, there is no guarantee these agents could also perform well in fast-growing scenarios, such as live streaming, due to the bounded transferability of pre-trained models and biased distributions of public datasets from Reddit and Weibo, etc. To improve the essential capability of responding and establish a benchmark in the live open-domain scenario, we introduce the LiveChat dataset, composed of 1.33 million real-life Chinese dialogues with almost 3800 average sessions across 351 personas and fine-grained profiles for each persona. LiveChat is automatically constructed by processing numerous live videos on the Internet and naturally falls within the scope of multi-party conversations, where the issues of Who says What to Whom should be considered. Therefore, we target two critical tasks of response modeling and addressee recognition and propose retrieval-based baselines grounded on advanced techniques. Experimental results have validated the positive effects of leveraging persona profiles and larger average sessions per persona. In addition, we also benchmark the transferability of advanced generation-based models on LiveChat and pose some future directions for current challenges.","PeriodicalId":352845,"journal":{"name":"Annual Meeting of the Association for Computational Linguistics","volume":"344 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123351644","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信