Conference on Empirical Methods in Natural Language Processing: Latest Papers, Page 9

Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model

Conference on Empirical Methods in Natural Language Processing Pub Date: 2023-10-14 DOI: 10.18653/v1/2023.emnlp-main.721

H. Deng, Colin Raffel

Citations: 0

Learning Language-guided Adaptive Hyper-modality Representation for Multimodal Sentiment Analysis

Conference on Empirical Methods in Natural Language Processing Pub Date: 2023-10-09 DOI: 10.18653/v1/2023.emnlp-main.49

Haoyu Zhang, Yu Wang, Guanghao Yin, Kejun Liu, Yuanyuan Liu, Tianshu Yu

Citations: 0

Task-Adaptive Tokenization: Enhancing Long-Form Text Generation Efficacy in Mental Health and Beyond

Conference on Empirical Methods in Natural Language Processing Pub Date: 2023-10-09 DOI: 10.18653/v1/2023.emnlp-main.944

Siyang Liu, Naihao Deng, Sahand Sabour, Yilin Jia, Minlie Huang, Rada Mihalcea

Citations: 0

DocumentNet: Bridging the Data Gap in Document Pre-training

Conference on Empirical Methods in Natural Language Processing Pub Date: 2023-06-15 DOI: 10.18653/v1/2023.emnlp-industry.66

Lijun Yu, Jin Miao, Xiaoyu Sun, Jiayi Chen, A. Hauptmann, H. Dai, Wei Wei

Abstract: Document understanding tasks, in particular, Visually-rich Document Entity Retrieval (VDER), have gained significant attention in recent years thanks to their broad applications in enterprise AI. However, publicly available data have been scarce for these tasks due to strict privacy constraints and high annotation costs. To make things worse, the non-overlapping entity spaces from different datasets hinder the knowledge transfer between document types. In this paper, we propose a method to collect massive-scale and weakly labeled data from the web to benefit the training of VDER models. The collected dataset, named DocumentNet, does not depend on specific document types or entity sets, making it universally applicable to all VDER tasks. The current DocumentNet consists of 30M documents spanning nearly 400 document types organized in a four-level ontology. Experiments on a set of broadly adopted VDER tasks show significant improvements when DocumentNet is incorporated into the pre-training for both classic and few-shot learning settings. With the recent emergence of large language models (LLMs), DocumentNet provides a large data source to extend their multi-modal capabilities for VDER.

Pages: 707-722

Citations: 0

Exploring the Impact of Model Scaling on Parameter-Efficient Tuning

Conference on Empirical Methods in Natural Language Processing Pub Date: 2023-06-04 DOI: 10.18653/v1/2023.emnlp-main.931

Yusheng Su, Chi-Min Chan, Jiali Cheng, Yujia Qin, Yankai Lin, Shengding Hu, Zonghan Yang, Ning Ding, Zhiyuan Liu, Maosong Sun

Abstract: Parameter-efficient tuning (PET) methods can effectively drive extremely large pre-trained language models (PLMs) by training only minimal parameters. Different PET methods utilize different manually designed tunable modules. In small PLMs, there are usually noticeable performance differences among PET methods. Nevertheless, as the model scale increases, the performance differences become marginal. Hence, we hypothesize that model scaling mitigates the impact of design differences on PET methods. To investigate this hypothesis, we introduce a more flexible PET method called Arbitrary PET (APET) method. The APET method is compatible with a tunable module, which consists of any number of parameters distributed in arbitrary positions. Then, we utilize it and conduct experiments on 11 NLP tasks across 3 representative PLMs. Our investigations reveal that model scaling (1) mitigates the effects of the positions of tunable parameters on performance, and (2) enables tuning methods to achieve performance comparable to full-parameter fine-tuning by optimizing fewer tunable parameters. Intriguingly, we also observe that tuning methods optimize the similar number of tunable parameters to exceed random guess performance on different tasks. We collectively discuss this phenomenon and the two aforementioned findings from an optimization perspective to understand the underlying mechanisms. These conclusions enhance our understanding of the impact of model scaling on PET and assist in designing more effective and efficient PET methods for PLMs of different scales. The source code can be obtained from this GitHub repository: https://github.com/yushengsu-thu/PET_Scaling.

Pages: 15062-15078

Citations: 0

Evaluating Emotion Arcs Across Languages: Bridging the Global Divide in Sentiment Analysis

Conference on Empirical Methods in Natural Language Processing Pub Date: 2023-06-03 DOI: 10.18653/v1/2023.findings-emnlp.271

D. Teodorescu, Saif M. Mohammad

Abstract: Emotion arcs capture how an individual (or a population) feels over time. They are widely used in industry and research; however, there is little work on evaluating the automatically generated arcs. This is because of the difficulty of establishing the true (gold) emotion arc. Our work, for the first time, systematically and quantitatively evaluates automatically generated emotion arcs. We also compare two common ways of generating emotion arcs: Machine-Learning (ML) models and Lexicon-Only (LexO) methods. By running experiments on 18 diverse datasets in 9 languages, we show that despite being markedly poor at instance level emotion classification, LexO methods are highly accurate at generating emotion arcs when aggregating information from hundreds of instances. We also show, through experiments on six indigenous African languages, as well as Arabic, and Spanish, that automatic translations of English emotion lexicons can be used to generate high-quality emotion arcs in less-resource languages. This opens up avenues for work on emotions in languages from around the world; which is crucial for commerce, public policy, and health research in service of speakers often left behind. Code and resources: https://github.com/dteodore/EmotionArcs

Pages: 4124-4137

Citations: 0