Shu Zhao;Yuanfang Cheng;Yanping Zhang;Jie Chen;Zhen Duan;Yang Sun;Xinyuan Wang
{"title":"HyFit:混合微调与不同采样的抽象总结","authors":"Shu Zhao;Yuanfang Cheng;Yanping Zhang;Jie Chen;Zhen Duan;Yang Sun;Xinyuan Wang","doi":"10.1109/TBDATA.2024.3387311","DOIUrl":null,"url":null,"abstract":"Abstractive summarization has made significant progress in recent years, which aims to generate a concise and coherent summary that contains the most important facts from the source document. Current fine-tuning approaches based on pre-training models typically rely on autoregressive and maximum likelihood estimation, which may result in inconsistent historical distributions generated during the training and inference stages, i.e., exposure bias problem. To alleviate this problem, we propose a hybrid fine-tuning model(HyFit), which combines contrastive learning and reinforcement learning in a diverse sampling space. Firstly, we introduce reparameterization and probability-based sampling methods to generate a set of summary candidates called candidates bank, which improves the diversity and quality of the decoding sampling space and incorporates the potential for uncertainty. Secondly, hybrid fine-tuning with sampled candidates bank, upweighting confident summaries and downweighting unconfident ones. Experiments demonstrate that HyFit significantly outperforms the state-of-the-art models on SAMSum and DialogSum. HyFit also shows good performance on low-resource summarization, on DialogSum dataset, using only approximate 8% of the examples exceed the performance of the base model trained on all examples.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 3","pages":"1054-1065"},"PeriodicalIF":7.5000,"publicationDate":"2024-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"HyFit: Hybrid Fine-Tuning With Diverse Sampling for Abstractive Summarization\",\"authors\":\"Shu Zhao;Yuanfang Cheng;Yanping Zhang;Jie Chen;Zhen Duan;Yang Sun;Xinyuan Wang\",\"doi\":\"10.1109/TBDATA.2024.3387311\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Abstractive summarization has made significant progress in recent years, which aims to generate a concise and coherent summary that contains the most important facts from the source document. Current fine-tuning approaches based on pre-training models typically rely on autoregressive and maximum likelihood estimation, which may result in inconsistent historical distributions generated during the training and inference stages, i.e., exposure bias problem. To alleviate this problem, we propose a hybrid fine-tuning model(HyFit), which combines contrastive learning and reinforcement learning in a diverse sampling space. Firstly, we introduce reparameterization and probability-based sampling methods to generate a set of summary candidates called candidates bank, which improves the diversity and quality of the decoding sampling space and incorporates the potential for uncertainty. Secondly, hybrid fine-tuning with sampled candidates bank, upweighting confident summaries and downweighting unconfident ones. Experiments demonstrate that HyFit significantly outperforms the state-of-the-art models on SAMSum and DialogSum. 
HyFit also shows good performance on low-resource summarization, on DialogSum dataset, using only approximate 8% of the examples exceed the performance of the base model trained on all examples.\",\"PeriodicalId\":13106,\"journal\":{\"name\":\"IEEE Transactions on Big Data\",\"volume\":\"11 3\",\"pages\":\"1054-1065\"},\"PeriodicalIF\":7.5000,\"publicationDate\":\"2024-04-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Big Data\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10496256/\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Big Data","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10496256/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
HyFit: Hybrid Fine-Tuning With Diverse Sampling for Abstractive Summarization
Abstractive summarization, which aims to generate a concise and coherent summary containing the most important facts from a source document, has made significant progress in recent years. Current fine-tuning approaches based on pre-trained models typically rely on autoregressive decoding and maximum likelihood estimation, which can produce inconsistent history distributions between the training and inference stages, i.e., the exposure bias problem. To alleviate this problem, we propose a hybrid fine-tuning model (HyFit), which combines contrastive learning and reinforcement learning in a diverse sampling space. First, we introduce reparameterization and probability-based sampling methods to generate a set of summary candidates, called the candidates bank, which improves the diversity and quality of the decoding sampling space and accounts for uncertainty. Second, we perform hybrid fine-tuning on the sampled candidates bank, upweighting confident summaries and downweighting unconfident ones. Experiments demonstrate that HyFit significantly outperforms state-of-the-art models on SAMSum and DialogSum. HyFit also performs well on low-resource summarization: on the DialogSum dataset, it exceeds the performance of the base model trained on all examples while using only approximately 8% of them.
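The first step described in the abstract, building a diverse candidates bank, can be illustrated with a short sketch. The paper's exact reparameterization-based sampler is not reproduced here; as a stand-in, this sketch uses standard nucleus (top-p) sampling from Hugging Face Transformers. The model name and generation parameters are illustrative assumptions, not HyFit's published configuration.

```python
# A minimal sketch of building a "candidates bank" via probability-based
# sampling. Nucleus (top-p) sampling is used as a stand-in for the paper's
# reparameterization-based sampler; model choice and hyperparameters are
# illustrative assumptions.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "facebook/bart-large-xsum"  # hypothetical choice of summarizer
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

def build_candidates_bank(document: str, num_candidates: int = 8) -> list[str]:
    """Sample a diverse set of candidate summaries for one document."""
    inputs = tokenizer(document, return_tensors="pt", truncation=True)
    outputs = model.generate(
        **inputs,
        do_sample=True,          # stochastic decoding instead of greedy/beam
        top_p=0.95,              # nucleus sampling keeps only high-mass tokens
        temperature=1.0,
        num_return_sequences=num_candidates,
        max_new_tokens=128,
    )
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```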
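The second step, hybrid fine-tuning over the sampled candidates bank, can likewise be sketched. The loss below combines a BRIO-style pairwise ranking term (the contrastive component) with a reward-weighted likelihood term (a simple reinforcement-learning surrogate), using rewards as confidence weights to upweight confident summaries and downweight unconfident ones. The margin, mixing weight, and reward definition are assumptions for illustration, not HyFit's published formulation.

```python
# A minimal PyTorch sketch of a hybrid objective over a candidates bank:
# contrastive ranking plus a reward-weighted likelihood term. All weights
# and margins are illustrative assumptions.
import torch

def hybrid_loss(log_probs: torch.Tensor,
                rewards: torch.Tensor,
                margin: float = 0.01,
                alpha: float = 0.5) -> torch.Tensor:
    """log_probs: (n,) length-normalized log-likelihoods of candidates,
    sorted so index 0 is the highest-reward candidate.
    rewards: (n,) quality scores (e.g., ROUGE vs. the reference) in [0, 1],
    also serving as confidence weights."""
    n = log_probs.size(0)
    # Contrastive term: better candidates must outscore worse ones by a
    # margin that grows with rank distance (BRIO-style pairwise ranking).
    contrastive = log_probs.new_zeros(())
    for i in range(n):
        for j in range(i + 1, n):
            gap = margin * (j - i)
            contrastive = contrastive + torch.clamp(
                gap - (log_probs[i] - log_probs[j]), min=0.0)
    contrastive = contrastive / (n * (n - 1) / 2)
    # RL surrogate: raise the likelihood of high-reward (confident) summaries
    # and lower that of low-reward ones via an advantage over the mean reward.
    advantage = rewards - rewards.mean()
    rl = -(advantage * log_probs).mean()
    return alpha * contrastive + (1.0 - alpha) * rl
```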
Journal Introduction:
The IEEE Transactions on Big Data publishes peer-reviewed articles focusing on big data. These articles present innovative research ideas and application results across disciplines, including novel theories, algorithms, and applications. Research areas cover a wide range, such as big data analytics, visualization, curation, management, semantics, infrastructure, standards, performance analysis, intelligence extraction, scientific discovery, security, privacy, and legal issues specific to big data. The journal also prioritizes applications of big data in fields generating massive datasets.