Huanling Tang, Ruiquan Li, Wenhao Duan, Quansheng Dou, Mingyu Lu
International Journal of Machine Learning and Cybernetics. Published 2024-07-04. DOI: https://doi.org/10.1007/s13042-024-02263-8
A novel abstractive summarization model based on topic-aware and contrastive learning
The majority of abstractive summarization models are based on the Sequence-to-Sequence (Seq2Seq) architecture. These models are able to capture syntactic and contextual information between words. However, Seq2Seq-based summarization models tend to overlook global semantic information. Moreover, there is an inconsistency between the objective function and the evaluation metrics of these models. To address these limitations, this paper proposes a novel model named ASTCL. It integrates a neural topic model into the Seq2Seq framework to capture the text’s global semantic information and guide summary generation. Additionally, it incorporates contrastive learning to mitigate the discrepancy between the objective loss and the evaluation metrics by scoring multiple candidate summaries. Experimental results on the CNN/DM, XSum, and NYT datasets show that ASTCL outperforms other generic models on the summarization task.
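As a rough illustration of what scoring multiple candidate summaries with a contrastive objective can look like, the sketch below implements a generic pairwise margin ranking loss in plain Python. It is not ASTCL's actual loss, which the abstract does not specify. The function name `candidate_ranking_loss`, the `margin` parameter, and the example scores are all assumptions made for illustration.

```python
from typing import Sequence


def candidate_ranking_loss(
    model_scores: Sequence[float],
    metric_scores: Sequence[float],
    margin: float = 0.01,
) -> float:
    """Generic pairwise margin ranking loss over candidate summaries.

    Hypothetical helper, not the paper's formulation. model_scores are the
    scores the summarizer assigns to each candidate; metric_scores are the
    evaluation-metric values (for example ROUGE) of the same candidates.
    """
    if len(model_scores) != len(metric_scores):
        raise ValueError("need one model score and one metric score per candidate")

    # Order candidate indices from best to worst according to the metric.
    order = sorted(
        range(len(metric_scores)), key=lambda i: metric_scores[i], reverse=True
    )

    loss = 0.0
    for rank_i, i in enumerate(order):
        for rank_j in range(rank_i + 1, len(order)):
            j = order[rank_j]
            # The metric prefers candidate i over j, so the model should score
            # i higher than j by a gap that grows with their rank distance.
            required_gap = margin * (rank_j - rank_i)
            loss += max(0.0, model_scores[j] - model_scores[i] + required_gap)
    return loss


# Made-up scores: the model's ranking agrees with the metric in the first
# call and is fully reversed in the second.
agreeing = candidate_ranking_loss([0.9, 0.5, 0.1], [0.4, 0.3, 0.2])
reversed_order = candidate_ranking_loss([0.1, 0.5, 0.9], [0.4, 0.3, 0.2])
```

The loss is zero when the model ranks candidates in the same order as the metric with sufficient gaps, and grows as the two rankings diverge, which is one way a training objective can be pulled closer to an evaluation metric.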
About the journal:
Cybernetics is concerned with describing complex interactions and interrelationships between systems which are omnipresent in our daily life. Machine Learning discovers fundamental functional relationships between variables and ensembles of variables in systems. The merging of the disciplines of Machine Learning and Cybernetics is aimed at the discovery of various forms of interaction between systems through diverse mechanisms of learning from data.
The International Journal of Machine Learning and Cybernetics (IJMLC) focuses on the key research problems emerging at the junction of machine learning and cybernetics and serves as a broad forum for rapid dissemination of the latest advancements in the area. The emphasis of IJMLC is on the hybrid development of machine learning and cybernetics schemes inspired by different contributing disciplines such as engineering, mathematics, cognitive sciences, and applications. New ideas, design alternatives, implementations and case studies pertaining to all the aspects of machine learning and cybernetics fall within the scope of the IJMLC.
Key research areas to be covered by the journal include:
Machine Learning for modeling interactions between systems
Pattern Recognition technology to support discovery of system-environment interaction
Control of system-environment interactions
Biochemical interaction in biological and biologically-inspired systems
Learning for improvement of communication schemes between systems