Authors: Ala Alam Falaki, R. Gras
Venue: 2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA)
Published: 2022-12-01
DOI: https://doi.org/10.1109/ICMLA55696.2022.00030
A Robust Approach to Fine-tune Pre-trained Transformer-based models for Text Summarization through Latent Space Compression
We propose a technique to reduce the number of decoder parameters in a sequence-to-sequence (seq2seq) architecture for automatic text summarization. The approach trains an Autoencoder (AE) on top of a pre-trained encoder's output to reduce its embedding dimension, which in turn significantly reduces the size of the summarizer's decoder. Two experiments were performed to validate the idea: a custom seq2seq architecture with various pre-trained encoders, and an integration of the approach into an encoder-decoder model (BART) for text summarization. Both experiments showed promising ROUGE scores. The most notable outcome, however, is a 54% decrease in inference time and a 57% drop in GPU memory usage during fine-tuning, with minimal quality loss (4.5% in R1 score). This significantly reduces the hardware required to fine-tune large-scale pre-trained models. We also show that the approach can be combined with other network size reduction techniques (e.g., distillation) to further reduce the parameter count of any encoder-decoder model. The implementation and checkpoints are available on GitHub.
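To illustrate why shrinking the encoder's embedding dimension shrinks the decoder, the following is a rough parameter-count sketch in plain Python. It is not the authors' implementation: the layer formula is the standard Transformer decoder layer (self-attention, cross-attention, feed-forward, three layer norms), and the dimensions (768 compressed to 384), the six-layer depth, and the single linear projection standing in for the AE's encoder half are all assumptions chosen for illustration. Embedding and output layers are ignored.

```python
def decoder_layer_params(d_model: int, ffn_mult: int = 4) -> int:
    """Approximate parameter count of one standard Transformer decoder layer."""
    d_ff = ffn_mult * d_model
    # Q, K, V and output projections, each d_model x d_model plus a bias
    attention = 4 * (d_model * d_model + d_model)
    # Two linear layers: d_model -> d_ff -> d_model, with biases
    feed_forward = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)
    # Scale and shift vectors of one layer norm
    layer_norm = 2 * d_model
    # Self-attention + cross-attention + feed-forward + three layer norms
    return 2 * attention + feed_forward + 3 * layer_norm


def decoder_params(d_model: int, num_layers: int) -> int:
    """Parameter count of a stack of identical decoder layers."""
    return num_layers * decoder_layer_params(d_model)


def compressor_params(d_in: int, d_latent: int) -> int:
    """One linear projection, a hypothetical stand-in for the AE's encoder half."""
    return d_in * d_latent + d_latent


# Hypothetical sizes, not taken from the paper.
ENCODER_DIM = 768
LATENT_DIM = 384
NUM_LAYERS = 6

full = decoder_params(ENCODER_DIM, NUM_LAYERS)
small = decoder_params(LATENT_DIM, NUM_LAYERS) + compressor_params(ENCODER_DIM, LATENT_DIM)

if __name__ == "__main__":
    print(f"decoder at d={ENCODER_DIM}: {full:,} parameters")
    print(f"decoder at d={LATENT_DIM} plus compressor: {small:,} parameters")
    print(f"ratio: {small / full:.3f}")
```

Because most decoder weights scale with the square of the model dimension, halving the dimension leaves roughly a quarter of the decoder parameters under these assumptions, even after counting the added projection. The savings reported in the abstract come from the authors' experiments and are not reproduced by this arithmetic.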