Fine Tuning an AraT5 Transformer for Arabic Abstractive Summarization
Yasmin Einieh, Amal Almansour, A. Jamal
2022 14th International Conference on Computational Intelligence and Communication Networks (CICN), 2022-12-04
DOI: 10.1109/CICN56167.2022.10008272
Creating an abstractive summary of a document by rephrasing its most important sentences is a challenging yet crucial task in natural language processing. The field has seen remarkable progress with deep learning techniques, especially with the emergence of pre-trained models that achieve state-of-the-art results by first training on very large corpora and then fine-tuning on specific tasks. In this paper, we use the T5 model, which has achieved results regarded as the best across a range of natural language processing tasks. AraT5 is its recently released Arabic-language version, and we fine-tuned it on a dataset of 267,000 Arabic articles. The model was evaluated with ROUGE-1, ROUGE-2, ROUGE-L, and BLEU, scoring 0.494, 0.339, 0.469, and 0.4224, respectively. In addition, the fine-tuned AraT5 model outperforms other state-of-the-art sequence-to-sequence approaches.
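To illustrate the evaluation metrics mentioned above: ROUGE-1 is the F1 score of unigram overlap between a generated summary and its reference. A minimal sketch in plain Python (whitespace tokenization is an assumption here; the paper presumably uses a dedicated ROUGE implementation with proper tokenization):

```python
from collections import Counter

def rouge_1_f1(reference: str, candidate: str) -> float:
    """F1 of unigram overlap between a candidate summary and its reference."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    # Clipped overlap: each candidate token counts at most as often
    # as it appears in the reference.
    overlap = sum(min(count, ref[word]) for word, count in cand.items())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

ROUGE-2 and ROUGE-L follow the same pattern over bigrams and longest common subsequences, respectively, while BLEU is precision-oriented over clipped n-grams with a brevity penalty.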