Nabil Burmani, H. Alami, Said Lafkiar, Mohamed Zouitni, Mohammed Taleb, Noureddine En Nahnahi
Graph based method for Arabic text summarization
2022 International Conference on Intelligent Systems and Computer Vision (ISCV), published 2022-05-18
DOI: 10.1109/ISCV54655.2022.9806127 (https://doi.org/10.1109/ISCV54655.2022.9806127)
The volume of Arabic text is growing rapidly, creating a need to condense it so that it is easier to use while retaining only the essential content of the original. To this end, natural language processing researchers are developing extractive and abstractive summarization tools. In this work, we explore an extractive approach to generating summaries of single Arabic documents. We focus on graph-based methods for identifying and extracting the most important sentences, combining a variety of text representations (such as TF-IDF, fastText, and Word2Vec), similarity measures, and graph ranking methods. We test our system on the EASC (Essex Arabic Summaries Corpus) and evaluate it with the ROUGE metric. The results show that the combination of TF-IDF representation, cosine similarity, and PageRank ranking performs well and can generate high-quality summaries.
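
The pipeline the abstract describes (TF-IDF sentence vectors, cosine similarity as edge weights, PageRank for ranking) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the smoothed IDF formula, the damping factor of 0.85, the fixed iteration count, and the bare regex tokenizer are all assumptions made for illustration, and Arabic-specific preprocessing such as normalization, stemming, and stop-word removal is left out.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    # Plain word tokenization (hypothetical helper); Arabic normalization,
    # stemming, and stop-word removal are deliberately omitted.
    return re.findall(r"\w+", sentence.lower())


def tfidf_vectors(sentences):
    """Treat each sentence as a document and return sparse TF-IDF dicts."""
    tokenized = [tokenize(s) for s in sentences]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed IDF: an assumption, the source does not give the weighting.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def pagerank(weights, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration on a symmetric weight matrix.

    Sentences with no similar neighbours only receive the teleport term;
    their rank mass is not redistributed in this simplified version.
    """
    n = len(weights)
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping
            * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(sentences, k=2):
    """Return the k highest-ranked sentences in their original order."""
    if not sentences:
        return []
    vecs = tfidf_vectors(sentences)
    n = len(sentences)
    # Edge weight = cosine similarity; no self-loops.
    weights = [
        [cosine(vecs[i], vecs[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    scores = pagerank(weights)
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Calling `summarize(sentences, k)` on a list of already-segmented sentences returns the `k` most central ones. Swapping `tfidf_vectors` for averaged fastText or Word2Vec embeddings would give the other representations the abstract mentions, with the graph construction and ranking steps unchanged.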