Amharic ATS - A Comparison Between Graph Based and Statistical Based Approach using Rouge Metric and Human Evaluation
Hannatu K. Ali, Luwam Major Kefali, Shantipriya Parida, S. Dash
2022 OPJU International Technology Conference on Emerging Technologies for Sustainable Development (OTCON), published 2023-02-08
DOI: 10.1109/OTCON56053.2023.10114029 (https://doi.org/10.1109/OTCON56053.2023.10114029)
Abstract
This paper presents a study of existing Automatic Text Summarization (ATS) models applied to Amharic, a widely spoken language in Ethiopia. With more than 40 million speakers worldwide, including the diaspora, it is important to have a mechanism for condensing long Amharic texts into short, understandable paragraphs. Previously implemented models have shown promising results, especially the TextRank algorithm, which is studied in this paper along with the TF-IDF and Cosine Similarity algorithms. The paper concentrates on evaluating machine-summarized Amharic texts against human summaries, which are likely to have more depth and context. The study compares human-generated summaries with machine-generated ones on the same text. The evaluation comprised human evaluation and the ROUGE metrics. In both cases, the results indicated that TextRank, a graph-based approach, produced the best summaries.
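
As an illustration of the techniques named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes sentences are already tokenized into lists of words (Amharic text would need its own tokenization and normalization, which is not shown). All function names (`tfidf_vectors`, `cosine`, `textrank`, `rouge1_recall`) and parameter values are hypothetical choices for this example. It weights words with a smoothed TF-IDF, links sentences by cosine similarity, ranks them with a TextRank-style power iteration, and scores a candidate summary with a simplified ROUGE-1 recall (plain unigram overlap, without the stemming or other options of full ROUGE toolkits).

```python
import math
from collections import Counter


def tfidf_vectors(sentences):
    """Build one sparse TF-IDF vector (a dict) per tokenized sentence."""
    n = len(sentences)
    # Document frequency: in how many sentences each term occurs.
    df = Counter(term for sent in sentences for term in set(sent))
    vectors = []
    for sent in sentences:
        tf = Counter(sent)
        # Smoothed IDF (an assumption of this sketch) keeps every weight positive.
        vectors.append({
            t: (c / len(sent)) * (math.log((1 + n) / (1 + df[t])) + 1.0)
            for t, c in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def textrank(sentences, damping=0.85, iterations=50):
    """Score sentences by power iteration over the sentence-similarity graph."""
    vectors = tfidf_vectors(sentences)
    n = len(sentences)
    # Edge weights: cosine similarity between distinct sentences.
    sim = [[cosine(vectors[i], vectors[j]) if i != j else 0.0
            for j in range(n)] for i in range(n)]
    out_weight = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        # Each sentence receives a share of its neighbours' scores,
        # proportional to the edge weight; isolated sentences pass nothing on.
        scores = [
            (1 - damping) / n + damping * sum(
                sim[j][i] / out_weight[j] * scores[j]
                for j in range(n) if out_weight[j] > 0)
            for i in range(n)
        ]
    return scores


def rouge1_recall(candidate, reference):
    """Fraction of reference unigrams that also appear in the candidate."""
    cand, ref = Counter(candidate), Counter(reference)
    overlap = sum(min(count, cand[t]) for t, count in ref.items())
    return overlap / sum(ref.values()) if ref else 0.0
```

To form an extractive summary under these assumptions, one would keep the top-scoring sentences from `textrank(sentences)` in their original order, then compare the resulting token list with a human-written reference using `rouge1_recall`.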