Sawittree Jumpathong, T. Theeramunkong, T. Supnithi, M. Okumura
{"title":"A Performance Analysis of Deep-Learning-Based Thai News Abstractive Summarization: Word Positions and Document Length","authors":"Sawittree Jumpathong, T. Theeramunkong, T. Supnithi, M. Okumura","doi":"10.1109/ICBIR54589.2022.9786413","DOIUrl":null,"url":null,"abstract":"This paper presents a performance analysis of deep-learning-based Thai news abstractive summarization. The analysis focuses on the position of the words in the original document that are generated into the summary. Also, the analysis includes the behavior of word generation of the system. Moreover, we analyse how the document length affects the performance of the models regarding word positions of the original document. The result of the experiment shows that the models generated the output summary by generating most words from the beginning part more than those from the reference summary about 1.79 times on the TR testing dataset and about 2.03 times on the TPBS testing dataset. Additionally, the models occasionally generated words that do not exist in the original document about 1.68% of word number of the summary on the TR testing dataset and about 0.88% of word number of the summary on the TBPS testing dataset. According to the result, it is found that the models generated words in the system summary is not consistent with words in the reference summary. 
In the document length, it is found that the models can summarize a short document better than a long document.","PeriodicalId":216904,"journal":{"name":"2022 7th International Conference on Business and Industrial Research (ICBIR)","volume":"88 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 7th International Conference on Business and Industrial Research (ICBIR)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICBIR54589.2022.9786413","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1
Abstract
This paper presents a performance analysis of deep-learning-based Thai news abstractive summarization. The analysis focuses on where in the original document the words that appear in the generated summary come from, and on the word-generation behavior of the system. We also analyse how document length affects model performance with respect to word positions in the original document. The experimental results show that the models built their output summaries mostly from words in the beginning part of the document, drawing on that part about 1.79 times more than the reference summaries did on the TR test set and about 2.03 times more on the TPBS test set. Additionally, the models occasionally generated words that do not exist in the original document, accounting for about 1.68% of the summary words on the TR test set and about 0.88% on the TPBS test set. These results indicate that the words the models generate in the system summaries are not consistent with the words in the reference summaries. Regarding document length, the models summarize short documents better than long documents.
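The two quantities the abstract reports can be illustrated with a small sketch: for each summary token, find where it first occurs in the source document, then measure (a) the share of summary tokens drawn from the leading portion of the source and (b) the share of summary tokens absent from the source altogether. This is a minimal illustration of the kind of position analysis described, not the authors' code; the function name, the `head_ratio` cutoff, and the token-level matching are all assumptions.

```python
# Hypothetical sketch (not the paper's implementation) of a word-position
# profile for abstractive summaries: classify each summary token by where
# it first appears in the source document.

def position_profile(source_tokens, summary_tokens, head_ratio=0.25):
    """Return (head_fraction, novel_fraction) for a summary.

    head_fraction  -- share of summary tokens whose first occurrence falls
                      in the leading `head_ratio` portion of the source
    novel_fraction -- share of summary tokens not found in the source at all
    """
    # Record the first position of every source token.
    first_pos = {}
    for i, tok in enumerate(source_tokens):
        first_pos.setdefault(tok, i)

    head_cut = int(len(source_tokens) * head_ratio)
    head = novel = 0
    for tok in summary_tokens:
        if tok not in first_pos:
            novel += 1          # word does not exist in the original document
        elif first_pos[tok] < head_cut:
            head += 1           # word comes from the beginning part

    n = len(summary_tokens)
    return head / n, novel / n

# Toy example (whitespace tokenization stands in for Thai word segmentation):
src = "the cabinet approved a new budget for rural schools today".split()
summ = "cabinet approved budget increase".split()
head_frac, novel_frac = position_profile(src, summ)
```

Comparing these fractions between system and reference summaries, as the paper does per test set, shows how strongly a model leans on the beginning of the document.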