AUTOMATIC TEXT SUMMARIZATION: PROBLEMS AND PERSPECTIVES

Naukovì zapiski Nacìonalʹnogo unìversitetu «Ostrozʹka akademìâ». Serìâ «Fìlologìâ» Pub Date : 2023-06-22 DOI:10.25264/2519-2558-2023-17(85)-96-101

Tetiana Ugryn



Abstract

This paper examines automatic text summarization (AS), the linguistic problems it raises and ways to overcome them, and the prospects for using certain natural language processing programs. The author carries out a comparative analysis of two AS programs, MSWord2003 and Pertinence Summarizer, applied to literary, journalistic and scientific texts. The comparative methodology makes it possible not only to single out the peculiarities and limitations of each program, but also to draw general conclusions about the problems inherent in automatic summarization. The analysis of the source texts and the AS results focuses on the correlation between text genre and the process and result of AS. It does not take into account other factors that influence summary quality, such as the length of the original text, its language or its subject. The primary hypothesis of the study was that the quality of automatic summarization of a text depends directly on the genre of that text. The results confirmed this hypothesis and highlighted the interdependence between the level of formalism in a text, which can be explained by its genre, and the pertinence of the summary. The research showed that both AS programs rely primarily on morphological and, to a lesser extent, morphosyntactic analysis of the source text. Furthermore, the processing of implicit information in the text, particularly at the semantic and pragmatic levels, still appears to be unresolved. One possible way to overcome this problem is dynamic summarization, which requires greater involvement of the program user in the automatic summarization process.


