AUTOMATIC TEXT SUMMARIZATION: PROBLEMS AND PERSPECTIVES

Tetiana Ugryn
Journal: Naukovì zapiski Nacìonalʹnogo unìversitetu «Ostrozʹka akademìâ». Serìâ «Fìlologìâ»
Published: 2023-06-22
DOI: 10.25264/2519-2558-2023-17(85)-96-101 (https://doi.org/10.25264/2519-2558-2023-17(85)-96-101)

Abstract

This paper examines automatic text summarization (AS), the linguistic problems it raises and ways to overcome them, and the prospects for using natural language processing software. The author compares two AS programs, MS Word 2003 and Pertinence Summarizer, on literary, journalistic, and scientific texts. The comparative method makes it possible not only to identify the particular features and limitations of each program, but also to draw general conclusions about the problems inherent in automatic summarization. The analysis of the source texts and the resulting summaries focuses on the correlation between text genre and the process and outcome of AS. It does not take into account other factors that influence summary quality, such as the length, language, or subject of the original text. The primary hypothesis of the study was that the quality of an automatic summary depends directly on the genre of the source text. The results confirm this hypothesis and highlight the interdependence between the degree of formalism in a text, which can be explained by its genre, and the pertinence of the summary. The study showed that both AS programs rely primarily on morphological analysis of the source text and, to a lesser extent, on morphosyntactic analysis. Moreover, the processing of implicit information in the text, particularly at the semantic and pragmatic levels, remains unresolved. One possible way to overcome this problem is dynamic summarization, which requires greater involvement of the program user in the summarization process.
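
To illustrate what a summarizer that relies mainly on surface word forms can look like, here is a minimal Python sketch of a frequency-based extractive summarizer. It is not the algorithm of either program compared in the paper, whose internals the abstract does not describe. The function names (`split_sentences`, `tokenize`, `summarize`), the tiny stop-word list, and the scoring rule are all assumptions chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "it", "on", "for", "was"}


def split_sentences(text):
    """Split text on ., ! or ? followed by whitespace (a crude heuristic)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Lowercase the sentence and return its non-stop-word surface forms."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return [w for w in words if w not in STOP_WORDS]


def summarize(text, max_sentences=1):
    """Return the highest-scoring sentences, kept in their original order.

    A sentence's score is the average document-wide frequency of its
    content words. No syntax, semantics, or pragmatics is involved.
    """
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    # Rank sentence indices by score, then restore document order.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])
    return [sentences[i] for i in chosen]
```

For the input "Cats sleep a lot. Cats and dogs are pets. The weather was mild.", `summarize(text, max_sentences=1)` selects the first sentence, because the repeated word "cats" raises its average score. Since the score depends only on repeated word forms, a sketch like this has no access to implicit semantic or pragmatic information, which is the kind of limitation the abstract reports.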