Determining the Authorship of a Ukrainian-Language Literary Text by Means of Artificial Intelligence from Ultra-Short Excerpts

O. P. Ivanov, V. Shynkarenko, V. Skalozub, A. A. Kosolapov
Science and Transport Progress. Published 2023-06-06. DOI: 10.15802/stp2023/288289 (https://doi.org/10.15802/stp2023/288289)

Abstract

Purpose. The Bing intelligent search engine can serve as a means of determining the author of a Ukrainian-language text. Bing helps find information about a text fragment and its author, but the search results may be inaccurate or incomplete. The main purpose of the paper is to study how effectively state-of-the-art artificial intelligence tools establish the authorship of literary texts from ultra-short excerpts.

Methodology. Ten Ukrainian authors with a rich body of fiction reflecting various aspects of Ukrainian culture and history were selected, and random fragments of 3–7 words each were drawn from different works by these authors. An experiment was conducted to determine the authorship of 2,000 fragments.

Findings. Using the Python programming language and the skpy package, we developed software that sends questions to and receives answers from the Bing bot built into Microsoft Skype. The answers were checked for the name of the phrase's author and the title of the corresponding work. Ivan Franko had the highest percentage of answers in which the author's name was mentioned (65%), and Oleksandr Dovzhenko the lowest (23%). The answers were also analyzed by fragment length: as expected, the longer the fragment, the greater the likelihood of accurately identifying its authorship. Features of the author's style are manifested in 20–40% of short fragments. The remaining 60–80% may be commonly used language constructions that the author adopted from the external environment.

Originality. This work presents, for the first time, a method of checking the authorship of fragments of Ukrainian-language text using the AI-based Bing bot. A comparative analysis was performed, and experiments were conducted to determine the authorship of short fragments of 3–7 words. It has been established that even quite small text fragments bear features characteristic of the original style of an author of literary works.

Practical value. The study determined the extent to which experts in authorship attribution of natural-language texts can rely on existing state-of-the-art artificial intelligence tools combined with the extensive body of texts available on the Internet.
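
The fragment sampling and answer checking described in the abstract could look roughly like the following Python sketch. It is illustrative only and is not the authors' software: the function names, the whitespace tokenization, and the plain substring match for the author's name are all assumptions (Ukrainian names inflect by case, so a real check would likely need stem or variant matching). The step that sends each fragment to the Bing bot through skpy is omitted here.

```python
import random
from collections import defaultdict


def sample_fragment(text, rng, min_words=3, max_words=7):
    """Return a random run of min_words..max_words consecutive words.

    Hypothetical helper: words are split on whitespace, which is an
    assumption, not the tokenization used in the paper.
    """
    words = text.split()
    if len(words) < min_words:
        raise ValueError("text is shorter than the minimum fragment length")
    n = rng.randint(min_words, min(max_words, len(words)))
    start = rng.randint(0, len(words) - n)
    return " ".join(words[start:start + n])


def mentions(answer, name):
    """Case-insensitive substring check (a simplification, see lead-in)."""
    return name.casefold() in answer.casefold()


def mention_rate_by_length(records, author):
    """Share of answers naming the author, grouped by fragment length.

    records: iterable of (fragment, answer) pairs for a single author.
    Returns a dict {fragment length in words: rate between 0 and 1}.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for fragment, answer in records:
        n = len(fragment.split())
        totals[n] += 1
        hits[n] += mentions(answer, author)  # True counts as 1
    return {n: hits[n] / totals[n] for n in sorted(totals)}
```

Passing a seeded `random.Random` instance as `rng` keeps the drawn fragments reproducible between runs, which makes it easier to compare answers collected at different times.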