Methods and software for significant indicators determination of the natural language texts author profile

V. Shynkarenko, I.M. Demydovych
{"title":"Methods and software for significant indicators determination of the natural language texts author profile","authors":"V. Shynkarenko, I.M. Demydovych","doi":"10.15407/pp2023.03.022","DOIUrl":null,"url":null,"abstract":"Methods for the formation and optimization of author profiles are presented. The author profile is an image – a vector in a multidimensional space, which components are author’s texts measurements by a number of methods based on 4-grams, stemming, recurrence analysis and formal stochastic grammar. The author’s profile is a model of his language, including vocabulary, sentence syntax features. A comparative analysis of each of the methods effectiveness is carried out. By means of the genetic algorithm, a reduced profile of the author is formed. Insignificant indicators are excluded, which allows to reduce their number by 20%. The reduced author’s profile contains attributes that are significant for this author and is an effective attribution of a particular author.","PeriodicalId":508096,"journal":{"name":"PROBLEMS IN PROGRAMMING","volume":"243 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"PROBLEMS IN PROGRAMMING","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.15407/pp2023.03.022","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Methods for the formation and optimization of author profiles are presented. The author profile is an image – a vector in a multidimensional space, which components are author’s texts measurements by a number of methods based on 4-grams, stemming, recurrence analysis and formal stochastic grammar. The author’s profile is a model of his language, including vocabulary, sentence syntax features. A comparative analysis of each of the methods effectiveness is carried out. By means of the genetic algorithm, a reduced profile of the author is formed. Insignificant indicators are excluded, which allows to reduce their number by 20%. The reduced author’s profile contains attributes that are significant for this author and is an effective attribution of a particular author.
确定自然语言文本作者简介重要指标的方法和软件
本文介绍了作者简介的形成和优化方法。作者简介是一幅图像--多维空间中的一个矢量,其组成部分是通过基于 4-grams、词干、递归分析和形式随机语法的多种方法测量的作者文本。作者简介是作者的语言模型,包括词汇、句子语法特征。对每种方法的有效性都进行了比较分析。通过遗传算法,形成了一个精简的作者简介。不重要的指标被排除在外,从而使指标数量减少了 20%。缩减后的作者简介包含了对该作者具有重要意义的属性,是对特定作者的有效归属。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信