Algorithms and software for verification of scientific and technical text documents

Hlukhov Valerii S., Sydorko Dmytro S.
Journal: Applied aspects of information technologies
Published: 2023-09-30
DOI: 10.15276/aait.06.2023.21 (https://doi.org/10.15276/aait.06.2023.21)

Abstract

This work addresses the problem of verifying that the design (formatting) of scientific and technical documents complies with the requirements of regulatory documents (the document verification problem). The check is based on analyzing the Word text editor styles used to format the paragraphs of the document under study. For each element of the document (headings, annotations, main text, figures, figure captions, list of references, and others), a reference style was developed. Together, these styles form a set of allowed styles. There can be many such sets; each edition (publication) has its own. Only the administrator has access to the sets and can create new styles and new sets and edit individual styles and sets. Because of the peculiarities of style parsing, the document is treated as a combination of headers and footers and the document body. Verification algorithms were developed for this structure: an algorithm for analyzing headers and footers, an algorithm for analyzing paragraphs of the main text, and an algorithm for updating style settings by the administrator. The algorithms were implemented in software using .NET, WPF, and DocumentFormat.OpenXml. DocumentFormat.OpenXml makes it possible to analyze styles in .doc/.docx documents; the developed program accepts .doc or .docx files as input and analyzes them for compliance with the specified styles. The analysis result is returned in .txt or .doc/.docx format and indicates the detected deviations from the standards. The .txt file lists the deviations found, while in .doc/.docx files the deviations are recorded as comments on the original text. The program simplifies document checking: it allows all deviations from the standards to be identified and reduces the time and resources spent on checking. .NET and WPF were used to develop the user interface. The program was tested on the explanatory notes of real bachelor's and master's qualification theses. The measured style analysis time does not exceed 3 seconds. The program can be useful for automating document checks and for ensuring quality and compliance with the design standards of scientific and technical documentation and publications, above all in the educational process, for checking the formatting of bachelor's and master's qualification works and various reports.
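The paragraph-analysis step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, which uses .NET and DocumentFormat.OpenXml; Python's standard library is swapped in here for brevity. The sketch assumes the body of a .docx file (the word/document.xml part) is already available as a string, and it checks only each paragraph's style ID against a set of allowed IDs, not the properties of the style itself. The names `ALLOWED_STYLES`, `find_style_deviations`, and `SAMPLE`, as well as the style IDs, are hypothetical.

```python
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by .docx body markup.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}

# Hypothetical set of allowed style IDs; in the described system
# such sets are maintained by the administrator.
ALLOWED_STYLES = {"Heading1", "BodyText", "Caption"}


def find_style_deviations(document_xml, allowed_styles):
    """Return one report line per paragraph whose style ID is not allowed."""
    root = ET.fromstring(document_xml)
    deviations = []
    for index, paragraph in enumerate(root.iterfind(".//w:p", NS), start=1):
        style = paragraph.find("w:pPr/w:pStyle", NS)
        # A paragraph without w:pStyle has no explicit style ID (None here).
        style_id = style.get(f"{{{W_NS}}}val") if style is not None else None
        if style_id not in allowed_styles:
            text = "".join(t.text or "" for t in paragraph.iterfind(".//w:t", NS))
            deviations.append(
                f"Paragraph {index}: style {style_id!r} is not allowed: {text[:40]!r}"
            )
    return deviations


# Tiny hand-written document body used only for illustration.
SAMPLE = f"""<w:document xmlns:w="{W_NS}"><w:body>
<w:p><w:pPr><w:pStyle w:val="Heading1"/></w:pPr><w:r><w:t>Introduction</w:t></w:r></w:p>
<w:p><w:pPr><w:pStyle w:val="MyCustom"/></w:pPr><w:r><w:t>Some text</w:t></w:r></w:p>
<w:p><w:r><w:t>Unstyled text</w:t></w:r></w:p>
</w:body></w:document>"""

if __name__ == "__main__":
    for line in find_style_deviations(SAMPLE, ALLOWED_STYLES):
        print(line)
```

In this sketch the returned lines play the role of the .txt deviation list; writing the deviations back into the document as comments, as the abstract describes, would need additional handling of the comments part of the file and is not shown.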