{"title":"The Lexico-Semantic Pattern Extraction Automation Based on the Analysis of Text Corpora","authors":"Vladislav A. Borovin, V. Lanin, L. Lyadova","doi":"10.1109/AICT52784.2021.9620334","DOIUrl":null,"url":null,"abstract":"The need for using English at Russian universities has increased. It makes the ability to write good quality academic texts a necessary skill. Despite the existence of various types of software which can check grammar and/or style of a text, there is no software focusing on linguistic characteristics of academic texts. The academic community accumulated knowledge is to be used to develop the software that is able to assess an academic text against a set of criteria, i.e. academic discourse markers, selected from academic style guides, handbooks and research articles. At the basis of the proposed approach is creating a repository of patterns which are used to extract the academic discourse markers. To build sufficiently accurate and most suitable patterns, it is necessary to analyze a corpus of scientific publications, which is a time-consuming task. The software for the lexico-semantic pattern extraction automation based on the analysis of text corpora is described. 
The results of experiments with developed software are presented.","PeriodicalId":150606,"journal":{"name":"2021 IEEE 15th International Conference on Application of Information and Communication Technologies (AICT)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE 15th International Conference on Application of Information and Communication Technologies (AICT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/AICT52784.2021.9620334","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
The need to use English at Russian universities has increased, which makes the ability to write good-quality academic texts a necessary skill. Although various types of software can check the grammar and/or style of a text, none focuses on the linguistic characteristics of academic texts. The knowledge accumulated by the academic community can be used to develop software that assesses an academic text against a set of criteria, i.e. academic discourse markers selected from academic style guides, handbooks and research articles. The proposed approach is based on creating a repository of patterns that are used to extract these academic discourse markers. Building sufficiently accurate and suitable patterns requires analyzing a corpus of scientific publications, which is a time-consuming task. The paper describes software that automates the extraction of lexico-semantic patterns based on the analysis of text corpora, and presents the results of experiments with the developed software.
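The idea of a pattern repository used to extract academic discourse markers can be illustrated with a minimal sketch. It is not the authors' implementation: plain regular expressions stand in for the lexico-semantic patterns, and the names `PATTERNS`, `extract_markers` and `count_by_category`, as well as the marker categories and expressions, are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical pattern repository: marker category -> regular expression.
# Categories and expressions are illustrative assumptions, not taken from the paper.
PATTERNS = {
    "contrast": re.compile(r"\b(however|nevertheless|in contrast)\b", re.IGNORECASE),
    "hedging": re.compile(r"\b(may|might|appears? to|it is likely that)\b", re.IGNORECASE),
    "result": re.compile(r"\b(therefore|thus|as a result)\b", re.IGNORECASE),
}


def extract_markers(text):
    """Return (category, matched text, start offset) for every pattern match,
    ordered by position in the text."""
    found = []
    for category, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((category, match.group(0), match.start()))
    return sorted(found, key=lambda item: item[2])


def count_by_category(text):
    """Count how many markers of each category occur in the text."""
    return Counter(category for category, _, _ in extract_markers(text))


if __name__ == "__main__":
    sample = "However, the results may vary. Thus, further work is needed."
    for category, marker, offset in extract_markers(sample):
        print(f"{offset:3d}  {category:9s} {marker}")
    print(count_by_category(sample))
```

A real repository would presumably hold richer patterns (for example, ones that refer to lemmas or parts of speech) derived from corpus analysis; the dictionary of compiled expressions here only shows the general shape of matching a text against stored patterns.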