Information Potential of a Corpus of Scientific Texts
V. Komaritsa
Information and Innovations, published 2024-01-30
DOI: 10.31432/1994-2443-2023-18-4-21-37 (https://doi.org/10.31432/1994-2443-2023-18-4-21-37)
Abstract
The article examines publicly available text corpora on the internet, characterises them, and considers the potential of corpus linguistics for analysing the development of scientific trends, discourse, and changes in terminology. A dataset is presented that is based on a corpus of scientific articles from a petroleum transport trade journal and on the Google Books Corpus. The dataset makes it possible to examine changes in term usage frequencies from 1940 to 2019. The results of the term-frequency analyses are presented, and changes in the technology industry are compared with the development of key vocabulary. The results show that studies using data from corpora of scientific and technical texts have good potential for understanding trends in technological development and the dynamics of change in industry and terminology.
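The abstract does not describe how the term usage frequencies were computed. As an illustration only, the following minimal Python sketch shows one common way to obtain a relative frequency per year (occurrences per million tokens) from a dated corpus. The function names, the tokenisation rule, the per-million normalisation, and the sample sentences are all assumptions made for this example and are not taken from the article.

```python
import re
from collections import defaultdict

# Assumed tokenisation: lowercase alphabetic words, optionally hyphenated.
TOKEN_RE = re.compile(r"[a-z]+(?:-[a-z]+)*")


def tokenize(text):
    """Split text into lowercase word tokens (hypothetical, simplified rule)."""
    return TOKEN_RE.findall(text.lower())


def term_frequency_by_year(documents, term):
    """Return {year: occurrences of `term` per million tokens}.

    `documents` is an iterable of (year, text) pairs; `term` may contain
    one or more words and is matched as a contiguous token sequence.
    """
    term_tokens = tokenize(term)
    n = len(term_tokens)
    if n == 0:
        raise ValueError("term must contain at least one word")

    hits = defaultdict(int)    # term occurrences per year
    totals = defaultdict(int)  # total tokens per year
    for year, text in documents:
        tokens = tokenize(text)
        totals[year] += len(tokens)
        hits[year] += sum(
            1
            for i in range(len(tokens) - n + 1)
            if tokens[i:i + n] == term_tokens
        )

    # Normalise by corpus size so years with more text are comparable.
    return {
        year: 1_000_000 * hits[year] / totals[year]
        for year in sorted(totals)
        if totals[year] > 0
    }


# Made-up toy documents, used only to exercise the function.
sample = [
    (1940, "The pipeline carried crude oil. The pump station was new."),
    (1940, "Crude oil moved by rail."),
    (2019, "The pipeline uses leak detection; pipeline monitoring is automated."),
]
freq = term_frequency_by_year(sample, "pipeline")
for year, per_million in freq.items():
    print(year, round(per_million, 1))
```

Normalising by the total token count of each year matters because the amount of text per year usually varies widely, so raw counts alone would mostly reflect corpus size rather than changes in usage.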