Structural Models of Terminological Word Combinations for Marking up a Corpus of Scientific and Technical Texts

I. Butenko, N. S. Nikolaeva, T. Margaryan
Journal: NSU Vestnik. Series: Linguistics and Intercultural Communication
Publication date: 2021-10-17
DOI: 10.25205/1818-7935-2021-19-3-45-56 (https://doi.org/10.25205/1818-7935-2021-19-3-45-56)
Citations: 2

Abstract

The article presents structural models of terminological phrases from the subject area “Welding” as a basis for building automated tools to mark up a corpus of scientific and technical texts. It outlines the place of scientific and technical corpora in corpus linguistics and the prospects for their further study. The research is motivated by the need to create corpora of scientific and technical texts in general and, in particular, to provide tools for the automatic detection of terms. The authors argue that the main problem in designing such corpora is the automatic markup of terminological phrases. The current state of the term system of the subject area “Welding” is analyzed. The results of analyzing two-, three-, four- and five-component terminological phrases of “Welding”, together with their structural models, are presented and illustrated with examples. The need to list all possible structural models of terminological combinations is also substantiated. The study establishes that a new component is most often added to a basic terminological combination by introducing one more postpositional attribute, whose function is to add a specific feature to the basic meaning. The novelty of the study lies in a theoretical approach to building a database of structural models of terminological phrases, which may serve as the core of a supersource database on the structure of multicomponent scientific and technical terms. An approach to the automatic markup of multicomponent terms is also proposed; it should help future corpus research identify candidate word combinations as scientific and technical terms.
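
The idea of marking up a corpus from a database of structural models can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the article: the model table `STRUCTURAL_MODELS`, the Universal Dependencies style part-of-speech tags, the function name `find_candidates`, and the English example phrase (the article's own models, including those with postpositional attributes, are not reproduced here). The sketch scans a tagged sentence from left to right and, at each position, keeps the longest tag sequence that matches a known model.

```python
from typing import List, NamedTuple, Tuple

# Hypothetical structural models keyed by part-of-speech sequence.
# These patterns are illustrative only; they are NOT the models from the article.
STRUCTURAL_MODELS = {
    ("ADJ", "NOUN"): "A+N",
    ("NOUN", "NOUN"): "N+N",
    ("ADJ", "ADJ", "NOUN"): "A+A+N",
    ("ADJ", "NOUN", "NOUN"): "A+N+N",
}


class TermCandidate(NamedTuple):
    start: int   # index of the first token (inclusive)
    end: int     # index after the last token (exclusive)
    text: str    # surface form of the candidate
    model: str   # label of the matched structural model


def find_candidates(tagged: List[Tuple[str, str]]) -> List[TermCandidate]:
    """Return non-overlapping candidates, preferring the longest match."""
    max_len = max(len(pattern) for pattern in STRUCTURAL_MODELS)
    candidates = []
    i = 0
    while i < len(tagged):
        match = None
        # Try the longest window first, down to two components.
        for n in range(min(max_len, len(tagged) - i), 1, -1):
            window = tagged[i:i + n]
            tags = tuple(tag for _, tag in window)
            if tags in STRUCTURAL_MODELS:
                text = " ".join(word for word, _ in window)
                match = TermCandidate(i, i + n, text, STRUCTURAL_MODELS[tags])
                break
        if match:
            candidates.append(match)
            i = match.end  # skip past the matched phrase
        else:
            i += 1
    return candidates


# Hand-tagged example sentence (tags assigned manually for the sketch).
sentence = [
    ("the", "DET"), ("manual", "ADJ"), ("arc", "NOUN"), ("welding", "NOUN"),
    ("requires", "VERB"), ("a", "DET"), ("consumable", "ADJ"), ("electrode", "NOUN"),
]
for candidate in find_candidates(sentence):
    print(candidate.model, candidate.text)
```

A pattern match of this kind only yields candidate word combinations. Whether a candidate is actually a term of the subject area still has to be decided by other means, for example by checking it against a term list.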