Multicomponent English and Russian Terms Alignment in a Parallel Corpus Based on a SimAlign Package
Yu. I. Butenko
Automatic Documentation and Mathematical Linguistics, published 2024-09-25
DOI: 10.3103/S0005105524700225
https://link.springer.com/article/10.3103/S0005105524700225
Citation count: 0
Abstract
The article proposes a method for aligning multicomponent terminological units in scientific and technical texts in an English-Russian parallel corpus. It analyzes the approaches, methods, software, and levels of text alignment in parallel corpora. It investigates the linguistic peculiarities of English and Russian terminology that influence the alignment of specialized vocabulary, as well as the peculiarities of how the SimAlign package operates when processing parallel texts. Two approaches to aligning multicomponent terms in a parallel corpus are analyzed. The article shows how the peculiarities of languages with different grammatical characteristics influence alignment.
About the journal:
Automatic Documentation and Mathematical Linguistics is an international peer-reviewed journal that covers all aspects of the automation of information processes and systems, as well as algorithms and methods for automatic language analysis. It emphasizes practical applications of new technologies and techniques for information analysis and processing.