Slovenscina 2.0: Latest Articles, Page 2

Praktični vidiki uporabe podbesednih enot v strojnem prevajanju slovenščina-angleščina [Practical aspects of using subword units in Slovenian-English machine translation]

Slovenscina 2.0 Pub Date : 2023-09-12 DOI: 10.4312/slo2.0.2023.1.275-301

Gregor Donaj, Mirjam Sepesy Maučec

Abstract: Most modern machine translation systems are based on neural network architectures. This holds for online machine translation providers, for research systems, and for tools that can assist professional translators in their practice. Although neural network systems can run on the ordinary central processing units of personal computers and servers, operating at a reasonable speed requires graphics processing units. This limits the vocabulary size, which reduces translation quality. The size of a word-level vocabulary is an especially pressing problem for highly inflected languages. We address it by using subword units, which achieve greater coverage of the language. In the paper we present various methods for segmenting words into subword units with vocabularies of different sizes and compare their use in a machine translation system for the Slovenian-English language pair. The comparison also includes a translation system without word segmentation. We report translation quality measured with the BLEU metric, model training speed, translation speed, and model size. We also give an overview of the practical aspects of using subword units in a machine translation system that is used together with computer-assisted translation tools.

Citations: 0

Spremljevalni korpus Trendi in avtomatska kategorizacija [The Trendi monitor corpus and automatic categorization]

Slovenscina 2.0 Pub Date : 2023-09-12 DOI: 10.4312/slo2.0.2023.1.161-188

Iztok Kosem, Jaka Čibej, Kaja Dobrovoljc, Taja Kuzman, Nikola Ljubešić

Citations: 0

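Teoretska izhodišča in metodološki okvir pri izdelavi uporabnikom prijaznega spletišča: primer platforme SMeJse – Slovenščina kot manjšinski jezik [Theoretical starting points and methodological framework for creating a user-friendly website: the case of the SMeJse platform – Slovenian as a minority language]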

Slovenscina 2.0 Pub Date : 2017-12-30 DOI: 10.4312/SLO2.0.2017.2.85-112

Matejka Grgič

Abstract: This paper presents some theoretical and methodological issues related to the online portal SLOVENŠČINA KOT MANJŠINSKI JEZIK – SMeJse / SLOVENIAN AS A MINORITY LANGUAGE – SMiLe, which collects existing tools, materials and information for the development of linguistic skills and abilities in Slovenian. The platform was established by SLORI – Slovenski raziskovalni inštitut / Slovenian Research Institute of Trieste, Italy, and the Dijaški dom S. Kosovela / Slovenian student's center of Trieste, Italy. The purpose of the portal is to stimulate different usages of the current Slovenian language in the Slovenian-Italian contact area, particularly in Italy, with the aim of assuring high communication proficiency in all kinds and varieties of the Slovenian language (the so-called "equilingualism"), a balanced bilingualism, and also the development of lects, still within the Slovenian linguistic continuum. Specific language policies are particularly successful for the development of linguistic skills which enable proficiency in the minority language, as well as equilingualism and balanced bilingualism among the speakers of the minority group. Such policies are based on the implementation of measures for an increased exposure to different language uses and on the creation of the need for language use in circles and situations where compensatory strategies are unsuitable. The portal is based on the newest linguistic, sociolinguistic and psycholinguistic studies concerning the Slovenian language in Italy, on the Slovenian-Italian language contact, and on the acquisition of the minority language. An analysis of the status of the Slovenian language in Italy, its perception and its phenomena, as well as an overview of some language policies and methodological frames, has shown a gap between the existing tools and the needs of the community of speakers.

Citations: 0

Omogočanje dostopa do korpusov slovenskih spletnih besedil v luči pravnih omejitev [Enabling access to corpora of Slovenian web texts in light of legal restrictions]

Slovenscina 2.0 Pub Date : 2016-09-27 DOI: 10.4312/SLO2.0.2016.2.189-219

T. Erjavec, Jaka Čibej, Darja Fišer

Citations: 2

The value of the Janes corpus for Slovenian language standardization

Slovenscina 2.0 Pub Date : 2016-09-27 DOI: 10.4312/slo2.0.2016.2.1-37

Špela Arhar Holdt, K. Dobrovoljc

Abstract: The main objective of this article is to assess the value of the Janes corpus for research in the field of language standardization. Unlike the existing reference corpora of written Slovenian, the newly available Janes corpus of user-generated content mostly consists of texts that have not been modified by a proofreading expert; it therefore offers a more realistic insight into the trends of language use, as well as the intuitiveness of existing language rules, within a wider language community. We illustrate this methodological potential in a case study of nominal phrases with nonagreeing premodifiers, such as solo petje and RTV prispevek, by comparing their usage in Janes and the reference Kres corpus. The results reveal that this type of phrase is used more often in Janes and includes a longer list of candidates than in Kres; that both corpora include a large number of phrases with variant spelling as either one or two words, irrespective of the premodifier in question; and, somewhat surprisingly, that Janes displays a more consistent language use, suggesting that prescriptive regulation actually increases the level of inconsistency in language use. The article, a revised and enhanced extension of a prior conference paper, concludes with a discussion of possible future approaches to this linguistic issue and advocates for the inclusion of Janes in Slovenian language standardization methodology.

Citations: 0

(Re)standardization in the Vice of National Identity: the Cases of Croatian, Serbian, Bosnian, and Montenegrin

Slovenscina 2.0 Pub Date : 2015-12-01 DOI: 10.4312/slo2.0.2015.2.67-94

Vesna Požgaj Hadži, Tatjana Balažic Bulc

Citations: 1

Internet Slovene Research Summer Camp for Secondary School Pupils

Slovenscina 2.0 Pub Date : 2015-12-01 DOI: 10.4312/slo2.0.2015.1.59-61

Darja Fišer

Citations: 0

Collocations and examples of use: a lexical-semantic approach to terminology

Slovenscina 2.0 Pub Date : 2014-12-01 DOI: 10.4312/SLO2.0.2014.1.41-61

N. Logar, P. Gantar, Iztok Kosem