Acta Linguistica Academica最新文献_第3页

IF 0.5 3区文学

Acta Linguistica Academica Pub Date : 2022-12-12 DOI: 10.1556/2062.2022.00623

Gábor Prószéky

引用次数: 0

A proof-of-concept meaning discrimination experiment to compile a word-in-context dataset for adjectives – A graph-based distributional approach 一个概念验证的意义辨别实验，用于编译形容词的上下文词数据集-基于图的分布方法

IF 0.5 3区文学

Acta Linguistica Academica Pub Date : 2022-12-12 DOI: 10.1556/2062.2022.00579

Enikő Héja, Noémi Ligeti-Nagy

引用次数: 0

BiVaSE: A bilingual variational sentence encoder with randomly initialized Transformer layers BiVaSE:一种具有随机初始化Transformer层的双语变分句编码器

IF 0.5 3区文学

Acta Linguistica Academica Pub Date : 2022-12-12 DOI: 10.1556/2062.2022.00584

Bence Nyéki

引用次数: 0

Neural machine translation for Hungarian 匈牙利语的神经机器翻译

IF 0.5 3区文学

Acta Linguistica Academica Pub Date : 2022-11-30 DOI: 10.1556/2062.2022.00576

L. Laki, Zijian Győző Yang

引用次数: 1

Neural text summarization for Hungarian 匈牙利语的神经文本摘要

IF 0.5 3区文学

Acta Linguistica Academica Pub Date : 2022-11-29 DOI: 10.1556/2062.2022.00577

Zijian Győző Yang

引用次数: 0

Cross-lingual transfer of knowledge in distributional language models: Experiments in Hungarian 分布语言模型中知识的跨语言迁移：匈牙利语实验

IF 0.5 3区文学

Acta Linguistica Academica Pub Date : 2022-11-22 DOI: 10.1556/2062.2022.00580

Attila Novák, Borbála Novák

引用次数: 0

Winograd schemata and other datasets for anaphora resolution in Hungarian 匈牙利语回指解析的Winograd模式和其他数据集

IF 0.5 3区文学

Acta Linguistica Academica Pub Date : 2022-11-22 DOI: 10.1556/2062.2022.00575

Noémi Vadász, Noémi Ligeti-Nagy

引用次数: 3

Principles of corpus querying: A discussion note 语料库查询的原则:讨论笔记

IF 0.5 3区文学

Acta Linguistica Academica Pub Date : 2022-11-22 DOI: 10.1556/2062.2022.00581

Bálint Sass

引用次数: 1

PrevDistro: An open-access dataset of Hungarian preverb constructions PrevDistro：匈牙利preverb结构的开放访问数据集

IF 0.5 3区文学

Acta Linguistica Academica Pub Date : 2022-11-22 DOI: 10.1556/2062.2022.00578

Ágnes Kalivoda

引用次数: 0

Morphology aware data augmentation with neural language models for online hybrid ASR 基于神经语言模型的在线混合ASR形态学感知数据增强

IF 0.5 3区文学

Acta Linguistica Academica Pub Date : 2022-11-21 DOI: 10.1556/2062.2022.00582

Balázs Tarján, T. Fegyó, P. Mihajlik

{"title":"Morphology aware data augmentation with neural language models for online hybrid ASR","authors":"Balázs Tarján, T. Fegyó, P. Mihajlik","doi":"10.1556/2062.2022.00582","DOIUrl":"https://doi.org/10.1556/2062.2022.00582","url":null,"abstract":"Recognition of Hungarian conversational telephone speech is challenging due to the informal style and morphological richness of the language. Neural Network Language Models (NNLMs) can provide remedy for the high perplexity of the task; however, their high complexity makes them very difficult to apply in the first (single) pass of an online system. Recent studies showed that a considerable part of the knowledge of NNLMs can be transferred to traditional n-grams by using neural text generation based data augmentation. Data augmentation with NNLMs works well for isolating languages; however, we show that it causes a vocabulary explosion in a morphologically rich language. Therefore, we propose a new, morphology aware neural text augmentation method, where we retokenize the generated text into statistically derived subwords. We compare the performance of word-based and subword-based data augmentation techniques with recurrent and Transformer language models and show that subword-based methods can significantly improve the Word Error Rate (WER) while greatly reducing vocabulary size and memory requirements. Combining subword-based modeling and neural language model-based data augmentation, we were able to achieve 11% relative WER reduction and preserve real-time operation of our conversational telephone speech recognition system. Finally, we also demonstrate that subword-based neural text augmentation outperforms the word-based approach not only in terms of overall WER but also in recognition of Out-of-Vocabulary (OOV) words.","PeriodicalId":37594,"journal":{"name":"Acta Linguistica Academica","volume":" ","pages":""},"PeriodicalIF":0.5,"publicationDate":"2022-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49069067","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0