Finite-state super transducers for compact language resource representation in edge voice-AI

Systems Science & Control Engineering, 10(1), pp. 636-644 · IF 4.4 · JCR Q2 (Automation & Control Systems)

Pub Date: 2022-06-23 · DOI: 10.1080/21642583.2022.2089930

S. Dobrišek, Ziga Golob, Jerneja Žganec Gros


Citations: 0


Finite-state super transducers for compact language resource representation in edge voice-AI

Finite-state transducers have been shown to yield compact representations of the pronunciation dictionaries used for grapheme-to-phoneme conversion in speech engines running on low-resource embedded platforms. For highly inflected languages, however, even more efficient methods of reducing language resources are needed. In this paper, we demonstrate that the size of a finite-state transducer tends to decrease once the number of word forms in the modelled pronunciation dictionary reaches a certain threshold. Motivated by this finding, we propose and evaluate a new type of finite-state transducer, the 'finite-state super transducer', which represents a pronunciation dictionary with fewer states and transitions and thereby reduces the size of the language resource representation by up to 25% compared with minimal deterministic finite-state transducers. Further, we demonstrate that finite-state super transducers exhibit a generalization capability: they may accept, and thereby phonetically transform, inflected word forms that were not represented in the pronunciation dictionary used to build them. The method is suitable for speech engines operating on platforms at the edge of an AI system, where memory and processing power are restricted and efficient speech processing based on compact language resources is required.
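To make the notions of "states and transitions" concrete, the following is a minimal sketch of the baseline the abstract compares against: a pronunciation dictionary stored as an acyclic transducer and then minimized by merging states with identical continuations. It is not the authors' super transducer construction, which the abstract does not detail. The toy word list, the phoneme symbols, the pre-aligned (grapheme, phoneme) pairs, and all function names are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

Pair = Tuple[str, str]  # one transition label: (input grapheme, output phoneme)


def build_trie(entries: List[List[Pair]]) -> List[dict]:
    """Store each aligned dictionary entry as a path in a trie; state 0 is the start."""
    states = [{"final": False, "arcs": {}}]
    for entry in entries:
        cur = 0
        for pair in entry:
            nxt = states[cur]["arcs"].get(pair)
            if nxt is None:
                states.append({"final": False, "arcs": {}})
                nxt = len(states) - 1
                states[cur]["arcs"][pair] = nxt
            cur = nxt
        states[cur]["final"] = True
    return states


def minimize(states: List[dict]) -> Tuple[List[dict], int]:
    """Merge states whose outgoing paths are identical (bottom-up, acyclic input only)."""
    registry: Dict[tuple, int] = {}
    merged: List[dict] = []

    def visit(s: int) -> int:
        arcs = {label: visit(t) for label, t in states[s]["arcs"].items()}
        signature = (states[s]["final"], tuple(sorted(arcs.items())))
        if signature not in registry:
            registry[signature] = len(merged)
            merged.append({"final": states[s]["final"], "arcs": arcs})
        return registry[signature]

    start = visit(0)
    return merged, start


def size(states: List[dict]) -> Tuple[int, int]:
    """Return (number of states, number of transitions)."""
    return len(states), sum(len(s["arcs"]) for s in states)


def transduce(states: List[dict], start: int, word: str) -> Optional[str]:
    """Return the phoneme string for `word`, or None if the word is not accepted."""
    def walk(s: int, i: int) -> Optional[List[str]]:
        if i == len(word):
            return [] if states[s]["final"] else None
        for (grapheme, phoneme), target in states[s]["arcs"].items():
            if grapheme == word[i]:
                rest = walk(target, i + 1)
                if rest is not None:
                    return [phoneme] + rest
        return None

    result = walk(start, 0)
    return None if result is None else "".join(result)


# Hypothetical toy dictionary; an empty phoneme marks a silent letter.
stem_w = [("w", "w"), ("a", "O"), ("l", ""), ("k", "k")]
stem_t = [("t", "t"), ("a", "O"), ("l", ""), ("k", "k")]
entries = [
    stem_w + [("s", "s")],
    stem_w + [("e", ""), ("d", "t")],
    stem_t + [("s", "s")],
    stem_t + [("e", ""), ("d", "t")],
]

trie = build_trie(entries)
fst, start = minimize(trie)
print("trie:", size(trie), "minimized:", size(fst))
print(transduce(fst, start, "talked"), transduce(fst, start, "balked"))
```

In this toy case the shared stem continuation and the shared inflectional endings collapse into single states, which is the effect that makes transducers compact for inflected word forms. The minimized machine still accepts exactly the listed words; the generalization to unseen word forms described in the abstract would require the additional state merging of a super transducer, which this sketch does not attempt.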


Source journal

Systems Science & Control Engineering (Automation & Control Systems)

CiteScore: 9.50

Self-citation rate: 2.40%

Review time: 29 weeks

About the journal: Systems Science & Control Engineering is a world-leading, fully open access journal covering all areas of theoretical and applied systems science and control engineering. The journal encourages the submission of original articles, reviews and short communications in areas including, but not limited to: artificial intelligence, complex systems, complex networks, control theory, control applications, cybernetics, dynamical systems theory, operations research, systems biology, systems dynamics, systems ecology, systems engineering, systems psychology, and systems theory.