COMPLEX METHOD FOR AUTOMATIC RECOGNITION OF NATURAL LANGUAGE AND EMOTIONAL STATE

I. Iosifov
{"title":"COMPLEX METHOD FOR AUTOMATIC RECOGNITION OF NATURAL LANGUAGE AND EMOTIONAL STATE","authors":"I. Iosifov","doi":"10.28925/2663-4023.2023.19.146164","DOIUrl":null,"url":null,"abstract":"Current trends in NLP emphasize universal models and learning from pre-trained models. This article explores these trends and advanced models of pre-service learning. Inputs are converted into words or contextual embeddings that serve as inputs to encoders and decoders. The corpus of the author's publications over the past six years is used as the object of the research. The main methods of research are the analysis of scientific literature, prototyping, and experimental use of systems in the direction of research. Speech recognition players are divided into players with huge computing resources for whom training on large unlabeled data is a common procedure and players who are focused on training small local speech recognition models on pre-labeled audio data due to a lack of resources. Approaches and frameworks for working with unlabeled data and limited computing resources are almost not present, and methods based on iterative training are not developed and require scientific efforts for development. The research aims to develop methods of iterative training on unlabeled audio data to obtain productively ready speech recognition models with greater accuracy and limited resources. A separate block proposes methods of data preparation for use in training speech recognition systems and a pipeline for automatic training of speech recognition systems using pseudo marking of audio data. The prototype and solution of a real business problem of emotion detection demonstrate the capabilities and limitations of owl recognition systems and emotional states. With the use of the proposed methods of pseudo-labeling, it is possible to obtain recognition accuracy close to the market leaders without significant investment in computing resources, and for languages with a small amount of open data, it can even be surpassed.","PeriodicalId":198390,"journal":{"name":"Cybersecurity: Education, Science, Technique","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Cybersecurity: Education, Science, Technique","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.28925/2663-4023.2023.19.146164","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Current trends in NLP emphasize universal models and learning from pre-trained models. This article explores these trends and advanced models of pre-service learning. Inputs are converted into words or contextual embeddings that serve as inputs to encoders and decoders. The corpus of the author's publications over the past six years is used as the object of the research. The main methods of research are the analysis of scientific literature, prototyping, and experimental use of systems in the direction of research. Speech recognition players are divided into players with huge computing resources for whom training on large unlabeled data is a common procedure and players who are focused on training small local speech recognition models on pre-labeled audio data due to a lack of resources. Approaches and frameworks for working with unlabeled data and limited computing resources are almost not present, and methods based on iterative training are not developed and require scientific efforts for development. The research aims to develop methods of iterative training on unlabeled audio data to obtain productively ready speech recognition models with greater accuracy and limited resources. A separate block proposes methods of data preparation for use in training speech recognition systems and a pipeline for automatic training of speech recognition systems using pseudo marking of audio data. The prototype and solution of a real business problem of emotion detection demonstrate the capabilities and limitations of owl recognition systems and emotional states. With the use of the proposed methods of pseudo-labeling, it is possible to obtain recognition accuracy close to the market leaders without significant investment in computing resources, and for languages with a small amount of open data, it can even be surpassed.
复杂的自然语言和情绪状态自动识别方法
目前NLP的趋势强调通用模型和从预训练模型中学习。本文探讨了职前学习的这些趋势和先进模式。输入被转换成单词或上下文嵌入,作为编码器和解码器的输入。本文以作者近六年出版的文集为研究对象。研究方法主要有科学文献分析、原型设计、系统实验应用等方向的研究。语音识别玩家分为计算资源巨大的玩家,对于他们来说,在大量未标记的数据上进行训练是一个常见的过程,而由于资源缺乏,他们专注于在预标记的音频数据上训练小型的局部语音识别模型。处理未标记数据和有限计算资源的方法和框架几乎不存在,基于迭代训练的方法没有得到发展,需要科学的发展努力。该研究旨在开发对未标记音频数据进行迭代训练的方法,以获得具有更高精度和有限资源的生产性就绪语音识别模型。一个单独的块提出了用于训练语音识别系统的数据准备方法和用于使用音频数据的伪标记自动训练语音识别系统的管道。情感检测的一个实际业务问题的原型和解决方案展示了猫头鹰识别系统和情感状态的能力和局限性。使用本文提出的伪标注方法,可以在不投入大量计算资源的情况下,获得接近市场领先者的识别精度,对于开放数据较少的语言,甚至可以超越。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信