复杂的自然语言和情绪状态自动识别方法

Cybersecurity: Education, Science, Technique Pub Date : 1900-01-01 DOI:10.28925/2663-4023.2023.19.146164

I. Iosifov

{"title":"复杂的自然语言和情绪状态自动识别方法","authors":"I. Iosifov","doi":"10.28925/2663-4023.2023.19.146164","DOIUrl":null,"url":null,"abstract":"Current trends in NLP emphasize universal models and learning from pre-trained models. This article explores these trends and advanced models of pre-service learning. Inputs are converted into words or contextual embeddings that serve as inputs to encoders and decoders. The corpus of the author's publications over the past six years is used as the object of the research. The main methods of research are the analysis of scientific literature, prototyping, and experimental use of systems in the direction of research. Speech recognition players are divided into players with huge computing resources for whom training on large unlabeled data is a common procedure and players who are focused on training small local speech recognition models on pre-labeled audio data due to a lack of resources. Approaches and frameworks for working with unlabeled data and limited computing resources are almost not present, and methods based on iterative training are not developed and require scientific efforts for development. The research aims to develop methods of iterative training on unlabeled audio data to obtain productively ready speech recognition models with greater accuracy and limited resources. A separate block proposes methods of data preparation for use in training speech recognition systems and a pipeline for automatic training of speech recognition systems using pseudo marking of audio data. The prototype and solution of a real business problem of emotion detection demonstrate the capabilities and limitations of owl recognition systems and emotional states. With the use of the proposed methods of pseudo-labeling, it is possible to obtain recognition accuracy close to the market leaders without significant investment in computing resources, and for languages with a small amount of open data, it can even be surpassed.","PeriodicalId":198390,"journal":{"name":"Cybersecurity: Education, Science, Technique","volume":"15 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"COMPLEX METHOD FOR AUTOMATIC RECOGNITION OF NATURAL LANGUAGE AND EMOTIONAL STATE\",\"authors\":\"I. Iosifov\",\"doi\":\"10.28925/2663-4023.2023.19.146164\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Current trends in NLP emphasize universal models and learning from pre-trained models. This article explores these trends and advanced models of pre-service learning. Inputs are converted into words or contextual embeddings that serve as inputs to encoders and decoders. The corpus of the author's publications over the past six years is used as the object of the research. The main methods of research are the analysis of scientific literature, prototyping, and experimental use of systems in the direction of research. Speech recognition players are divided into players with huge computing resources for whom training on large unlabeled data is a common procedure and players who are focused on training small local speech recognition models on pre-labeled audio data due to a lack of resources. Approaches and frameworks for working with unlabeled data and limited computing resources are almost not present, and methods based on iterative training are not developed and require scientific efforts for development. The research aims to develop methods of iterative training on unlabeled audio data to obtain productively ready speech recognition models with greater accuracy and limited resources. A separate block proposes methods of data preparation for use in training speech recognition systems and a pipeline for automatic training of speech recognition systems using pseudo marking of audio data. The prototype and solution of a real business problem of emotion detection demonstrate the capabilities and limitations of owl recognition systems and emotional states. With the use of the proposed methods of pseudo-labeling, it is possible to obtain recognition accuracy close to the market leaders without significant investment in computing resources, and for languages with a small amount of open data, it can even be surpassed.\",\"PeriodicalId\":198390,\"journal\":{\"name\":\"Cybersecurity: Education, Science, Technique\",\"volume\":\"15 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1900-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Cybersecurity: Education, Science, Technique\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.28925/2663-4023.2023.19.146164\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Cybersecurity: Education, Science, Technique","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.28925/2663-4023.2023.19.146164","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

摘要

目前NLP的趋势强调通用模型和从预训练模型中学习。本文探讨了职前学习的这些趋势和先进模式。输入被转换成单词或上下文嵌入，作为编码器和解码器的输入。本文以作者近六年出版的文集为研究对象。研究方法主要有科学文献分析、原型设计、系统实验应用等方向的研究。语音识别玩家分为计算资源巨大的玩家，对于他们来说，在大量未标记的数据上进行训练是一个常见的过程，而由于资源缺乏，他们专注于在预标记的音频数据上训练小型的局部语音识别模型。处理未标记数据和有限计算资源的方法和框架几乎不存在，基于迭代训练的方法没有得到发展，需要科学的发展努力。该研究旨在开发对未标记音频数据进行迭代训练的方法，以获得具有更高精度和有限资源的生产性就绪语音识别模型。一个单独的块提出了用于训练语音识别系统的数据准备方法和用于使用音频数据的伪标记自动训练语音识别系统的管道。情感检测的一个实际业务问题的原型和解决方案展示了猫头鹰识别系统和情感状态的能力和局限性。使用本文提出的伪标注方法，可以在不投入大量计算资源的情况下，获得接近市场领先者的识别精度，对于开放数据较少的语言，甚至可以超越。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

COMPLEX METHOD FOR AUTOMATIC RECOGNITION OF NATURAL LANGUAGE AND EMOTIONAL STATE

Current trends in NLP emphasize universal models and learning from pre-trained models. This article explores these trends and advanced models of pre-service learning. Inputs are converted into words or contextual embeddings that serve as inputs to encoders and decoders. The corpus of the author's publications over the past six years is used as the object of the research. The main methods of research are the analysis of scientific literature, prototyping, and experimental use of systems in the direction of research. Speech recognition players are divided into players with huge computing resources for whom training on large unlabeled data is a common procedure and players who are focused on training small local speech recognition models on pre-labeled audio data due to a lack of resources. Approaches and frameworks for working with unlabeled data and limited computing resources are almost not present, and methods based on iterative training are not developed and require scientific efforts for development. The research aims to develop methods of iterative training on unlabeled audio data to obtain productively ready speech recognition models with greater accuracy and limited resources. A separate block proposes methods of data preparation for use in training speech recognition systems and a pipeline for automatic training of speech recognition systems using pseudo marking of audio data. The prototype and solution of a real business problem of emotion detection demonstrate the capabilities and limitations of owl recognition systems and emotional states. With the use of the proposed methods of pseudo-labeling, it is possible to obtain recognition accuracy close to the market leaders without significant investment in computing resources, and for languages with a small amount of open data, it can even be surpassed.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Cybersecurity: Education, Science, Technique

自引率

0.00%

发文量