Features of the Implementation of the Speaker Identification Software System

Computer systems and information technologies. Pub Date: 2022-12-29. DOI: 10.31891/csit-2022-4-5

Yana Bielozorova, Kateryna Yatsko


Citations: 0

Abstract



This paper proposes an architecture for a speaker identification software system, expressed as UML class and sequence diagrams. The main criteria for assessing speaker identification accuracy were studied, and possible sources of accuracy loss were identified; these findings can be used when building a speaker identification system. A software system was created based on the proposed architecture and on previously developed identification algorithms and methods.

The research supports the following conclusions: approaches to constructing existing speaker identification systems were reviewed; the main criteria for assessing identification accuracy were investigated, and the main sources of accuracy loss were identified; the structure of a speaker identification system was considered in light of those sources of accuracy loss; an architecture for the speaker identification system was proposed in UML as class and sequence diagrams; and a software system was built that implements speech signal identification using the methods and algorithm proposed in previous works.

The software system uses a ranking method based on three criteria: (1) the proximity of two-dimensional probability density function curves for the fundamental (main-tone) frequency and the spectral location of three frequency ranges extracted from the recorded speech; (2) the proximity of the probability density function curves for each of these features taken separately; (3) the closeness of the absolute maxima of the formant spectra extracted from the recorded speech.
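The abstract does not say how curve proximity is measured or how the three criteria are combined into one ranking. The following is a minimal sketch under stated assumptions, not the authors' method: probability density functions are approximated by normalized histograms, proximity is taken to be total variation distance, and the criteria are merged by summing per-criterion ranks. All function names, bin widths, and sample values are hypothetical.

```python
from collections import Counter


def histogram_pdf(samples, bin_widths):
    """Estimate a discrete PDF from feature tuples: bin index tuple -> probability."""
    counts = Counter(
        tuple(int(value // width) for value, width in zip(sample, bin_widths))
        for sample in samples
    )
    total = sum(counts.values())
    return {bin_id: n / total for bin_id, n in counts.items()}


def pdf_distance(p, q):
    """Total variation distance: 0.0 for identical PDFs, 1.0 for disjoint ones."""
    return 0.5 * sum(abs(p.get(b, 0.0) - q.get(b, 0.0)) for b in set(p) | set(q))


def rank_candidates(per_criterion_distances):
    """Sum each candidate's per-criterion rank (1 = closest); lowest total first."""
    totals = Counter()
    for distances in per_criterion_distances:
        ordered = sorted(distances, key=lambda name: (distances[name], name))
        for rank, name in enumerate(ordered, start=1):
            totals[name] += rank
    return sorted(totals, key=lambda name: (totals[name], name))


# Hypothetical (fundamental frequency in Hz, band position in Hz) pairs per frame.
unknown = [(118.0, 510.0), (121.5, 530.0), (124.0, 495.0), (119.5, 520.0)]
enrolled = {
    "speaker_a": [(117.0, 505.0), (120.0, 525.0), (123.5, 490.0), (119.0, 515.0)],
    "speaker_b": [(205.0, 640.0), (210.5, 655.0), (198.0, 630.0), (214.0, 660.0)],
}

# Criterion 1 (assumed form): distance between two-dimensional histogram PDFs.
widths = (10.0, 50.0)  # assumed bin widths, chosen only for this example
joint_unknown = histogram_pdf(unknown, widths)
joint_distance = {
    name: pdf_distance(joint_unknown, histogram_pdf(frames, widths))
    for name, frames in enrolled.items()
}

# Criteria 2 and 3: placeholder distances standing in for the per-feature PDF
# comparison and the formant-maxima comparison, which are not computed here.
per_feature_distance = {"speaker_a": 0.1, "speaker_b": 0.8}
formant_peak_distance = {"speaker_a": 0.3, "speaker_b": 0.2}

ranking = rank_candidates([joint_distance, per_feature_distance, formant_peak_distance])
print(ranking)
```

Summing ranks rather than raw distances keeps criteria with different scales from dominating one another, which is one plausible reason to rank per criterion first; a weighted combination of distances would be an equally valid alternative under different assumptions.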


Source journal

Computer systems and information technologies

Self-citation rate

0.00%

Number of articles published